Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-08-31 Thread Marc Kleine-Budde
On 08/29/2015 12:42 AM, Aurelien Jarno wrote:
> On 2015-07-17 12:13, Marc Kleine-Budde wrote:
>> Ping.
>>
>> Any progress on this?
> 
> It would be nice to have the upstream opinion about that, or even get
> this patch merged.

It boils down to upstream not taking the patch unless I fix the whole
mess about __ASSUME_REQUEUE_PI on all architectures, which is broken
since it was introduced. And there is an alternative implementation for
the mutex on the horizon, which doesn't have that issue.

Marc






Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-08-28 Thread Aurelien Jarno
On 2015-07-17 12:13, Marc Kleine-Budde wrote:
> Ping.
>
> Any progress on this?

It would be nice to have the upstream opinion about that, or even get
this patch merged.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net




Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-07-17 Thread Marc Kleine-Budde
Ping.

Any progress on this?

Marc

On 06/26/2015 06:20 PM, Marc Kleine-Budde wrote:
> From eebb9e9abd3405fd72b7e7527132b605e406e83e Mon Sep 17 00:00:00 2001
> From: Marc Kleine-Budde <m...@pengutronix.de>
> Date: Sat, 13 Jun 2015 19:25:07 +0200
> Subject: [PATCH] ARM: fix PI futex breakage - glibc bug 18463
>
> This patch fixes glibc bug 18463:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=18463
>
> The problem is caused by:
>
> 47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894).
>
> Signed-off-by: Marc Kleine-Budde <m...@pengutronix.de>
> ---
>  sysdeps/unix/sysv/linux/arm/kernel-features.h | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h b/sysdeps/unix/sysv/linux/arm/kernel-features.h
> index e755741de60b..0d9f8910d650 100644
> --- a/sysdeps/unix/sysv/linux/arm/kernel-features.h
> +++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h
> @@ -38,5 +38,4 @@
>     futex_atomic_cmpxchg_inatomic, depending on kernel
>     configuration.  */
>  #undef __ASSUME_FUTEX_LOCK_PI
> -#undef __ASSUME_REQUEUE_PI
>  #undef __ASSUME_SET_ROBUST_LIST
 


-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |





Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-26 Thread Marc Kleine-Budde
From eebb9e9abd3405fd72b7e7527132b605e406e83e Mon Sep 17 00:00:00 2001
From: Marc Kleine-Budde <m...@pengutronix.de>
Date: Sat, 13 Jun 2015 19:25:07 +0200
Subject: [PATCH] ARM: fix PI futex breakage - glibc bug 18463

This patch fixes glibc bug 18463:

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

The problem is caused by:

47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894).

Signed-off-by: Marc Kleine-Budde <m...@pengutronix.de>
---
 sysdeps/unix/sysv/linux/arm/kernel-features.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h b/sysdeps/unix/sysv/linux/arm/kernel-features.h
index e755741de60b..0d9f8910d650 100644
--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h
@@ -38,5 +38,4 @@
    futex_atomic_cmpxchg_inatomic, depending on kernel
    configuration.  */
 #undef __ASSUME_FUTEX_LOCK_PI
-#undef __ASSUME_REQUEUE_PI
 #undef __ASSUME_SET_ROBUST_LIST


-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |





Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-16 Thread Marc Kleine-Budde
On 06/15/2015 12:59 PM, Aurelien Jarno wrote:
>> This means glibc-2.20 doesn't support PI/robust mutex on ARM at all,
>> glibc-2.21 needs --enable-kernel=3.14.3.
>
> That's not correct. The __ASSUME_* defines tell that the feature is
> there unconditionally. When they are not defined, the code should probe
> for the feature before using it, and if it is not present, fall back to
> compatibility code. I guess there is a bug in the probe or the
> compatibility code, and it's what we want to fix.

We had another, deeper look at the glibc and kernel code.

For PI and robust mutexes there is indeed runtime detection, but no
fallback code. But that's not possible without kernel support. Instead a
proper error code is returned:

For example PI mutex:

    int
    __pthread_mutex_init (mutex, mutexattr)
         pthread_mutex_t *mutex;
         const pthread_mutexattr_t *mutexattr;
    {
    ...
        case PTHREAD_PRIO_INHERIT << PTHREAD_MUTEXATTR_PROTOCOL_SHIFT:
          if (__glibc_unlikely (prio_inherit_missing ()))
            return ENOTSUP;
          break;

and prio_inherit_missing() is:

    static bool
    prio_inherit_missing (void)
    {
    #ifdef __NR_futex
    # ifndef __ASSUME_FUTEX_LOCK_PI
      static int tpi_supported;
      if (__glibc_unlikely (tpi_supported == 0))
        {
          int lock = 0;
          INTERNAL_SYSCALL_DECL (err);
          int ret = INTERNAL_SYSCALL (futex, err, 4, &lock, FUTEX_UNLOCK_PI, 0, 0);

This is the runtime detection: if the syscall returns ENOSYS, the kernel
has no support for PI mutexes.

          assert (INTERNAL_SYSCALL_ERROR_P (ret, err));
          tpi_supported = INTERNAL_SYSCALL_ERRNO (ret, err) == ENOSYS ? -1 : 1;
        }
      return __glibc_unlikely (tpi_supported < 0);
    # endif
      return false;
    #endif
      return true;
    }

So far so good. We have runtime detection for FUTEX_LOCK_PI and
SET_ROBUST_LIST (implemented likewise), but the problem is REQUEUE_PI.
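
For readers who want to try the probe outside of glibc, the same idea can
be sketched as a standalone user-space function. This is only an
illustration, not glibc code: the name pi_futex_supported() is made up
here, and it goes through the generic syscall() wrapper instead of
glibc's INTERNAL_SYSCALL macros. Like the glibc code it caches the
result in a plain static, so it is not meant to be strictly thread-safe.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not part of glibc): returns 1 if the running
   kernel accepts PI futex operations, 0 if it answers ENOSYS.  */
int
pi_futex_supported (void)
{
  /* 0 = not probed yet, 1 = supported, -1 = missing.  */
  static int cached;

  if (cached == 0)
    {
      int lock = 0;
      /* Try to unlock a futex word we do not own.  The call is expected
         to fail; only the error code matters for the probe.  */
      long ret = syscall (SYS_futex, &lock, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
      cached = (ret == -1 && errno == ENOSYS) ? -1 : 1;
    }
  return cached > 0;
}
```

A caller would check pi_futex_supported() once before creating
PTHREAD_PRIO_INHERIT mutexes and report ENOTSUP otherwise, which mirrors
what __pthread_mutex_init does above.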

The way PI mutexes are implemented you _have_ to use REQUEUE_PI, quoting
the Kernel's ./Documentation/futex-requeue-pi.txt:

 In order to ensure the rt_mutex has an owner if it has waiters, it
 is necessary for both the requeue code, as well as the waiting code,
 to be able to acquire the rt_mutex before returning to user space.
 The requeue code cannot simply wake the waiter and leave it to
 acquire the rt_mutex as it would open a race window between the
  ^^^
 requeue call returning to user space and the waiter waking and
 starting to run.  This is especially true in the uncontended case.
 
 The solution involves two new rt_mutex helper routines,
 rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which
 allow the requeue code to acquire an uncontended rt_mutex on behalf
 of the waiter and to enqueue the waiter on a contended rt_mutex.
 Two new system calls provide the kernel-user interface to
 requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI.
  ^
 
 FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait()
  ^
 and pthread_cond_timedwait()) to block on the initial futex and wait
 to be requeued to a PI-aware futex.  The implementation is the
 result of a high-speed collision between futex_wait() and
 futex_lock_pi(), with some extra logic to check for the additional
 wake-up scenarios.

Looking at pthread_cond_wait:

    int
    __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
    {
    [...]
    #if (defined lll_futex_wait_requeue_pi \
         && defined __ASSUME_REQUEUE_PI)
                    ^^^^^^^^^^^^^^^^^^^
          /* If pi_flag remained 1 then it means that we had the lock and the mutex
             but a spurious waker raced ahead of us.  Give back the mutex before
             going into wait again.  */
          if (pi_flag)
            {
              __pthread_mutex_cond_lock_adjust (mutex);
              __pthread_mutex_unlock_usercnt (mutex, 0);
            }
          pi_flag = USE_REQUEUE_PI (mutex);

          if (pi_flag)
            {
              err = lll_futex_wait_requeue_pi (&cond->__data.__futex,
                                               futex_val, &mutex->__data.__lock,
                                               pshared);

Here's the requeue_pi syscall, but there is no runtime check for it. See
__ASSUME_REQUEUE_PI above. If __ASSUME_REQUEUE_PI is not defined, this
code is not compiled in, resulting in the breakage we're seeing.

              pi_flag = (err == 0);
            }
          else
    #endif
            /* Wait until woken by signal or broadcast.  */
            lll_futex_wait (&cond->__data.__futex, futex_val, pshared);
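
To illustrate what a runtime check for requeue_pi could look like, here
is a rough user-space sketch. It is not taken from glibc or from any
proposed patch; the function name requeue_pi_supported() is hypothetical,
and the probe relies on my assumption that a kernel with requeue_pi
support rejects a FUTEX_WAIT_REQUEUE_PI call whose expected value does
not match the futex word (EAGAIN) instead of answering ENOSYS. The
already-expired absolute timeout is only a safety net so the call cannot
block.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical probe (not glibc code): returns 1 unless the kernel
   reports ENOSYS for FUTEX_WAIT_REQUEUE_PI.  */
int
requeue_pi_supported (void)
{
  /* 0 = not probed yet, 1 = supported, -1 = missing.  */
  static int cached;

  if (cached == 0)
    {
      uint32_t cond_word = 0;           /* stands in for the condvar futex */
      uint32_t mutex_word = 0;          /* stands in for the PI mutex word */
      struct timespec past = { 0, 0 };  /* absolute timeout, long expired */

      /* The expected value 1 deliberately differs from cond_word, so the
         call should return immediately instead of waiting.  */
      long ret = syscall (SYS_futex, &cond_word, FUTEX_WAIT_REQUEUE_PI,
                          1, &past, &mutex_word, 0);
      cached = (ret == -1 && errno == ENOSYS) ? -1 : 1;
    }
  return cached > 0;
}
```

With such a check the condvar code could choose between the requeue_pi
path and the plain lll_futex_wait path at run time instead of at compile
time.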


Looking at the kernel's kernel/futex.c:

 long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 u32 __user *uaddr2, u32 val2, u32 val3)
 {
...
   

Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-15 Thread Marc Kleine-Budde
On 06/15/2015 11:49 AM, Aurelien Jarno wrote:
>> Given that stable has kernel 3.16 the probably easiest solution would be
>> to bump the dependency to >= 3.16?!
>
> No, it's not a solution, just a workaround. For stretch we will switch
> the minimum kernel to 3.2, that is the one from wheezy. We have always
> done it this way so that people can use chroots with different
> distributions, and we still have complaints that we don't allow old
> enough kernel versions.
>
> The solution would be to backport the upstream fix, but I can't find it
> in current git, nor in the bug report. Could you please point me to it?

This bug was caused by glibc commit (during glibc-2.20 development):

47c5adebd2c8 Correct robust mutex /
 PI futex kernel assumptions (bug 9894).

Instead of applying your proposed patch
(https://sourceware.org/bugzilla/attachment.cgi?id=3778&action=diff) to
enable the PI and robust mutex on kernels >= 2.6.28, PI and robust
mutexes were completely disabled for ARM.

Later, during the glibc-2.21 development this commit:

03d41216fe09 arm: Re-enable PI futex support for
 ARM kernels >= 3.14.3

was added. So you need --enable-kernel=3.14.3 for glibc-2.21 for PI and
robust mutexes.

This means glibc-2.20 doesn't support PI/robust mutex on ARM at all,
glibc-2.21 needs --enable-kernel=3.14.3.

Indeed there is no upstream fix if you want PI/robust mutexes on
glibc-2.21/ARM/--enable-kernel=3.2

Marc





Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-15 Thread Aurelien Jarno
On 2015-06-15 12:36, Marc Kleine-Budde wrote:
> On 06/15/2015 11:49 AM, Aurelien Jarno wrote:
>>> Given that stable has kernel 3.16 the probably easiest solution would be
>>> to bump the dependency to >= 3.16?!
>>
>> No, it's not a solution, just a workaround. For stretch we will switch
>> the minimum kernel to 3.2, that is the one from wheezy. We have always
>> done it this way so that people can use chroots with different
>> distributions, and we still have complaints that we don't allow old
>> enough kernel versions.
>>
>> The solution would be to backport the upstream fix, but I can't find it
>> in current git, nor in the bug report. Could you please point me to it?
>
> This bug was caused by glibc commit (during glibc-2.20 development):
>
> 47c5adebd2c8 Correct robust mutex /
>  PI futex kernel assumptions (bug 9894).
>
> Instead of applying your proposed patch
> (https://sourceware.org/bugzilla/attachment.cgi?id=3778&action=diff) to
> enable the PI and robust mutex on kernels >= 2.6.28, PI and robust
> mutexes were completely disabled for ARM.
>
> Later, during the glibc-2.21 development this commit:
>
> 03d41216fe09 arm: Re-enable PI futex support for
>  ARM kernels >= 3.14.3
>
> was added. So you need --enable-kernel=3.14.3 for glibc-2.21 for PI and
> robust mutexes.
>
> This means glibc-2.20 doesn't support PI/robust mutex on ARM at all,
> glibc-2.21 needs --enable-kernel=3.14.3.

That's not correct. The __ASSUME_* defines tell that the feature is
there unconditionally. When they are not defined, the code should probe
for the feature before using it, and if it is not present, fall back to
compatibility code. I guess there is a bug in the probe or the
compatibility code, and it's what we want to fix.
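
The convention described above can be sketched in a few lines. Everything
in this example is hypothetical and only shows the pattern, it is not
glibc code: EXAMPLE_ASSUME_FEATURE plays the role of an __ASSUME_* define,
probe_feature() is a stand-in that always reports the feature as missing,
and probe_calls exists only to show that the probe runs once.

```c
#include <stdbool.h>

/* Counts how often the stand-in probe ran (illustration only).  */
static int probe_calls;

/* Stand-in for a real kernel probe; always reports "missing" here.  */
static bool
probe_feature (void)
{
  probe_calls++;
  return false;
}

static bool
feature_available (void)
{
#ifdef EXAMPLE_ASSUME_FEATURE
  /* The minimum supported kernel guarantees the feature: no probe.  */
  return true;
#else
  /* 0 = not probed yet, 1 = present, -1 = missing.  */
  static int cached;
  if (cached == 0)
    cached = probe_feature () ? 1 : -1;
  return cached > 0;
#endif
}

/* Picks the code path the way the convention intends: use the feature
   if it is there, otherwise fall back to compatibility code.  */
const char *
choose_path (void)
{
  return feature_available () ? "feature" : "fallback";
}
```

The failure mode under discussion would correspond to code that checks
only the compile-time define and has no probe behind it.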

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net




Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-15 Thread Marc Kleine-Budde
On 06/15/2015 12:59 PM, Aurelien Jarno wrote:
>> This means glibc-2.20 doesn't support PI/robust mutex on ARM at all,
>> glibc-2.21 needs --enable-kernel=3.14.3.
>
> That's not correct.

I was describing the situation as it is, with an unpatched glibc.

tested on kernel-4.0:
Debian experimental glibc-2.21              -- BAD
glibc-2.20 --enable-kernel=3.0              -- BAD
glibc-2.20 + HACK[1] --enable-kernel=3.0    -- GOOD
glibc-2.21 --enable-kernel=3.0              -- BAD
glibc-2.21 --enable-kernel=3.15             -- GOOD

> The __ASSUME_* defines tell that the feature is
> there unconditionally. When they are not defined, the code should probe
> for the feature before using it, and if it is not present, fall back to
> compatibility code. I guess there is a bug in the probe or the
> compatibility code, and it's what we want to fix.

Sounds reasonable, I tried to reopen the glibc bug 9894 that caused this
regression but I had no luck
(https://sourceware.org/bugzilla/show_bug.cgi?id=9894#c20)

Marc

[1]
index e755741de60b..a56290ce4ca4 100644
--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h
@@ -37,6 +37,6 @@
 /* The ARM kernel may or may not support
futex_atomic_cmpxchg_inatomic, depending on kernel
configuration.  */
-#undef __ASSUME_FUTEX_LOCK_PI
-#undef __ASSUME_REQUEUE_PI
-#undef __ASSUME_SET_ROBUST_LIST
+//#undef __ASSUME_FUTEX_LOCK_PI
+//#undef __ASSUME_REQUEUE_PI
+//#undef __ASSUME_SET_ROBUST_LIST





Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-15 Thread Aurelien Jarno
On 2015-06-15 10:07, Uwe Kleine-König wrote:
> Package: libc6
> Version: 2.21-0experimental0
> Severity: normal
> Tags: upstream fixed-upstream
> Forwarded: https://sourceware.org/bugzilla/show_bug.cgi?id=18463
>
> Hello,
>
> the attached program fails with an error on ARM platforms. (I only tested
> armhf, but I think armel is affected, too.)
>
> The problem is that pthread_mutex_unlock() in the main thread returns
> EPERM which it shouldn't. This only happens when using
> pthread_cond_broadcast, at least two threads and PTHREAD_PRIO_INHERIT.
>
> For more details refer to the upstream bug[1].
>
> A possible solution documented in the upstream bug report is to pass
> --enable-kernel=3.15 (or something bigger).
> According to
> https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=armhf&ver=2.21-0experimental0&stamp=1426842906
> glibc is built using --enable-kernel=2.6.32, I guess that's from the
> setting MIN_KERNEL_SUPPORTED in debian/sysdeps/linux.mk.
>
> Given that stable has kernel 3.16 the probably easiest solution would be
> to bump the dependency to >= 3.16?!

No, it's not a solution, just a workaround. For stretch we will switch
the minimum kernel to 3.2, that is the one from wheezy. We have always
done it this way so that people can use chroots with different
distributions, and we still have complaints that we don't allow old
enough kernel versions.

The solution would be to backport the upstream fix, but I can't find it
in current git, nor in the bug report. Could you please point me to it?

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net





Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-15 Thread Uwe Kleine-König
Package: libc6
Version: 2.21-0experimental0
Severity: normal
Tags: upstream fixed-upstream
Forwarded: https://sourceware.org/bugzilla/show_bug.cgi?id=18463

Hello,

the attached program fails with an error on ARM platforms. (I only tested
armhf, but I think armel is affected, too.)

The problem is that pthread_mutex_unlock() in the main thread returns
EPERM which it shouldn't. This only happens when using
pthread_cond_broadcast, at least two threads and PTHREAD_PRIO_INHERIT.

For more details refer to the upstream bug[1].

A possible solution documented in the upstream bug report is to pass
--enable-kernel=3.15 (or something bigger).
According to
https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=armhf&ver=2.21-0experimental0&stamp=1426842906
glibc is built using --enable-kernel=2.6.32, I guess that's from the
setting MIN_KERNEL_SUPPORTED in debian/sysdeps/linux.mk.

Given that stable has kernel 3.16 the probably easiest solution would be
to bump the dependency to >= 3.16?!

Best regards
Uwe

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=18463

-- System Information:
Debian Release: 8.1
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: armhf (armv7l)

Kernel: Linux 3.16.0-4-armmp (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libc6 depends on:
ii  libgcc1  1:4.9.2-10

libc6 recommends no packages.

Versions of packages libc6 suggests:
ii  debconf [debconf-2.0]  1.5.56
pn  glibc-doc              <none>
ii  locales                2.21-0experimental0

-- debconf information excluded





Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

2015-06-15 Thread Marc Kleine-Budde
> the attached program fails with an error on ARM platforms. (I only tested
> armhf, but I think armel is affected, too.)

armel is affected, too.

regards,
Marc


