Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
On 08/29/2015 12:42 AM, Aurelien Jarno wrote:
> On 2015-07-17 12:13, Marc Kleine-Budde wrote:
>> Ping.
>>
>> Any progress on this?
>
> It would be nice to have the upstream opinion about that, or even get
> this patch merged.

It boils down to upstream not taking the patch unless I fix the whole mess around __ASSUME_REQUEUE_PI on all architectures, which has been broken since it was introduced. There is also an alternative implementation for the mutex on the horizon, which doesn't have that issue.

Marc
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
On 2015-07-17 12:13, Marc Kleine-Budde wrote:
> Ping.
>
> Any progress on this?

It would be nice to have the upstream opinion about that, or even get this patch merged.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                 http://www.aurel32.net
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
Ping.

Any progress on this?

Marc

On 06/26/2015 06:20 PM, Marc Kleine-Budde wrote:
> Subject: [PATCH] ARM: fix PI futex breakge - glibc bug 18463
> [...]
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
From eebb9e9abd3405fd72b7e7527132b605e406e83e Mon Sep 17 00:00:00 2001
From: Marc Kleine-Budde <m...@pengutronix.de>
Date: Sat, 13 Jun 2015 19:25:07 +0200
Subject: [PATCH] ARM: fix PI futex breakge - glibc bug 18463

This patch fixes glibc bug 18463:

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

The problem is caused by:

47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894).

Signed-off-by: Marc Kleine-Budde <m...@pengutronix.de>
---
 sysdeps/unix/sysv/linux/arm/kernel-features.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h b/sysdeps/unix/sysv/linux/arm/kernel-features.h
index e755741de60b..0d9f8910d650 100644
--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h
@@ -38,5 +38,4 @@
    futex_atomic_cmpxchg_inatomic, depending on kernel
    configuration.  */
 #undef __ASSUME_FUTEX_LOCK_PI
-#undef __ASSUME_REQUEUE_PI
 #undef __ASSUME_SET_ROBUST_LIST

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-     |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
On 06/15/2015 12:59 PM, Aurelien Jarno wrote:
>> This means glibc-2.20 doesn't support PI/robust mutex on ARM at all,
>> glibc-2.21 needs --enable-kernel=3.14.3.
>
> That's not correct. The __ASSUME_* defines tell that the feature is
> there unconditionally. When they are not defined, the code should probe
> for the feature before using it, and if it is not present, fall back to
> compatibility code. I guess there is a bug in the probe or the
> compatibility code, and it's what we want to fix.

We had another, deeper look at the glibc and kernel code. For PI and robust mutexes there is indeed runtime detection, but no fallback code. A fallback is not possible without kernel support anyway. Instead a proper error code is returned.

For example, for a PI mutex:

    int
    __pthread_mutex_init (mutex, mutexattr)
         pthread_mutex_t *mutex;
         const pthread_mutexattr_t *mutexattr;
    {
      ...
        case PTHREAD_PRIO_INHERIT << PTHREAD_MUTEXATTR_PROTOCOL_SHIFT:
          if (__glibc_unlikely (prio_inherit_missing ()))
            return ENOTSUP;
          break;

and prio_inherit_missing() is:

    static bool
    prio_inherit_missing (void)
    {
    #ifdef __NR_futex
    # ifndef __ASSUME_FUTEX_LOCK_PI
      static int tpi_supported;
      if (__glibc_unlikely (tpi_supported == 0))
        {
          int lock = 0;
          INTERNAL_SYSCALL_DECL (err);
          int ret = INTERNAL_SYSCALL (futex, err, 4, &lock, FUTEX_UNLOCK_PI, 0, 0);

This is the runtime detection: if the syscall returns ENOSYS, the kernel has no support for PI mutexes.

          assert (INTERNAL_SYSCALL_ERROR_P (ret, err));
          tpi_supported = INTERNAL_SYSCALL_ERRNO (ret, err) == ENOSYS ? -1 : 1;
        }
      return __glibc_unlikely (tpi_supported < 0);
    # endif
      return false;
    #endif
      return true;
    }

So far so good. We have runtime detection for FUTEX_LOCK_PI and SET_ROBUST_LIST (implemented likewise), but the problem is REQUEUE_PI.
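The same kind of probe can be tried from an ordinary program with the raw futex syscall. What follows is only a minimal sketch for Linux, not glibc code: demo_probe_pi_futex() is a hypothetical helper, and it assumes that syscall(2) with SYS_futex and FUTEX_UNLOCK_PI from <linux/futex.h> is available on the build machine. Like the glibc function quoted above, it only distinguishes ENOSYS from every other error.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not a glibc function): returns 1 if the kernel
   accepts PI futex operations, -1 if it answers ENOSYS.  */
static int
demo_probe_pi_futex (void)
{
  static int cached;            /* 0 means "not probed yet" */

  if (cached == 0)
    {
      int lock = 0;             /* futex word that nobody owns */
      long ret = syscall (SYS_futex, &lock, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);

      /* Unlocking a futex we do not own is expected to fail.  Only
         ENOSYS means that the operation itself is missing.  */
      cached = (ret == -1 && errno == ENOSYS) ? -1 : 1;
    }

  assert (cached == 1 || cached == -1);
  return cached;
}
```

A caller would treat -1 the way __pthread_mutex_init treats a missing feature, that is, report ENOTSUP instead of creating a PI mutex. The result is cached without locking, so two threads may both run the probe once; both arrive at the same value.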
The way PI mutexes are implemented you _have_ to use REQUEUE_PI, quoting the kernel's ./Documentation/futex-requeue-pi.txt:

> In order to ensure the rt_mutex has an owner if it has waiters, it
> is necessary for both the requeue code, as well as the waiting code,
> to be able to acquire the rt_mutex before returning to user space.
> The requeue code cannot simply wake the waiter and leave it to
> acquire the rt_mutex as it would open a race window between the
> requeue call returning to user space and the waiter waking and
> starting to run. This is especially true in the uncontended case.
>
> The solution involves two new rt_mutex helper routines,
> rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which
> allow the requeue code to acquire an uncontended rt_mutex on behalf
> of the waiter and to enqueue the waiter on a contended rt_mutex.
> Two new system calls provide the kernel<->user interface to
> requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI.
>
> FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait()
> and pthread_cond_timedwait()) to block on the initial futex and wait
> to be requeued to a PI-aware futex. The implementation is the
> result of a high-speed collision between futex_wait() and
> futex_lock_pi(), with some extra logic to check for the additional
> wake-up scenarios.

Looking at pthread_cond_wait:

    int
    __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
    {
    [...]
    #if (defined lll_futex_wait_requeue_pi \
         && defined __ASSUME_REQUEUE_PI)
          /* If pi_flag remained 1 then it means that we had the lock and the mutex
             but a spurious waker raced ahead of us.  Give back the mutex before
             going into wait again.  */
          if (pi_flag)
            {
              __pthread_mutex_cond_lock_adjust (mutex);
              __pthread_mutex_unlock_usercnt (mutex, 0);
            }
          pi_flag = USE_REQUEUE_PI (mutex);

          if (pi_flag)
            {
              err = lll_futex_wait_requeue_pi (&cond->__data.__futex,
                                               futex_val, &mutex->__data.__lock,
                                               pshared);

Here's the requeue_pi syscall, but no runtime check for it. See __ASSUME_REQUEUE_PI above.
If __ASSUME_REQUEUE_PI is not present this code is not compiled in, resulting in the breakage we're seeing.

              pi_flag = (err == 0);
            }
          else
    #endif
            /* Wait until woken by signal or broadcast.  */
            lll_futex_wait (&cond->__data.__futex, futex_val, pshared);

Looking at the kernel's kernel/futex.c:

    long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
                  u32 __user *uaddr2, u32 val2, u32 val3)
    {
            ...
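The effect of guarding the PI wait path purely at compile time can be reproduced in isolation. The sketch below is illustrative only: DEMO_HAVE_WAIT_REQUEUE_PI and DEMO_ASSUME_REQUEUE_PI are hypothetical stand-ins for lll_futex_wait_requeue_pi being defined and for __ASSUME_REQUEUE_PI, and demo_wait_path() models nothing but the choice of branch. The second macro is left undefined on purpose, matching the ARM configuration discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, see the text above.  */
#define DEMO_HAVE_WAIT_REQUEUE_PI 1
/* #define DEMO_ASSUME_REQUEUE_PI 1 */

enum demo_path
{
  DEMO_PLAIN_WAIT,              /* stands for lll_futex_wait */
  DEMO_WAIT_REQUEUE_PI          /* stands for lll_futex_wait_requeue_pi */
};

/* Returns the wait path a condition-variable waiter would take.  */
static enum demo_path
demo_wait_path (bool mutex_is_pi)
{
#if (defined DEMO_HAVE_WAIT_REQUEUE_PI \
     && defined DEMO_ASSUME_REQUEUE_PI)
  if (mutex_is_pi)
    return DEMO_WAIT_REQUEUE_PI;
#else
  (void) mutex_is_pi;           /* the PI branch is not compiled in */
#endif
  return DEMO_PLAIN_WAIT;
}

/* With the macro undefined even a PI mutex ends up on the plain path,
   and nothing at run time can change that.  */
static void
demo_selfcheck (void)
{
  assert (demo_wait_path (true) == DEMO_PLAIN_WAIT);
  assert (demo_wait_path (false) == DEMO_PLAIN_WAIT);
}
```

Uncommenting the DEMO_ASSUME_REQUEUE_PI line makes demo_wait_path (true) return DEMO_WAIT_REQUEUE_PI. The point of the sketch is that there is no third variant in which the decision is taken by a runtime probe.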
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
On 06/15/2015 11:49 AM, Aurelien Jarno wrote:
>> Given that stable has kernel 3.16 the probably easiest solution would be
>> to bump the dependency to >= 3.16?!
>
> No, it's not a solution, just a workaround. For stretch we will switch
> the minimum kernel to 3.2, that is the one from wheezy. We have always
> done it that way so that people can use chroots with different
> distributions, and we still get complaints that we don't allow old
> enough kernel versions.
>
> The solution would be to backport the upstream fix, but I can't find it
> in current git, nor in the bug report. Could you please point me to it?

This bug was caused by this glibc commit (during glibc-2.20 development):

    47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894).

Instead of applying your proposed patch (https://sourceware.org/bugzilla/attachment.cgi?id=3778&action=diff) to enable PI and robust mutexes on kernels >= 2.6.28, PI and robust mutexes were completely disabled for ARM.

Later, during the glibc-2.21 development, this commit was added:

    03d41216fe09 arm: Re-enable PI futex support for ARM kernels >= 3.14.3

So you need --enable-kernel=3.14.3 for glibc-2.21 to get PI and robust mutexes.

This means glibc-2.20 doesn't support PI/robust mutex on ARM at all, glibc-2.21 needs --enable-kernel=3.14.3.

Indeed there is no upstream fix if you want PI/robust mutexes on glibc-2.21/ARM/--enable-kernel=3.2.

Marc
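The dependency on --enable-kernel can be made concrete with a small sketch. It assumes the configured minimum kernel is packed into one integer as major, minor and patch level (so 3.14.3 becomes 0x030e03). DEMO_KERNEL_VERSION and demo_requeue_pi_assumed() are hypothetical names used only for illustration, and the check is done at run time here so that several values can be compared, whereas glibc decides this with the preprocessor at build time.

```c
#include <assert.h>

/* Pack a kernel version into one integer: 3.14.3 becomes 0x030e03.  */
#define DEMO_KERNEL_VERSION(major, minor, patch) \
  (((major) << 16) | ((minor) << 8) | (patch))

/* Hypothetical model of the glibc-2.21 ARM behaviour described above:
   the PI/robust features are only assumed when the configured minimum
   kernel is at least 3.14.3.  Returns 1 if assumed, 0 otherwise.  */
static int
demo_requeue_pi_assumed (int configured_min_kernel)
{
  assert (configured_min_kernel >= 0);
  return configured_min_kernel >= DEMO_KERNEL_VERSION (3, 14, 3);
}
```

With this model a build configured for 2.6.32 or 3.2 never assumes REQUEUE_PI, no matter which kernel it later runs on, while a build configured for 3.14.3 or newer does.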
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
On 2015-06-15 12:36, Marc Kleine-Budde wrote:
> On 06/15/2015 11:49 AM, Aurelien Jarno wrote:
>>> Given that stable has kernel 3.16 the probably easiest solution would be
>>> to bump the dependency to >= 3.16?!
>>
>> No, it's not a solution, just a workaround. For stretch we will switch
>> the minimum kernel to 3.2, that is the one from wheezy. We have always
>> done it that way so that people can use chroots with different
>> distributions, and we still get complaints that we don't allow old
>> enough kernel versions.
>>
>> The solution would be to backport the upstream fix, but I can't find it
>> in current git, nor in the bug report. Could you please point me to it?
>
> This bug was caused by this glibc commit (during glibc-2.20 development):
>
>     47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894).
>
> Instead of applying your proposed patch
> (https://sourceware.org/bugzilla/attachment.cgi?id=3778&action=diff) to
> enable PI and robust mutexes on kernels >= 2.6.28, PI and robust mutexes
> were completely disabled for ARM.
>
> Later, during the glibc-2.21 development, this commit was added:
>
>     03d41216fe09 arm: Re-enable PI futex support for ARM kernels >= 3.14.3
>
> So you need --enable-kernel=3.14.3 for glibc-2.21 to get PI and robust
> mutexes.
>
> This means glibc-2.20 doesn't support PI/robust mutex on ARM at all,
> glibc-2.21 needs --enable-kernel=3.14.3.

That's not correct. The __ASSUME_* defines tell that the feature is there unconditionally. When they are not defined, the code should probe for the feature before using it, and if it is not present, fall back to compatibility code. I guess there is a bug in the probe or the compatibility code, and it's what we want to fix.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                 http://www.aurel32.net
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
On 06/15/2015 12:59 PM, Aurelien Jarno wrote:
>> This means glibc-2.20 doesn't support PI/robust mutex on ARM at all,
>> glibc-2.21 needs --enable-kernel=3.14.3.
>
> That's not correct.

I was describing the situation as it is, with an unpatched glibc.

Tested on kernel-4.0:

    Debian experimental glibc-2.21               -- BAD
    glibc-2.20           --enable-kernel=3.0     -- BAD
    glibc-2.20 + HACK[1] --enable-kernel=3.0     -- GOOD
    glibc-2.21           --enable-kernel=3.0     -- BAD
    glibc-2.21           --enable-kernel=3.15    -- GOOD

> The __ASSUME_* defines tell that the feature is there unconditionally.
> When they are not defined, the code should probe for the feature before
> using it, and if it is not present, fall back to compatibility code. I
> guess there is a bug in the probe or the compatibility code, and it's
> what we want to fix.

Sounds reasonable. I tried to reopen glibc bug 9894, which caused this regression, but I had no luck (https://sourceware.org/bugzilla/show_bug.cgi?id=9894#c20).

Marc

[1]
index e755741de60b..a56290ce4ca4 100644
--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h
@@ -37,6 +37,6 @@
 /* The ARM kernel may or may not support
    futex_atomic_cmpxchg_inatomic, depending on kernel
    configuration.  */
-#undef __ASSUME_FUTEX_LOCK_PI
-#undef __ASSUME_REQUEUE_PI
-#undef __ASSUME_SET_ROBUST_LIST
+//#undef __ASSUME_FUTEX_LOCK_PI
+//#undef __ASSUME_REQUEUE_PI
+//#undef __ASSUME_SET_ROBUST_LIST
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
On 2015-06-15 10:07, Uwe Kleine-König wrote:
> Package: libc6
> Version: 2.21-0experimental0
> Severity: normal
> Tags: upstream fixed-upstream
> Forwarded: https://sourceware.org/bugzilla/show_bug.cgi?id=18463
>
> Hello,
>
> the attached program fails with an error on ARM platforms. (I only
> tested armhf, but I think armel is affected, too.) The problem is that
> pthread_mutex_unlock() in the main thread returns EPERM which it
> shouldn't. This only happens when using pthread_cond_broadcast, at least
> two threads and PTHREAD_PRIO_INHERIT.
>
> For more details refer to the upstream bug[1]. A possible solution
> documented in the upstream bug report is to pass --enable-kernel=3.15
> (or something bigger). According to
>
>     https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=armhf&ver=2.21-0experimental0&stamp=1426842906
>
> glibc is built using --enable-kernel=2.6.32, I guess that's from the
> setting MIN_KERNEL_SUPPORTED in debian/sysdeps/linux.mk.
>
> Given that stable has kernel 3.16 the probably easiest solution would be
> to bump the dependency to >= 3.16?!

No, it's not a solution, just a workaround. For stretch we will switch the minimum kernel to 3.2, that is the one from wheezy. We have always done it that way so that people can use chroots with different distributions, and we still get complaints that we don't allow old enough kernel versions.

The solution would be to backport the upstream fix, but I can't find it in current git, nor in the bug report. Could you please point me to it?

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                 http://www.aurel32.net

-- 
To UNSUBSCRIBE, email to debian-glibc-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20150615094936.ga4...@aurel32.net
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
Package: libc6
Version: 2.21-0experimental0
Severity: normal
Tags: upstream fixed-upstream
Forwarded: https://sourceware.org/bugzilla/show_bug.cgi?id=18463

Hello,

the attached program fails with an error on ARM platforms. (I only tested armhf, but I think armel is affected, too.) The problem is that pthread_mutex_unlock() in the main thread returns EPERM which it shouldn't. This only happens when using pthread_cond_broadcast, at least two threads and PTHREAD_PRIO_INHERIT.

For more details refer to the upstream bug[1]. A possible solution documented in the upstream bug report is to pass --enable-kernel=3.15 (or something bigger). According to

    https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=armhf&ver=2.21-0experimental0&stamp=1426842906

glibc is built using --enable-kernel=2.6.32, I guess that's from the setting MIN_KERNEL_SUPPORTED in debian/sysdeps/linux.mk.

Given that stable has kernel 3.16 the probably easiest solution would be to bump the dependency to >= 3.16?!

Best regards
Uwe

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=18463

-- System Information:
Debian Release: 8.1
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: armhf (armv7l)

Kernel: Linux 3.16.0-4-armmp (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libc6 depends on:
ii  libgcc1  1:4.9.2-10

libc6 recommends no packages.

Versions of packages libc6 suggests:
ii  debconf [debconf-2.0]  1.5.56
pn  glibc-doc              <none>
ii  locales                2.21-0experimental0

-- debconf information excluded

Archive: https://lists.debian.org/20150615080703.834.97516.report...@crater.defre.kleine-koenig.org
Bug#788799: libc6: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
> the attached program fails with an error on ARM platforms. (I only
> tested armhf, but I think armel is affected, too.)

armel is affected, too.

regards,
Marc