Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Fri, Jun 04, 2010 at 11:25:28AM +0200, Aurelien Jarno wrote:
> Aurelien Jarno wrote:
> > On Thu, Jun 03, 2010 at 10:09:45PM +0300, Rémi Denis-Courmont wrote:
> > > On Thursday 3 June 2010 22:00:13, Aurelien Jarno wrote:
> > > > I have found a machine with almost the same CPU, the only difference being the speed (3.00 GHz instead of 2.80 GHz). I am unable to reproduce the problem, I have run the testcase more than 20 times over last night.
> > >
> > > With SMT (HyperThread) support?
> >
> > Yes, with HyperThreading enabled.
> >
> > > > Maybe the problem is actually not in the GNU libc. What kernel are you running?
> > >
> > > Normally, I use upstream 2.6.32.15 at the moment. But I also hit the bug with Debian 2.6.32-5-686.
> >
> > I tried on a 2.6.26 kernel, I'll try to reproduce it with this kernel.
>
> I tried on a 2.6.32-5-686 kernel, and it hasn't failed in more than 30 loops. There is probably something different on your system causing the issue.

I have modified the testcase a bit so that it runs in a loop, and I removed all timing functions (see attached file).

I am able to reproduce the problem in some conditions:
- It fails after between 20 and 3 million iterations on a dual-core i386 CPU in lenny, squeeze and sid.
- It never fails on an HT CPU (tried P4 and Atom).
- It never fails when pinned on a single CPU using taskset.
- It never fails on amd64.
- It fails in lenny, testing and unstable.
- It seems to fail more quickly in a KVM instance (probably more timing variation).

This seems to confirm there is a race condition, but one that is very difficult to reproduce. My guess is that a P4 CPU running at 2.8 GHz with HT enabled has the perfect timing to reproduce the bug.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net

-- 
To UNSUBSCRIBE, email to debian-bugs-rc-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Sat, Jun 05, 2010 at 06:39:24PM +0200, Aurelien Jarno wrote:
> On Fri, Jun 04, 2010 at 11:25:28AM +0200, Aurelien Jarno wrote:
> > Aurelien Jarno wrote:
> > > On Thu, Jun 03, 2010 at 10:09:45PM +0300, Rémi Denis-Courmont wrote:
> > > > On Thursday 3 June 2010 22:00:13, Aurelien Jarno wrote:
> > > > > I have found a machine with almost the same CPU, the only difference being the speed (3.00 GHz instead of 2.80 GHz). I am unable to reproduce the problem, I have run the testcase more than 20 times over last night.
> > > >
> > > > With SMT (HyperThread) support?
> > >
> > > Yes, with HyperThreading enabled.
> > >
> > > > > Maybe the problem is actually not in the GNU libc. What kernel are you running?
> > > >
> > > > Normally, I use upstream 2.6.32.15 at the moment. But I also hit the bug with Debian 2.6.32-5-686.
> > >
> > > I tried on a 2.6.26 kernel, I'll try to reproduce it with this kernel.
> >
> > I tried on a 2.6.32-5-686 kernel, and it hasn't failed in more than 30 loops. There is probably something different on your system causing the issue.
>
> I have modified the testcase a bit so that it runs in a loop, and I removed all timing functions (see attached file).
>
> I am able to reproduce the problem in some conditions:

This time it is.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net

/* gcc -O2 -Wall -lpthread condfail.c */
#define _GNU_SOURCE 1
#undef NDEBUG
#include <pthread.h>
#include <time.h>
#include <assert.h>
#include <stdio.h>

static pthread_cond_t wait = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t lock = PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP;
static long long int i=0;

/* Cleanup handler: the mutex must be held again when this runs. */
static void cleanup_lock(void *lock)
{
	int val;

	i++;
	val = pthread_mutex_unlock(lock);
	if (val != 0) {
		printf("failed after %lli iterations\n", i);
		assert (0);
	}
}

/* Waiter thread: blocks in pthread_cond_wait() until it is cancelled. */
static void *entry(void *barrier)
{
	pthread_mutex_lock(&lock);
	pthread_cleanup_push(cleanup_lock, &lock);
	pthread_barrier_wait(barrier);
	for (;;)
		pthread_cond_wait(&wait, &lock);
	pthread_cleanup_pop(0);
	assert(0);
}

int main (void)
{
	for(;;) {
		pthread_t th;
		pthread_barrier_t barrier;

		pthread_barrier_init(&barrier, NULL, 2);
		pthread_create(&th, NULL, entry, &barrier);
		pthread_barrier_wait(&barrier);
		pthread_barrier_destroy(&barrier);
		pthread_cancel(th);
		/* Take the mutex between the cancel and the cleanup handler. */
		pthread_mutex_lock(&lock);
		pthread_mutex_unlock(&lock);
		pthread_join(th, NULL);
	}
	return 0;
}
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
Aurelien Jarno wrote:
> On Thu, Jun 03, 2010 at 10:09:45PM +0300, Rémi Denis-Courmont wrote:
> > On Thursday 3 June 2010 22:00:13, Aurelien Jarno wrote:
> > > I have found a machine with almost the same CPU, the only difference being the speed (3.00 GHz instead of 2.80 GHz). I am unable to reproduce the problem, I have run the testcase more than 20 times over last night.
> >
> > With SMT (HyperThread) support?
>
> Yes, with HyperThreading enabled.
>
> > > Maybe the problem is actually not in the GNU libc. What kernel are you running?
> >
> > Normally, I use upstream 2.6.32.15 at the moment. But I also hit the bug with Debian 2.6.32-5-686.
>
> I tried on a 2.6.26 kernel, I'll try to reproduce it with this kernel.

I tried on a 2.6.32-5-686 kernel, and it hasn't failed in more than 30 loops. There is probably something different on your system causing the issue.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Thursday 3 June 2010 00:32:14, Aurelien Jarno wrote:
> Does it mean it's a lot more difficult to reproduce it with this version?

Today the test case failed 3 out of 3 times already. My VLC debug builds started triggering pthread_mutex_unlock() errors pseudo-randomly again. I had not observed this behaviour since you presumably fixed the bug. Not a single occurrence in those many months.

> Have you tried to run so many iterations with the version built with gcc-4.3?

The test case, not that I remember. VLC debug builds, yes.

% cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 3
model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping        : 4
cpu MHz         : 2800.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pebs bts pni dtes64 monitor ds_cpl cid xtpr
bogomips        : 5585.95
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 32 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 3
model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping        : 4
cpu MHz         : 2800.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 1
initial apicid  : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pebs bts pni dtes64 monitor ds_cpl cid xtpr
bogomips        : 5586.01
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 32 bits virtual
power management:

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Thu, Jun 03, 2010 at 08:59:22PM +0300, Rémi Denis-Courmont wrote:
> On Thursday 3 June 2010 00:32:14, Aurelien Jarno wrote:
> > Does it mean it's a lot more difficult to reproduce it with this version?
>
> Today the test case failed 3 out of 3 times already. My VLC debug builds started triggering pthread_mutex_unlock() errors pseudo-randomly again. I had not observed this behaviour since you presumably fixed the bug. Not a single occurrence in those many months.
>
> > Have you tried to run so many iterations with the version built with gcc-4.3?
>
> The test case, not that I remember. VLC debug builds, yes.
>
> % cat /proc/cpuinfo
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 15
> model           : 3
> model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
> stepping        : 4

I have found a machine with almost the same CPU, the only difference being the speed (3.00 GHz instead of 2.80 GHz). I am unable to reproduce the problem, I have run the testcase more than 20 times over last night.

Maybe the problem is actually not in the GNU libc. What kernel are you running?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Thursday 3 June 2010 22:00:13, Aurelien Jarno wrote:
> I have found a machine with almost the same CPU, the only difference being the speed (3.00 GHz instead of 2.80 GHz). I am unable to reproduce the problem, I have run the testcase more than 20 times over last night.

With SMT (HyperThread) support?

> Maybe the problem is actually not in the GNU libc. What kernel are you running?

Normally, I use upstream 2.6.32.15 at the moment. But I also hit the bug with Debian 2.6.32-5-686.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Thu, Jun 03, 2010 at 10:09:45PM +0300, Rémi Denis-Courmont wrote:
> On Thursday 3 June 2010 22:00:13, Aurelien Jarno wrote:
> > I have found a machine with almost the same CPU, the only difference being the speed (3.00 GHz instead of 2.80 GHz). I am unable to reproduce the problem, I have run the testcase more than 20 times over last night.
>
> With SMT (HyperThread) support?

Yes, with HyperThreading enabled.

> > Maybe the problem is actually not in the GNU libc. What kernel are you running?
>
> Normally, I use upstream 2.6.32.15 at the moment. But I also hit the bug with Debian 2.6.32-5-686.

I tried on a 2.6.26 kernel, I'll try to reproduce it with this kernel. If I fail, would it be possible to get limited access to this machine to debug the issue?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Tuesday 1 June 2010 20:20:01, Aurelien Jarno wrote:
> I am therefore reopening this bug as it may still be present, though we now have a different version and a different compiler. As I am unable to reproduce the original problem, I am unable to test this new version. Could you please test if version 2.11.1-2 is affected or not?

It is. I hit the failure case after 8074 consecutive iterations.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Wed, Jun 02, 2010 at 11:28:50PM +0300, Rémi Denis-Courmont wrote:
> On Tuesday 1 June 2010 20:20:01, Aurelien Jarno wrote:
> > I am therefore reopening this bug as it may still be present, though we now have a different version and a different compiler. As I am unable to reproduce the original problem, I am unable to test this new version. Could you please test if version 2.11.1-2 is affected or not?
>
> It is. I hit the failure case after 8074 consecutive iterations.

Ok, it's really bad, especially as I don't have a way to debug it... gcc-4.4 miscompiles something in this bug, and gcc-4.3 miscompiles something else in bug#583858...

Could you please give me more details about your CPU (cat /proc/cpuinfo), so that I can try to find a machine with the same CPU?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Wed, Jun 02, 2010 at 11:28:50PM +0300, Rémi Denis-Courmont wrote:
> On Tuesday 1 June 2010 20:20:01, Aurelien Jarno wrote:
> > I am therefore reopening this bug as it may still be present, though we now have a different version and a different compiler. As I am unable to reproduce the original problem, I am unable to test this new version. Could you please test if version 2.11.1-2 is affected or not?
>
> It is. I hit the failure case after 8074 consecutive iterations.

In the original bug report, you said:

> I don't know. It reproduces pretty much 100% here:
>
> % ./a.out
> 1
> 2
> a.out: test.c:18: cleanup_lock: Assertion `val == 0' failed.
> Abandon

Does it mean it's a lot more difficult to reproduce it with this version? Have you tried to run so many iterations with the version built with gcc-4.3?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Processed: Re: Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
Processing commands for cont...@bugs.debian.org:

> unarchive 551903
Bug #551903 {Done: Aurelien Jarno aure...@debian.org} [libc6-i686] libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
Unarchived Bug 551903
> found 551903 2.11.1-2
Bug #551903 {Done: Aurelien Jarno aure...@debian.org} [libc6-i686] libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
Bug Marked as found in versions eglibc/2.11.1-2 and reopened.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
551903: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=551903
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Wed, Oct 21, 2009 at 10:40:03PM +0300, Rémi Denis-Courmont wrote:
> On Wednesday 21 October 2009 22:33:56, you wrote:
> > On Wed, Oct 21, 2009 at 07:11:40PM +0300, Remi Denis-Courmont wrote:
> > > Package: libc6-i686
> > > Version: 2.10.1-1
> > > Severity: critical
> > > Justification: breaks unrelated software
> > >
> > > Hello,
> > >
> > > With the upgrade to 2.10.1, pthread_cond_wait() fails to re-acquire the provided mutex when acting on a deferred cancellation event from another thread. This is seen if (and apparently, only if) another thread acquires the same mutex after cancellation is initiated, but before the cancelled thread executes cancellation cleanup handlers.
> > >
> > > I could not reproduce the problem with plain libc6. It only occurs with libc6-i686 installed. I wrote a simple test case at:
> > > http://www.remlab.net/files/divers/condfail.c
> >
> > This test shows the same behaviour on both the lenny and sid versions, that is it prints 1 and 2, but never triggers an assertion. Are there other conditions for this test to fail?
>
> I don't know. It reproduces pretty much 100% here:
>
> % ./a.out
> 1
> 2
> a.out: test.c:18: cleanup_lock: Assertion `val == 0' failed.
> Abandon
>
> I'm running on a single core SMT (P4/HT namely), so instruction cycle timing might be very different from what an UP or non-SMT SMP gets :(
>
> In any case, the fact that it only occurs with libc6-i686 hints at incorrect use of atomic ops, I guess...

Problems related to atomic ops often come from, or at least are triggered by, gcc changes. I have rebuilt eglibc 2.10.1-2 using gcc-4.3 instead of gcc-4.4. The packages are available on http://temp.aurel32.net/eglibc/

Could you please tell me if you have the same problem with them?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Monday 26 October 2009 19:09:46, Aurelien Jarno wrote:
> Thanks for the test. It's the solution I'll use if I can't find the real problem.
>
> Looking at the recent upstream commits, the problem may be fixed by this commit:
> http://repo.or.cz/w/glibc.git?a=commit;h=e73e694e38b7b222eec3ec5897eb507d88bb8928
>
> As I can't reproduce the problem here, if I build packages with this patch, would it be possible for you to test them?

Yeah sure.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Mon, Oct 26, 2009 at 07:17:49PM +0200, Rémi Denis-Courmont wrote:
> On Monday 26 October 2009 19:09:46, Aurelien Jarno wrote:
> > Thanks for the test. It's the solution I'll use if I can't find the real problem.
> >
> > Looking at the recent upstream commits, the problem may be fixed by this commit:
> > http://repo.or.cz/w/glibc.git?a=commit;h=e73e694e38b7b222eec3ec5897eb507d88bb8928
> >
> > As I can't reproduce the problem here, if I build packages with this patch, would it be possible for you to test them?
>
> Yeah sure.

Forget about it, we already have this patch in our tree :( I'll switch back to gcc 4.3 instead.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Monday 26 October 2009 10:10:45, Aurelien Jarno wrote:
> > I'm running on a single core SMT (P4/HT namely), so instruction cycle timing might be very different from what an UP or non-SMT SMP gets :(
> >
> > In any case, the fact that it only occurs with libc6-i686 hints at incorrect use of atomic ops, I guess...
>
> Problems related to atomic ops often come from, or at least are triggered by, gcc changes. I have rebuilt eglibc 2.10.1-2 using gcc-4.3 instead of gcc-4.4. The packages are available on http://temp.aurel32.net/eglibc/
>
> Could you please tell me if you have the same problem with them?

Good catch. I could not reproduce the problem with 2.10.1-2+gcc4.3, neither with the test case nor with VLC media player.

Thanks!

-- 
Rémi Denis-Courmont
http://www.remlab.net/
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Mon, Oct 26, 2009 at 06:47:45PM +0200, Rémi Denis-Courmont wrote:
> On Monday 26 October 2009 10:10:45, Aurelien Jarno wrote:
> > > I'm running on a single core SMT (P4/HT namely), so instruction cycle timing might be very different from what an UP or non-SMT SMP gets :(
> > >
> > > In any case, the fact that it only occurs with libc6-i686 hints at incorrect use of atomic ops, I guess...
> >
> > Problems related to atomic ops often come from, or at least are triggered by, gcc changes. I have rebuilt eglibc 2.10.1-2 using gcc-4.3 instead of gcc-4.4. The packages are available on http://temp.aurel32.net/eglibc/
> >
> > Could you please tell me if you have the same problem with them?
>
> Good catch. I could not reproduce the problem with 2.10.1-2+gcc4.3, neither with the test case nor with VLC media player.

Thanks for the test. It's the solution I'll use if I can't find the real problem.

Looking at the recent upstream commits, the problem may be fixed by this commit:
http://repo.or.cz/w/glibc.git?a=commit;h=e73e694e38b7b222eec3ec5897eb507d88bb8928

As I can't reproduce the problem here, if I build packages with this patch, would it be possible for you to test them?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
Package: libc6-i686
Version: 2.10.1-1
Severity: critical
Justification: breaks unrelated software

Hello,

With the upgrade to 2.10.1, pthread_cond_wait() fails to re-acquire the provided mutex when acting on a deferred cancellation event from another thread. This is seen if (and apparently, only if) another thread acquires the same mutex after cancellation is initiated, but before the cancelled thread executes cancellation cleanup handlers.

I could not reproduce the problem with plain libc6. It only occurs with libc6-i686 installed. I wrote a simple test case at:
http://www.remlab.net/files/divers/condfail.c

This is a violation of POSIX threads semantics, and a regression from earlier libc6-i686. This also renders VLC media player debug versions almost completely unusable.

Best regards,

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (100, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.30.9 (SMP w/2 CPU cores)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages libc6-i686 depends on:
ii  libc6                         2.10.1-1   GNU C Library: Shared libraries

libc6-i686 recommends no packages.

libc6-i686 suggests no packages.

-- no debconf information
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Wed, Oct 21, 2009 at 07:11:40PM +0300, Remi Denis-Courmont wrote:
> Package: libc6-i686
> Version: 2.10.1-1
> Severity: critical
> Justification: breaks unrelated software
>
> Hello,
>
> With the upgrade to 2.10.1, pthread_cond_wait() fails to re-acquire the provided mutex when acting on a deferred cancellation event from another thread. This is seen if (and apparently, only if) another thread acquires the same mutex after cancellation is initiated, but before the cancelled thread executes cancellation cleanup handlers.
>
> I could not reproduce the problem with plain libc6. It only occurs with libc6-i686 installed. I wrote a simple test case at:
> http://www.remlab.net/files/divers/condfail.c

This test shows the same behaviour on both the lenny and sid versions, that is it prints 1 and 2, but never triggers an assertion. Are there other conditions for this test to fail?

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Wednesday 21 October 2009 22:33:56, you wrote:
> On Wed, Oct 21, 2009 at 07:11:40PM +0300, Remi Denis-Courmont wrote:
> > Package: libc6-i686
> > Version: 2.10.1-1
> > Severity: critical
> > Justification: breaks unrelated software
> >
> > Hello,
> >
> > With the upgrade to 2.10.1, pthread_cond_wait() fails to re-acquire the provided mutex when acting on a deferred cancellation event from another thread. This is seen if (and apparently, only if) another thread acquires the same mutex after cancellation is initiated, but before the cancelled thread executes cancellation cleanup handlers.
> >
> > I could not reproduce the problem with plain libc6. It only occurs with libc6-i686 installed. I wrote a simple test case at:
> > http://www.remlab.net/files/divers/condfail.c
>
> This test shows the same behaviour on both the lenny and sid versions, that is it prints 1 and 2, but never triggers an assertion. Are there other conditions for this test to fail?

I don't know. It reproduces pretty much 100% here:

% ./a.out
1
2
a.out: test.c:18: cleanup_lock: Assertion `val == 0' failed.
Abandon

I'm running on a single core SMT (P4/HT namely), so instruction cycle timing might be very different from what an UP or non-SMT SMP gets :(

In any case, the fact that it only occurs with libc6-i686 hints at incorrect use of atomic ops, I guess...

-- 
Rémi Denis-Courmont
http://www.remlab.net/
Bug#551903: libc6-i686 pthread_cond_wait fails to reacquire mutex upon cancellation
On Wednesday 21 October 2009 22:40:03, Rémi Denis-Courmont wrote:
> % ./a.out
> 1
> 2
> a.out: test.c:18: cleanup_lock: Assertion `val == 0' failed.
> Abandon

P.S.: For what it's worth, val is EPERM here. That's why I assume the lock is not correctly re-acquired.

-- 
Rémi Denis-Courmont
http://www.remlab.net/