Hi,

Last night one of our threaded builds of the trunk hung during make check, in the opal_condition test in test/threads/.

I reran the test about 30-40 times and got it to hang only once. Looking at it in gdb, we get:

(gdb) info threads
  3 Thread 1084229984 (LWP 8450)  0x0000002a95e3bba9 in sched_yield () from /lib64/tls/libc.so.6
  2 Thread 1094719840 (LWP 8451)  0xffffffffff600012 in ?? ()
  1 Thread 182904955328 (LWP 8430)  0x0000002a9567309b in pthread_join () from /lib64/tls/libpthread.so.0
(gdb) thread 2
[Switching to thread 2 (Thread 1094719840 (LWP 8451))]#0 0xffffffffff600012 in ?? ()
(gdb) bt
#0  0xffffffffff600012 in ?? ()
#1  0x0000000000000001 in ?? ()
#2  0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 182904955328 (LWP 8430))]#0 0x0000002a9567309b in pthread_join () from /lib64/tls/libpthread.so.0
(gdb) bt
#0  0x0000002a9567309b in pthread_join () from /lib64/tls/libpthread.so.0
#1  0x0000002a95794a7d in opal_thread_join () from /san/homedirs/mpiteam/mtt-runs/odin/20071204-Nightly/pb_2/installs/Bp80/src/openmpi-1.3a1r16847/opal/.libs/libopen-pal.so.0
#2  0x0000000000401684 in main ()
(gdb) thread 3
[Switching to thread 3 (Thread 1084229984 (LWP 8450))]#0 0x0000002a95e3bba9 in sched_yield () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000002a95e3bba9 in sched_yield () from /lib64/tls/libc.so.6
#1  0x0000000000401216 in thr1_run ()
#2  0x0000002a95672137 in start_thread () from /lib64/tls/libpthread.so.0
#3  0x0000002a95e53113 in clone () from /lib64/tls/libc.so.6
(gdb)


I know this is not much to go on, but I have no idea what is going on. This area of the code has not changed in a long time.
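
In case anyone wants to try reproducing this: since the hang shows up only about once in 30-40 runs, it may help to rerun the test in a loop with a timeout and leave the stuck process alive so gdb can attach to it. A rough sketch in Python follows. The function name run_until_hang, the 60-second timeout, and the ./opal_condition path in the usage comment are assumptions for illustration, not part of the test suite.

```python
import subprocess


def run_until_hang(cmd, max_runs=100, timeout_s=60.0):
    """Run cmd repeatedly until one run exceeds timeout_s.

    Returns (run_number, process) for the first run that timed out,
    with the process left running so a debugger can attach to it.
    Returns None if every run finished within the timeout.
    Only timeouts count as a hang; nonzero exit codes are ignored.
    """
    for run_number in range(1, max_runs + 1):
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # wait() does not kill the child on timeout, so it is still hung.
            return run_number, proc
    return None


# Example (hypothetical path to the test binary):
#   hit = run_until_hang(["./opal_condition"])
#   if hit is not None:
#       print("hung on run", hit[0], "pid", hit[1].pid)
#       # then attach with: gdb -p <pid>
```

Popen.wait(timeout=...) raises TimeoutExpired without terminating the child, which is why the hung process is still around to inspect; remember to kill it by hand afterwards.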

Has anyone else seen something like this? Any ideas what is going on?

Thanks,

Tim
