I am seeing this issue when trying to dump the state of an application.
Here is the state of the application as examined with gdb:
(gdb) info thread
Id Target Id Frame
3 Thread 0x422e1940 (LWP 12723) 0x00002b69d547b23f in mtcp_futex
(uaddr=0x2b69d5687fd8, op=0, val=2, timeout=0x2b69d547d280) at mtcp_futex.h:24
2 Thread 0x42ce2940 (LWP 12724) 0x00002b69d547b23f in mtcp_futex
(uaddr=0x22371848, op=0, val=5, timeout=0x0) at mtcp_futex.h:24
* 1 Thread 0x2b69d8b19f20 (LWP 12722) 0x00002b69d6d86541 in nanosleep ()
from /lib64/libc.so.6
(gdb) bt
#0 0x00002b69d6d86541 in nanosleep () from /lib64/libc.so.6
#1 0x00002b69d6db9ed4 in usleep () from /lib64/libc.so.6
#2 0x000000000040cdbc in AmberSimLoop::simLoop() ()
#3 0x000000000040799b in main ()
(gdb) thread 2
[Switching to thread 2 (Thread 0x42ce2940 (LWP 12724))]
#0 0x00002b69d547b23f in mtcp_futex (uaddr=0x22371848, op=0, val=5,
timeout=0x0) at mtcp_futex.h:24
24 asm volatile ("syscall"
(gdb) bt
#0 0x00002b69d547b23f in mtcp_futex (uaddr=0x22371848, op=0, val=5,
timeout=0x0) at mtcp_futex.h:24
#1 0x00002b69d547b1e4 in mtcp_state_futex (state=0x22371848, func=0, val=5,
timeout=0x0) at mtcp_state.c:47
#2 0x00002b69d54739a7 in stopthisthread (signum=12) at mtcp.c:3474
#3 <signal handler called>
#4 0x00002b69d6dc08a8 in epoll_wait () from /lib64/libc.so.6
#5 0x00000000007d7bcd in AmberPciePortHandler::handleSlaveRequests() ()
#6 0x000000000040ef59 in spawnPcieServer(void*) ()
#7 0x00002b69d5cd373d in start_thread () from /lib64/libpthread.so.0
#8 0x00002b69d546e957 in threadcloned (threadv=0x22371830) at mtcp.c:1231
#9 0x00002b69d6dc04bd in clone () from /lib64/libc.so.6
#10 0x0000000000000000 in ?? ()
(gdb) thread 3
[Switching to thread 3 (Thread 0x422e1940 (LWP 12723))]
#0 0x00002b69d547b23f in mtcp_futex (uaddr=0x2b69d5687fd8, op=0, val=2,
timeout=0x2b69d547d280) at mtcp_futex.h:24
24 asm volatile ("syscall"
(gdb) bt
#0 0x00002b69d547b23f in mtcp_futex (uaddr=0x2b69d5687fd8, op=0, val=2,
timeout=0x2b69d547d280) at mtcp_futex.h:24
#1 0x00002b69d547b1e4 in mtcp_state_futex (state=0x2b69d5687fd8, func=0,
val=2, timeout=0x2b69d547d280) at mtcp_state.c:47
#2 0x00002b69d546fc90 in checkpointhread (dummy=0x0) at mtcp.c:1998
#3 0x00002b69d5cd373d in start_thread () from /lib64/libpthread.so.0
#4 0x00002b69d546e957 in threadcloned (threadv=0x1b98fb70) at mtcp.c:1231
#5 0x00002b69d6dc04bd in clone () from /lib64/libc.so.6
#6 0x0000000000000000 in ?? ()
(gdb)
Apparently, an attempt to take a checkpoint was made while thread 1 was in
nanosleep() and thread 2 was in epoll_wait().
This resulted in a deadlock. Any ideas on what is going on?
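
For anyone who wants to try this outside the application, here is a minimal sketch of the same thread layout: the main thread sleeps in a loop while a second thread blocks in epoll_wait(). It is only an illustration. The function name run_repro, the pipe used as the epoll target, and the loop counts are assumptions and are not taken from the application above. Whether it triggers the same hang when checkpointed is untested.

```python
import os
import select
import threading
import time


def run_repro(iterations, sleep_s=0.001):
    """Sleep in the main thread while a worker blocks in epoll_wait().

    Returns (loops_completed, events_seen_by_worker).
    """
    # Hypothetical stand-in for the application's real descriptors.
    read_fd, write_fd = os.pipe()
    seen = []

    def worker():
        ep = select.epoll()
        ep.register(read_fd, select.EPOLLIN)
        try:
            # No timeout: blocks until the pipe becomes readable.
            seen.extend(ep.poll())
        finally:
            ep.close()

    t = threading.Thread(target=worker)
    t.start()

    # Main loop: repeatedly sleep, like the usleep() call in simLoop().
    loops = 0
    for _ in range(iterations):
        time.sleep(sleep_s)
        loops += 1

    # Wake the worker so the process can exit cleanly.
    os.write(write_fd, b"x")
    t.join()
    os.close(read_fd)
    os.close(write_fd)
    return loops, len(seen)


if __name__ == "__main__":
    # Runs for about a minute, leaving time to request a checkpoint.
    print(run_repro(iterations=60, sleep_s=1.0))
```

Run it under the checkpointer and request a checkpoint while the loop is still going. If it checkpoints cleanly, the problem is more likely specific to the application than to the nanosleep()/epoll_wait() combination itself.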
-Kosta
_______________________________________________
Dmtcp-forum mailing list
Dmtcp-forum@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum