Hi all,
I've run into several segfaults when restarting my application with
dmtcp_restart. The backtrace is shown below (apologies, it is not very
informative):
[workers6730-1:40000] Signal: Segmentation fault (11)
[workers6730-1:40000] Signal code: (128)
[workers6730-1:40000] Failing at address: (nil)
[workers6730-1:40000] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f762e4518d0]
[workers6730-1:40000] [ 1] /lib/x86_64-linux-gnu/libc.so.6(__poll+0x2d)[0x7f762e176d3d]
[workers6730-1:40000] [ 2] /home/john/local/dmtcp_install/lib/dmtcp/libdmtcp_ipc.so(poll+0x31)[0x7f762ffc8c41]
[workers6730-1:40000] [ 3] /home/john/local/openmpi-1.10.2_install/lib/libopen-pal.so.13(+0x6a658)[0x7f762f1e4658]
[workers6730-1:40000] [ 4] /home/john/local/openmpi-1.10.2_install/lib/libopen-pal.so.13(opal_libevent2021_event_base_loop+0x1b2)[0x7f762f1dc2b2]
[workers6730-1:40000] [ 5] mpirun[0x404d24]
[workers6730-1:40000] [ 6] mpirun[0x4035e6]
[workers6730-1:40000] [ 7] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f762e0b8b45]
[workers6730-1:40000] [ 8] mpirun[0x4034f9]
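In case it helps with the frames that carry only an offset and no symbol name (frames 0 and 3 above), here is a rough sketch of how they could be pulled out of the trace for later lookup. The function name extract_frames and the file name backtrace.txt are made up for illustration. Whether addr2line can resolve the offsets depends on the libraries having symbol or debug information, which I have not checked.

```shell
# Hypothetical helper: read a backtrace on stdin and print "library offset"
# for every frame of the form  /path/to/lib.so(+0xOFFSET)[0xADDRESS].
# Frames that already name a symbol, e.g. (__poll+0x2d), are skipped.
extract_frames() {
    sed -n -E 's#(^|.* )(/[^ (]+)\([+](0x[0-9a-fA-F]+)\)\[0x[0-9a-fA-F]+\].*$#\2 \3#p'
}

# Possible usage (assumes the trace was saved to backtrace.txt and that
# binutils' addr2line is installed):
#
#   extract_frames < backtrace.txt | while read -r lib off; do
#       addr2line -f -e "$lib" "$off"
#   done
```

The sed expression only matches a path that is immediately followed by "(+0x...)", so it works whether the frame number and the path sit on the same line or on separate lines.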
I'm using the master branch of DMTCP together with Open MPI version 1.10.2.
The command I use to launch the application under DMTCP is:
~/local/dmtcp_install/bin/dmtcp_launch --interval 6400 --no-gzip -q -q \
    mpirun -np 16 ~/local/mpb_install/bin/mpb-mpi interactive?=false \
    k-split-index=1 k-split-num=3 "/home/john/scratch/working/FA211_3.ctl" \
    >> mpb_out
To restart, I simply run ./dmtcp_restart_script.sh.
I do not launch a separate coordinator; I let dmtcp_launch handle the
coordinator. Most of the time the restart succeeds, but every so often I
get this error.
Any idea what might be going on here? Any advice would be greatly
appreciated.
Thanks in advance!
John
_______________________________________________
Dmtcp-forum mailing list
Dmtcp-forum@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum