Hi Glen,

This looks like a bug in DMTCP's internal bookkeeping: the failed assertion means that the virtual-to-real semaphore ID lookup in the semtimedop wrapper returned -1. The coordinator logs don't tell us much. Could you retry your test after applying the patch below and then configuring and building DMTCP with `--enable-debug`? The logs from dmtcp_launch could help us identify the bug.
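For context on the one-line change in the patch below: appending `(semid)` to the JASSERT should make the failure message include the virtual semaphore ID that had no real counterpart. The sketch below is only a toy illustration of that idea in C++. `VirtualIdTable` and `checkLookup` are hypothetical names made up for this note, and this is not how DMTCP implements the translation.

```cpp
// Toy illustration only; hypothetical names, NOT DMTCP's implementation.
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Maps virtual semaphore IDs (what the application sees) to real IDs
// (what the kernel returned). -1 is reserved to mean "no mapping".
class VirtualIdTable {
 public:
  void add(int virtualId, int realId) {
    assert(realId != -1);  // -1 is the "not found" sentinel
    table_[virtualId] = realId;
  }

  // Returns the real ID, or -1 if the virtual ID was never registered.
  int virtualToReal(int virtualId) const {
    auto it = table_.find(virtualId);
    return it == table_.end() ? -1 : it->second;
  }

 private:
  std::map<int, int> table_;
};

// Returns an empty string if the lookup succeeds. Otherwise returns a
// diagnostic that names the offending virtual ID, which is the extra
// information the patched assertion is meant to provide.
std::string checkLookup(const VirtualIdTable& table, int semid) {
  int realId = table.virtualToReal(semid);
  if (realId != -1) {
    return "";
  }
  std::ostringstream msg;
  msg << "assertion 'realId != -1' failed; semid = " << semid;
  return msg.str();
}
```

With a table that only knows about ID 1001, `checkLookup(table, 7)` reports `semid = 7`, so we can see which ID the application passed in and work out where it came from.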

diff --git a/src/plugin/svipc/sysvipcwrappers.cpp b/src/plugin/svipc/sysvipcwrappers.cpp
index bc91609..661d660 100644
--- a/src/plugin/svipc/sysvipcwrappers.cpp
+++ b/src/plugin/svipc/sysvipcwrappers.cpp
@@ -179,7 +179,7 @@ int semtimedop(int semid, struct sembuf *sops, size_t nsops,
       (timeout != NULL && TIMESPEC_CMP(timeout, &ts_100ms, <))) {
     DMTCP_PLUGIN_DISABLE_CKPT();
     realId = VIRTUAL_TO_REAL_SEM_ID(semid);
-    JASSERT(realId != -1);
+    JASSERT(realId != -1)(semid);
     ret = _real_semtimedop(realId, sops, nsops, timeout);
     if (ret == 0) {
       SysVSem::instance().on_semop(semid, sops, nsops);

Thanks,
Rohan

----- Original Message -----
From: "Glen MacLachlan" <macl...@gwu.edu>
To: "dmtcp-forum" <dmtcp-forum@lists.sourceforge.net>
Sent: Wednesday, June 29, 2016 1:24:12 PM
Subject: Re: [Dmtcp-forum] DMTCP + SAS

Hi.

Just wanted to ping the list again to see if someone has any idea how to work around this issue?

Best,
Glen

Hi,

I'm trying to run SAS with DMTCP and I get the following error message immediately on startup and then SAS crashes:

$ dmtcp_launch sas
[40000] ERROR at sysvipcwrappers.cpp:181 in semtimedop; REASON='JASSERT(realId != -1) failed'
sas (40000): Terminating...

I'm not really sure what to make of the error message and my Google searches haven't turned up much in terms of useful information. Any ideas would be greatly appreciated.

The output of the dmtcp_coordinator is much more verbose:

[10809] NOTE at dmtcp_coordinator.cpp:1664 in updateCheckpointInterval; REASON='CheckpointInterval updated (for this computation only)'
     oldInterval = 0
     theCheckpointInterval = 0
[10809] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker connected'
     hello_remote.from = 5b712d21ff01c167-10810-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating process Information after exec()'
     progname = bash
     msg.from = 5b712d21ff01c167-40000-57714f8b
     client->identity() = 5b712d21ff01c167-10810-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker connected'
     hello_remote.from = 5b712d21ff01c167-40000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating process Information after fork()'
     client->hostname() = login4
     client->progname() = bash_(forked)
     msg.from = 5b712d21ff01c167-41000-57714f8b
     client->identity() = 5b712d21ff01c167-40000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
     client->identity() = 5b712d21ff01c167-41000-57714f8b
     client->progname() = bash_(forked)
[10809] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker connected'
     hello_remote.from = 5b712d21ff01c167-40000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating process Information after fork()'
     client->hostname() = login
     client->progname() = bash_(forked)
     msg.from = 5b712d21ff01c167-42000-57714f8b
     client->identity() = 5b712d21ff01c167-40000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker connected'
     hello_remote.from = 5b712d21ff01c167-42000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker connected'
     hello_remote.from = 5b712d21ff01c167-42000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating process Information after fork()'
     client->hostname() = login4
     client->progname() = bash_(forked)
     msg.from = 5b712d21ff01c167-43000-57714f8b
     client->identity() = 5b712d21ff01c167-42000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating process Information after fork()'
     client->hostname() = login4
     client->progname() = bash_(forked)
     msg.from = 5b712d21ff01c167-44000-57714f8b
     client->identity() = 5b712d21ff01c167-42000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
     client->identity() = 5b712d21ff01c167-43000-57714f8b
     client->progname() = bash_(forked)
[10809] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
     client->identity() = 5b712d21ff01c167-44000-57714f8b
     client->progname() = bash_(forked)
[10809] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
     client->identity() = 5b712d21ff01c167-42000-57714f8b
     client->progname() = bash_(forked)
[10809] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating process Information after exec()'
     progname = sas
     msg.from = 5b712d21ff01c167-40000-57714f8b
     client->identity() = 5b712d21ff01c167-40000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker connected'
     hello_remote.from = 5b712d21ff01c167-40000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating process Information after fork()'
     client->hostname() = login4
     client->progname() = sas_(forked)
     msg.from = 5b712d21ff01c167-45000-57714f8c
     client->identity() = 5b712d21ff01c167-40000-57714f8b
[10809] NOTE at dmtcp_coordinator.cpp:867 in onData; REASON='Updating process Information after exec()'
     progname = elssrv
     msg.from = 5b712d21ff01c167-45000-57714f8c
     client->identity() = 5b712d21ff01c167-45000-57714f8c
[10809] NOTE at dmtcp_coordinator.cpp:1079 in onConnect; REASON='worker connected'
     hello_remote.from = 5b712d21ff01c167-45000-57714f8c
[10809] NOTE at dmtcp_coordinator.cpp:858 in onData; REASON='Updating process Information after fork()'
     client->hostname() = login4
     client->progname() = elssrv_(forked)
     msg.from = 5b712d21ff01c167-46000-57714f8c
     client->identity() = 5b712d21ff01c167-45000-57714f8c
[10809] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
     client->identity() = 5b712d21ff01c167-46000-57714f8c
     client->progname() = elssrv_(forked)
[10809] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
     client->identity() = 5b712d21ff01c167-45000-57714f8c
     client->progname() = elssrv
[10809] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
     client->identity() = 5b712d21ff01c167-40000-57714f8b
     client->progname() = sas

Best,
Glen

------------------------------------------------------------------------------
Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San Francisco, CA to explore cutting-edge tech and listen to tech luminaries present their vision of the future. This family event has something for everyone, including kids. Get more information and register today.
http://sdm.link/attshape
_______________________________________________
Dmtcp-forum mailing list
Dmtcp-forum@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum