Hi Wijnand,

Could you please try the following steps?

1) Download the source tarball from: http://sourceforge.net/projects/dmtcp/files/dmtcp-2.x/2.3.1/
2) ./configure && make && make check-dmtcp1
3) Verify that the test passes

Also, could you please verify that the checkpoint image is of non-zero size? It could be that dmtcp_launch is failing to create a valid checkpoint image.

Thanks,
Rohan

----- Original Message -----
From: "Kapil Arya" <kapil.arya...@gmail.com>
To: "Wijnand SUIJLEN" <wijnand.suij...@huawei.com>
Cc: "dmtcp-forum" <dmtcp-forum@lists.sourceforge.net>
Sent: Friday, February 20, 2015 11:04:04 AM
Subject: Re: [Dmtcp-forum] dmtcp_restart (dmtcp 2.3.1) won't restart from checkpoint on OpenSUSE 13.2

Rohan/Jiajun,

Can you take a look at it?

Best,
Kapil

On Thu, Feb 19, 2015 at 11:40 AM, Wijnand SUIJLEN <wijnand.suij...@huawei.com> wrote:

Hi,

I am running OpenSUSE 13.2 (64-bit) inside VirtualBox and I am trying to get DMTCP 2.3.1 to work on the simple example 'dmtcp1.c' as supplied in the source tar.gz distribution of the package (dmtcp-2.3.1/test/dmtcp1.c). There are no complaints from DMTCP when writing the checkpoint. However, dmtcp_restart fails to restart the program with the error message "only read 0 bytes instead of 4096 from checkpoint file". What am I doing wrong and what should I do to make it work?

Some details: I have tried it with an installation as built directly from the source distribution and I tried it with the binary distribution as supplied by OpenSUSE 13.2: It doesn't make any difference.
http://download.opensuse.org/distribution/13.2/repo/oss/suse/x86_64/dmtcp-2.3.1-2.2.2.x86_64.rpm
http://download.opensuse.org/distribution/13.2/repo/oss/suse/x86_64/dmtcp-devel-2.3.1-2.2.2.x86_64.rpm

--- start console log ---
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> gcc -fPIC -o dmtcp1 dmtcp1.c
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> dmtcp_checkpoint ./dmtcp1
1 2 3 4 5 6 7 8 9 ^C
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> dmtcp_restart ckpt_dmtcp1_1b03894b1c5e0f7-40000-54e602b1.dmtcp
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile: only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:237 mtcp_readfile: failed to read after 10 tries in a row.
Segmentation fault
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test>
--- end console log ---

--- start dmtcp_coordinator log ---
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> dmtcp_coordinator
dmtcp_coordinator (DMTCP) 2.3.1
License LGPLv3+: GNU LGPL version 3 or later <http://gnu.org/licenses/lgpl.html>.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it under certain conditions; see COPYING file for details.
(Use flag "-q" to hide this message.)
dmtcp_coordinator starting...
    Host: linux-6ea7.site (0.0.0.0)
    Port: 7779
    Checkpoint Interval: disabled (checkpoint manually instead)
    Exit on last client: 0
Type '?' for help.

[27848] NOTE at dmtcp_coordinator.cpp:1040 in onConnect; REASON='worker connected'
     hello_remote.from = 1b03894b1c5e0f7-27851-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:825 in onData; REASON='Updating process Information after exec()'
     progname = dmtcp1
     msg.from = 1b03894b1c5e0f7-40000-54e602b1
     client->identity() = 1b03894b1c5e0f7-27851-54e602b1
c
[27848] NOTE at dmtcp_coordinator.cpp:1271 in startCheckpoint; REASON='starting checkpoint, suspending all nodes'
     s.numPeers = 1
[27848] NOTE at dmtcp_coordinator.cpp:1273 in startCheckpoint; REASON='Incremented Generation'
     compId.generation() = 1
[27848] NOTE at dmtcp_coordinator.cpp:615 in updateMinimumState; REASON='locking all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:621 in updateMinimumState; REASON='draining all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:627 in updateMinimumState; REASON='checkpointing all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:641 in updateMinimumState; REASON='building name service database'
[27848] NOTE at dmtcp_coordinator.cpp:657 in updateMinimumState; REASON='entertaining queries now'
[27848] NOTE at dmtcp_coordinator.cpp:662 in updateMinimumState; REASON='refilling all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:693 in updateMinimumState; REASON='restarting all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:875 in onDisconnect; REASON='client disconnected'
     client->identity() = 1b03894b1c5e0f7-40000-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:1096 in validateRestartingWorkerProcess; REASON='FIRST dmtcp_restart connection. Set numPeers. Generate timestamp'
     numPeers = 1
     curTimeStamp = 22789762212
     compId = 1b03894b1c5e0f7-40000-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:1040 in onConnect; REASON='worker connected'
     hello_remote.from = 1b03894b1c5e0f7-40000-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:875 in onDisconnect; REASON='client disconnected'
     client->identity() = 1b03894b1c5e0f7-40000-54e602b1
--- end dmtcp_coordinator log ---

Kind regards,
Wijnand Suijlen

------------------------------------------------------------------------------
Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
from Actuate! Instantly Supercharge Your Business Reports and Dashboards
with Interactivity, Sharing, Native Excel Exports, App Integration & more
Get technology previously reserved for billion-dollar corporations, FREE
http://pubads.g.doubleclick.net/gampad/clk?id=190641631&iu=/4140/ostg.clktrk
_______________________________________________
Dmtcp-forum mailing list
Dmtcp-forum@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
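
For reference, the "non-zero size" check requested at the top of this thread could be done roughly as follows. This is only a sketch: check_ckpt_size is a hypothetical helper written for this example, not a DMTCP command, and it relies only on the standard shell tests -e and -s. The ckpt_*.dmtcp pattern is assumed from the image name shown in the console log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of DMTCP): report whether a checkpoint
# image exists and has non-zero size.
# Return codes: 0 = non-empty, 1 = empty, 2 = missing.
check_ckpt_size() {
    ckpt_img="$1"
    if [ ! -e "$ckpt_img" ]; then
        echo "missing: $ckpt_img"
        return 2
    fi
    if [ -s "$ckpt_img" ]; then
        echo "ok: $ckpt_img is non-empty"
        return 0
    fi
    echo "empty: $ckpt_img"
    return 1
}

# Check every checkpoint image in the current directory.
# The -e guard skips the unexpanded pattern when nothing matches.
for f in ckpt_*.dmtcp; do
    [ -e "$f" ] || continue
    check_ckpt_size "$f" || true
done
```

An "empty" result for the image passed to dmtcp_restart would be consistent with the "only read 0 bytes" message in the console log, though it would not by itself explain why the image was written empty.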