Hi,

I am running OpenSUSE 13.2 (64-bit) inside VirtualBox and am trying to get
DMTCP 2.3.1 to work on the simple example 'dmtcp1.c' supplied in the source
tar.gz distribution of the package (dmtcp-2.3.1/test/dmtcp1.c). DMTCP reports
no errors when writing the checkpoint. However, dmtcp_restart fails to restart
the program with the error message "only read 0 bytes instead of 4096 from
checkpoint file".

What am I doing wrong and what should I do to make it work?

Some details:
I have tried both an installation built directly from the source
distribution and the binary packages supplied by OpenSUSE 13.2 (listed
below). The result is the same in both cases.
http://download.opensuse.org/distribution/13.2/repo/oss/suse/x86_64/dmtcp-2.3.1-2.2.2.x86_64.rpm
http://download.opensuse.org/distribution/13.2/repo/oss/suse/x86_64/dmtcp-devel-2.3.1-2.2.2.x86_64.rpm

--- start console log ---
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> gcc -fPIC -o dmtcp1 dmtcp1.c
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> dmtcp_checkpoint ./dmtcp1
  1   2   3   4   5   6   7   8   9 ^C
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> dmtcp_restart ckpt_dmtcp1_1b03894b1c5e0f7-40000-54e602b1.dmtcp
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:235 mtcp_readfile:
  only read 0 bytes instead of 4096 from checkpoint file
[27865] mtcp_util.ic:237 mtcp_readfile:
   failed to read after 10 tries in a row.
Segmentation fault
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> 
--- end console log ---

--- start dmtcp_coordinator log ---
wijnand@linux-6ea7:~/bla/dmtcp-2.3.1/test> dmtcp_coordinator 
dmtcp_coordinator (DMTCP) 2.3.1
License LGPLv3+: GNU LGPL version 3 or later
    <http://gnu.org/licenses/lgpl.html>.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see COPYING file for details.
(Use flag "-q" to hide this message.)

dmtcp_coordinator starting...
    Host: linux-6ea7.site (0.0.0.0)
    Port: 7779
    Checkpoint Interval: disabled (checkpoint manually instead)
    Exit on last client: 0
Type '?' for help.

[27848] NOTE at dmtcp_coordinator.cpp:1040 in onConnect; REASON='worker connected'
     hello_remote.from = 1b03894b1c5e0f7-27851-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:825 in onData; REASON='Updating process Information after exec()'
     progname = dmtcp1
     msg.from = 1b03894b1c5e0f7-40000-54e602b1
     client->identity() = 1b03894b1c5e0f7-27851-54e602b1
c
[27848] NOTE at dmtcp_coordinator.cpp:1271 in startCheckpoint; REASON='starting checkpoint, suspending all nodes'
     s.numPeers = 1
[27848] NOTE at dmtcp_coordinator.cpp:1273 in startCheckpoint; REASON='Incremented Generation'
     compId.generation() = 1
[27848] NOTE at dmtcp_coordinator.cpp:615 in updateMinimumState; REASON='locking all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:621 in updateMinimumState; REASON='draining all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:627 in updateMinimumState; REASON='checkpointing all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:641 in updateMinimumState; REASON='building name service database'
[27848] NOTE at dmtcp_coordinator.cpp:657 in updateMinimumState; REASON='entertaining queries now'
[27848] NOTE at dmtcp_coordinator.cpp:662 in updateMinimumState; REASON='refilling all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:693 in updateMinimumState; REASON='restarting all nodes'
[27848] NOTE at dmtcp_coordinator.cpp:875 in onDisconnect; REASON='client disconnected'
     client->identity() = 1b03894b1c5e0f7-40000-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:1096 in validateRestartingWorkerProcess; REASON='FIRST dmtcp_restart connection.  Set numPeers. Generate timestamp'
     numPeers = 1
     curTimeStamp = 22789762212
     compId = 1b03894b1c5e0f7-40000-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:1040 in onConnect; REASON='worker connected'
     hello_remote.from = 1b03894b1c5e0f7-40000-54e602b1
[27848] NOTE at dmtcp_coordinator.cpp:875 in onDisconnect; REASON='client disconnected'
     client->identity() = 1b03894b1c5e0f7-40000-54e602b1
--- end dmtcp_coordinator log ---


Kind regards,
Wijnand Suijlen
