Hello DMTCP Team,

I encountered a problem when trying to checkpoint/restart a VM on the same node using DMTCP and some of the plugins in the package.

Here is some information on the setup:

1. DMTCP version 2.4.5
2. ./configure --enable-infiniband-support && make && make install
3. In contrib/kvm, ran make to get dmtcp_kvmhijack.so.
   In contrib/tun, ran make to get dmtcp_tunhijack.so.
4. VM startup command:

   qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4096 -hda /home/mig/vm1.qcow2 -net nic,macaddr=52-54-00-12-32-2,model=virtio -net tap,ifname=tap0,script=no

The steps in my test (I went through your Cluster'13 paper and slides, and I believe this is how you run it; please let me know if I'm wrong):

1. Run ./dmtcp_coordinator first.
2. In another terminal, run:

   ./dmtcp_launch --with-plugin /home/mig/dmtcp-2.4.5/contrib/kvm/dmtcp_kvmhijack.so:/home/mig/dmtcp-2.4.5/contrib/tun/dmtcp_tunhijack.so qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4096 -hda /home/mig/vm1.qcow2 -net nic,macaddr=52-54-00-12-32-2,model=virtio -net tap,ifname=tap0,script=no

3. Trigger a checkpoint manually in the first terminal (by pressing c). The coordinator prints the following output:

   [6818] NOTE at dmtcp_coordinator.cpp:1291 in startCheckpoint; REASON='starting checkpoint, suspending all nodes'
        s.numPeers = 1
   [6818] NOTE at dmtcp_coordinator.cpp:1293 in startCheckpoint; REASON='Incremented computationGeneration'
        compId.computationGeneration() = 1
   [6818] NOTE at dmtcp_coordinator.cpp:917 in onDisconnect; REASON='client disconnected'
        client->identity() = 462a1597e64cd8e1-40000-57e4cbfc
        client->progname() = qemu-system-x86_64

4. The VM launches correctly. However, it fails after the manual checkpoint with the following error, and no checkpoint file is generated:

   [40000] ERROR at procselfmaps.cpp:214 in getNextArea; REASON='JASSERT(data[dataIdx++] == '\n') failed'
   qemu-system-x86_64 (40000): Terminating...

I also get the same error when I remove the VM's network configuration (-net nic ... script=no) and the tun plugin.

Could you please take a look at the problem? Any help is really appreciated.

Thanks,
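
P.S. In case it helps with diagnosis: the failing assertion is in procselfmaps.cpp, which I assume (from the file name only) reads /proc/self/maps. Below is a minimal sketch, my own and not part of DMTCP, that could be run against the QEMU process to look for lines that deviate from the format described in proc(5). The names find_odd_lines and check_pid are hypothetical helpers, and the regular expression is only my reading of the documented format. A clean result would not rule out a parsing problem inside DMTCP.

```python
import re
import sys

# Expected shape of one /proc/<pid>/maps line, following proc(5):
#   start-end perms offset dev inode [pathname]
# This pattern is an assumption based on the man page, not on DMTCP's parser.
MAPS_LINE = re.compile(
    r"^[0-9a-f]+-[0-9a-f]+ "   # address range
    r"[r-][w-][x-][ps] "       # permissions
    r"[0-9a-f]+ "              # offset
    r"[0-9a-f]+:[0-9a-f]+ "    # device major:minor
    r"\d+"                     # inode
    r"(?: +.*)?$"              # optional padding and pathname
)


def find_odd_lines(text):
    """Return (line_number, line) pairs that do not match the expected format."""
    odd = []
    for number, line in enumerate(text.split("\n"), start=1):
        if line == "":
            continue  # skip the empty piece after the final newline
        if not MAPS_LINE.match(line):
            odd.append((number, line))
    return odd


def check_pid(pid):
    """Read /proc/<pid>/maps (Linux only) and report unexpected lines."""
    with open(f"/proc/{pid}/maps", errors="replace") as handle:
        return find_odd_lines(handle.read())


if __name__ == "__main__":
    # Usage: python check_maps.py <pid>   (defaults to this process)
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    for number, line in check_pid(pid):
        print(f"line {number}: {line!r}")
```

I would run it with the PID of the qemu-system-x86_64 process while the VM is up, before pressing c in the coordinator, and can send the output if that is useful.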