Hi Eliot,

Thanks for writing back to us. From the error message, it looks like /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79-2.5.5.2.el7_1.x86_64/jre/lib/ext/pulse-java.jar is not available on the restart machine, and DMTCP is trying to recreate it and failing due to lack of permissions. My guess is that the file is missing because the Java version on the target machine is different. Is that correct?
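One quick way to confirm this on the restart machine is to check whether the jar exists and whether its parent directory is writable by the user running the restart. The following is a minimal sketch using only the Python standard library; the helper name `diagnose_missing_file` is made up for illustration and is not part of DMTCP.

```python
import os


def diagnose_missing_file(path):
    """Report whether `path` exists and whether its parent directory
    exists and is writable by the current user.

    Hypothetical helper for illustration only; not part of DMTCP.
    """
    parent = os.path.dirname(path)
    return {
        "exists": os.path.exists(path),
        "parent_exists": os.path.isdir(parent),
        "parent_writable": os.access(parent, os.W_OK),
    }


if __name__ == "__main__":
    # Path taken from the error message in this thread.
    jar = ("/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79-2.5.5.2.el7_1.x86_64"
           "/jre/lib/ext/pulse-java.jar")
    print(diagnose_missing_file(jar))
```

If `exists` comes back false and `parent_writable` is also false, that would be consistent with the diagnosis above: the file is absent and cannot be recreated at that location by an unprivileged user.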
We can provide you with a patch for now that would allow you to restart your checkpoint images. The general idea is to ask DMTCP to restore the file in DMTCP's temp directory and go from there. But let's first confirm the diagnosis.

Kapil

On Thu, Jul 23, 2015 at 1:42 PM, Eliot Moss <m...@cs.umass.edu> wrote:
> Dear DMTCP team --
>
> Notwithstanding the bug reported quite some time ago (how's the fix
> coming?), I have now encountered a different failure to restart:
>
> [47995] mtcp_restart.c:1003 read_shared_memory_area_from_file:
>   mapping /tmp/hsperfdata_moss/45000 with data from ckpt image
> [47995] mtcp_restart.c:1296 open_shared_file:
>   unable to create file
>   /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79-2.5.5.2.el7_1.x86_64/jre/lib/ext/pulse-java.jar
>
> This causes a core dump. I'm not sure what to do -- my queue is currently
> limited to two weeks of execution time and this job timed out at the two
> weeks, and with this limitation apparently cannot be restarted ...
>
> Regards -- Eliot Moss
------------------------------------------------------------------------------
_______________________________________________
Dmtcp-forum mailing list
Dmtcp-forum@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum