I've worked on this a little more and have found the following things. I am now
focusing on using dmtcp version 2.4.0-rc2 with Ubuntu 14.04 (kernel version
3.13.0), inside VirtualBox 4.3.20.
(1) After peering at autotest.py, I discovered that I can get screen to load
successfully by using "dmtcp_launch env screen". If I just call "dmtcp_launch
screen", it fails with:
    [27607] ERROR at dmtcpnohijackstubs.cpp:59 in dmtcp_get_tmpdir; REASON='JASSERT(false) failed'
    Message: NOT REACHED
    dmtcp_launch (27607): Terminating...
I'm not sure why "env screen" works but "screen" doesn't, but I can live with
it.
(2) "make check" seems to have unpredictable results. I ran it four times in a
row and got different failures each time. I did not change the environment
between these operations -- I just typed "make check" each time. (I think I
typed ahead a little during the second and third tries; the typeahead buffer
could have interfered with some of the commands that were tested, but that
seems unlikely.)
First try, everything passed except the following:
    syscall-tester ckpt:FAILED
        root-pids: [6564] msg: checkpoint error, 1 expected, 1 found, running=1
    Trying to kill old coordinator, and launch new one on same port
    pthread2 ckpt:PASSED rstr:FAILED (first process rec'd signal 11) retry:
    ***** Copied checkpoint images to /tmp/dmtcp-matthias@ubuntu/dmtcp-autotest-232161700
    FAILED
        root-pids: [11070] msg: restart error, 1 expected, 0 found, running=0
    dash ckpt:FAILED
        root-pids: [18542] msg: unexpected number of checkpoint files, 2 procs, 1 files
    screen ckpt:PASSED rstr:FAILED (first process didn't die) retry:PASSED; ckpt:FAILED
        root-pids: [18641] msg: checkpoint error, 3 expected, 1 found, running=1
Second try, everything passed except the following:
    bash ckpt:FAILED
        root-pids: [30994] msg: unexpected number of checkpoint files, 2 procs, 1 files
    dash ckpt:FAILED
        root-pids: [31004] msg: unexpected number of checkpoint files, 2 procs, 1 files
Third try, everything passed except the following:
    python ckpt:PASSED rstr:PASSED; ckpt:FAILED
        root-pids: [14319] msg: checkpoint error, 1 expected, 0 found, running=0
    bash ckpt:FAILED
        root-pids: [14323] msg: unexpected number of checkpoint files, 2 procs, 1 files
    dash ckpt:FAILED
        root-pids: [14333] msg: unexpected number of checkpoint files, 2 procs, 1 files
Fourth try (with nothing in the typeahead buffer this time), everything passed
except the following:
    dmtcp5 ckpt:FAILED
        root-pids: [14785] msg: checkpoint error, 2 expected, 2 found, running=0
    bash ckpt:FAILED
        root-pids: [27249] msg: checkpoint error, 2 expected, 1 found, running=1
    dash ckpt:FAILED
        root-pids: [27256] msg: unexpected number of checkpoint files, 2 procs, 1 files
    openmp-2 ckpt:PASSED rstr:PASSED; ckpt:FAILED
        root-pids: [27406] msg: checkpoint error, 1 expected, 0 found, running=0
I don't know why this happens, but I guess I shouldn't give too much weight to
the results of "make check".
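To compare runs less manually, something like the following could tally which
tests fail across repeated runs. This is only a rough sketch. It assumes the
summary lines look like the ones quoted above (a test name followed by a
"ckpt:" status field), and it needs Python 3.5 or later for subprocess.run.
The names failed_tests and tally_runs are my own and are not part of
autotest.py.

```python
import re
import subprocess
from collections import Counter

# Assumed shape of a summary line, based on the output quoted above, e.g.
#   "dash ckpt:FAILED"
#   "screen ckpt:PASSED rstr:FAILED (...) retry:PASSED; ckpt:FAILED"
STATUS_LINE = re.compile(r"^\s*(\S+)\s+ckpt:(PASSED|FAILED)")


def failed_tests(output):
    """Return the names of tests whose summary line mentions FAILED.

    A test is counted even if a later retry on the same line passed.
    Detail lines such as "root-pids: [...] msg: ..." are ignored because
    they do not have a "ckpt:" field right after the first word.
    """
    failed = set()
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match and "FAILED" in line:
            failed.add(match.group(1))
    return failed


def tally_runs(runs=4, command=("make", "check")):
    """Run the command several times and count failures per test name."""
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        counts.update(failed_tests(result.stdout))
    return counts
```

Calling tally_runs(4).most_common() from the dmtcp build directory would then
list each failing test with the number of runs in which it failed, which
should make it easier to separate consistent failures from the ones that
appear only occasionally.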
Other than this, dmtcp seems to be working pretty well in this environment.
Thanks for any advice you can give on "dmtcp_launch screen" and "make check".
Matthias
On Apr 15, 2015, at 2:25 PM, Matthias Fripp wrote:
> I am trying to identify versions of DMTCP and Linux that work well together.
>
> At the moment, I am trying to get dmtcp to work with GNU screen, so I can
> launch a couple of processes and then checkpoint and restore them together.
> However, every time I try "dmtcp_launch screen", I get a message, "[6345]
> ERROR at dmtcpnohijackstubs.cpp:59 in dmtcp_get_tmpdir;
> REASON='JASSERT(false) failed'\nMessage: NOT REACHED". I've looked in the
> source code, and this seems to be a stub function that prevents using
> dmtcp_launch to launch dmtcp-aware applications. But I don't think my copy of
> screen is dmtcp-aware, since nothing happens in the coordinator when I launch
> screen alone. I've also had other problems with dmtcp on both of these
> platforms, but I want to try to resolve this one first.
>
> I'm doing this work on a virtual machine, so I have a lot of leeway in how I
> set it up. But I can't find any Linux distribution that seems to work
> perfectly with any version of DMTCP. I have tried Ubuntu 14.04 (kernel
> version 3.13.0) and Debian Wheezy (kernel version 3.16.0). Both of these are
> 64-bit distributions. In both cases, I installed screen with "sudo apt-get
> install screen".
>
> I downloaded dmtcp versions 2.3.1 and 2.4.0-rc2 and tried "make check" with
> both dmtcp versions on both Linux distributions. In general, dmtcp seemed to
> work better with Ubuntu, but both versions of dmtcp failed several
> (different) tests on both Linux distributions. And I get the same error
> trying to "dmtcp_launch screen" in all cases. This is despite the fact that
> the "screen" tests say they pass on Ubuntu with dmtcp 2.3.1 or Debian with
> dmtcp 2.4.0-rc2.
>
> I noticed at https://github.com/dmtcp/dmtcp/wiki/DMTCP-Release-Guide that the
> first step of releasing dmtcp is "Ensure that you can build and pass all the
> tests." So I'm hopeful that there is some version of dmtcp that works
> reliably with some version of Linux.
>
> Would anyone be able to tell me which combination of Linux distribution and
> dmtcp version is known to work well? I would also appreciate advice on the
> problems launching GNU screen with dmtcp, but I thought I'd start by getting
> my system set up properly.
>
> Thanks,
>
> Matthias
>