Hi Matthias,

Thanks so much for the bug report; it is really helpful. We want to
release robust software, and your testing brings us one step closer.

Ideally, DMTCP should work with any recent distro. If something is not
working, this is a bug and we would like to fix it. Since you have already
been able to reproduce some failures, would it be possible for you to share
your VM(s) with us? That would help us reduce the reproduction time on our
end and we can start looking into the issues right away.
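
In the meantime, to quantify which tests are flaky from run to run, a rough
sketch like the one below might help. It assumes the result lines look like the
ones in your output quoted below (a test name followed by ckpt:PASSED or
ckpt:FAILED) and that "make check" writes them to standard output. The names
failed_tests, tally_failures and run_make_check are made up for illustration
and are not part of DMTCP.

```python
import re
import subprocess
from collections import Counter

# Assumed format of a result line, e.g. "bash           ckpt:FAILED".
# Indented continuation lines ("root-pids: ...") are deliberately ignored.
RESULT_LINE = re.compile(r"^(\S+)\s+ckpt:(?:PASSED|FAILED)")


def failed_tests(output):
    """Return the set of test names whose result line contains FAILED."""
    failed = set()
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match and "FAILED" in line:
            failed.add(match.group(1))
    return failed


def tally_failures(runs):
    """Count, per test name, in how many runs that test failed."""
    counts = Counter()
    for output in runs:
        counts.update(failed_tests(output))
    return counts


def run_make_check(source_dir, times=4):
    """Run "make check" several times in source_dir and tally the failures.

    Assumption: the test results are printed on stdout.
    """
    runs = []
    for _ in range(times):
        proc = subprocess.run(
            ["make", "check"], cwd=source_dir, capture_output=True, text=True
        )
        runs.append(proc.stdout)
    return tally_failures(runs)
```

Calling run_make_check on the DMTCP source directory and printing the result of
most_common() would list the tests that fail most often first. A test that
fails in every run (dash, in your four runs) is probably a different kind of
problem from one that fails only once.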

Best,
Kapil

On Thu, Apr 16, 2015 at 5:35 PM, Matthias Fripp <mfr...@hawaii.edu> wrote:

> I've worked on this a little more and have found the following things. I
> am now focusing on using dmtcp version 2.4.0-rc2 with Ubuntu 14.04 (kernel
> version 3.13.0), inside VirtualBox 4.3.20.
>
> (1) After peering at autotest.py, I discovered that I can get screen to
> load successfully by using "dmtcp_launch env screen". If I just call
> "dmtcp_launch screen", it fails with:
>
> [27607] ERROR at dmtcpnohijackstubs.cpp:59 in dmtcp_get_tmpdir; REASON='JASSERT(false) failed'
> Message: NOT REACHED
> dmtcp_launch (27607): Terminating...
>
> I'm not sure why "env screen" works but "screen" doesn't, but I can live
> with it.
>
> (2) "make check" seems to have unpredictable results. I ran it four times
> in a row and got different failures each time. I did not change the
> environment between these operations -- I just typed "make check" each
> time. (I think I typed ahead a little during the second and third tries;
> the typeahead buffer could have interfered with some of the commands that
> were tested, but that seems unlikely.)
>
> First try, everything passed except the following:
>
> syscall-tester ckpt:FAILED
>                root-pids: [6564] msg: checkpoint error, 1 expected, 1 found, running=1
> Trying to kill old coordinator, and launch new one on same port
> pthread2       ckpt:PASSED rstr:FAILED (first process rec'd signal 11) retry:
> ***** Copied checkpoint images to /tmp/dmtcp-matthias@ubuntu/dmtcp-autotest-232161700
> FAILED
>                root-pids: [11070] msg: restart error, 1 expected, 0 found, running=0
> dash           ckpt:FAILED
>                root-pids: [18542] msg: unexpected number of checkpoint files, 2 procs, 1 files
> screen         ckpt:PASSED rstr:FAILED (first process didn't die) retry:PASSED; ckpt:FAILED
>                root-pids: [18641] msg: checkpoint error, 3 expected, 1 found, running=1
>
>
> Second try, everything passed except the following:
>
> bash           ckpt:FAILED
>                root-pids: [30994] msg: unexpected number of checkpoint files, 2 procs, 1 files
> dash           ckpt:FAILED
>                root-pids: [31004] msg: unexpected number of checkpoint files, 2 procs, 1 files
>
>
> Third try, everything passed except the following:
>
> python         ckpt:PASSED rstr:PASSED; ckpt:FAILED
>                root-pids: [14319] msg: checkpoint error, 1 expected, 0 found, running=0
> bash           ckpt:FAILED
>                root-pids: [14323] msg: unexpected number of checkpoint files, 2 procs, 1 files
> dash           ckpt:FAILED
>                root-pids: [14333] msg: unexpected number of checkpoint files, 2 procs, 1 files
>
>
> Fourth try, everything passed except the following: (there was nothing in
> the typeahead buffer this time)
>
> dmtcp5         ckpt:FAILED
>                root-pids: [14785] msg: checkpoint error, 2 expected, 2 found, running=0
> bash           ckpt:FAILED
>                root-pids: [27249] msg: checkpoint error, 2 expected, 1 found, running=1
> dash           ckpt:FAILED
>                root-pids: [27256] msg: unexpected number of checkpoint files, 2 procs, 1 files
> openmp-2       ckpt:PASSED rstr:PASSED; ckpt:FAILED
>                root-pids: [27406] msg: checkpoint error, 1 expected, 0 found, running=0
>
>
> I don't know why this happens, but I guess I shouldn't give too much
> weight to the results of "make check".
>
> Other than this, dmtcp seems to be working pretty well in this environment.
>
> Thanks for any advice you can give on "dmtcp_launch screen" and "make
> check".
>
> Matthias
>
> On Apr 15, 2015, at 2:25 PM, Matthias Fripp wrote:
>
> I am trying to identify versions of DMTCP and Linux that work well
> together.
>
> At the moment, I am trying to get dmtcp to work with GNU screen, so I can
> launch a couple of processes and then checkpoint and restore them together.
> However, every time I try "dmtcp_launch screen", I get a message, "[6345]
> ERROR at dmtcpnohijackstubs.cpp:59 in dmtcp_get_tmpdir;
> REASON='JASSERT(false) failed'\nMessage: NOT REACHED". I've looked in the
> source code, and this seems to be a stub function that prevents using
> dmtcp_launch to launch dmtcp-aware applications. But I don't think my copy
> of screen is dmtcp-aware, since nothing happens in the coordinator when I
> launch screen alone. I've also had other problems with dmtcp on both of
> these platforms, but I want to try to resolve this one first.
>
> I'm doing this work on a virtual machine, so I have a lot of leeway in how
> I set it up. But I can't find any Linux distribution that seems to work
> perfectly with any version of DMTCP. I have tried Ubuntu 14.04 (kernel
> version 3.13.0) and Debian Wheezy (kernel version 3.16.0). Both of these
> are 64-bit distributions. In both cases, I installed screen with "sudo
> apt-get install screen".
>
> I downloaded dmtcp versions 2.3.1 and 2.4.0-rc2 and tried "make check"
> with both dmtcp versions on both Linux distributions. In general, dmtcp
> seemed to work better with Ubuntu, but both versions of dmtcp failed
> several (different) tests on both Linux distributions. And I get the same
> error trying to "dmtcp_launch screen" in all cases. This is despite the
> fact that the "screen" tests say they pass on Ubuntu with dmtcp 2.3.1 or
> Debian with dmtcp 2.4.0-rc2.
>
> I noticed at https://github.com/dmtcp/dmtcp/wiki/DMTCP-Release-Guide that
> the first step of releasing dmtcp is "Ensure that you can build and pass
> all the tests." So I'm hopeful that there is some version of dmtcp that
> works reliably with some version of Linux.
>
> Would anyone be able to tell me which combination of Linux distribution
> and dmtcp version is known to work well? I would also appreciate advice
> on the problems launching GNU screen with dmtcp, but I thought I'd start by
> getting my system set up properly.
>
> Thanks,
>
> Matthias
>
>
>
>
> ------------------------------------------------------------------------------
> BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT
> Develop your own process in accordance with the BPMN 2 standard
> Learn Process modeling best practices with Bonita BPM through live
> exercises
> http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual-
> event?utm_
> source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF
> _______________________________________________
> Dmtcp-forum mailing list
> Dmtcp-forum@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
>
>
