Hi Calvin,

Thanks for this feedback. Concerning rc3, there are still some issues with a 32-bit O/S. You wrote that you were testing Fedora 19 on the Atom N550. Is that running in 32-bit mode?

Also, just to confirm, I assume that you are interrupting a function prior to checkpointing, and then checkpointing and restarting. It's in this case that you see a segfault on restart. Is this correct?

rc2 and rc3 have known bugs in supporting 32-bit mode. After rc1, we changed our restart algorithm a little. In the next few updates this week, we're hoping to fix the 32-bit mode.

Thank you for the further details. We'll especially look into why ocaml should be more sensitive than R/python.

Best wishes,
- Gene

On Sun, Apr 26, 2015 at 05:07:40AM -0400, Calvin Ostrum wrote:
> Latest results on my experimentation here. Short form: rc1 seems to
> work on Atom N550, but still not rc2 or rc3.
>
> For some reason, finally seem to have gotten a saved checkpoint that
> works all the time on the Fedora 20 (kernel 3.17.7) i3 540, with rc2.
> Ran it about 50 times in a row with no segfault.
>
> Also have tried building and running each of 2.3.1 and 2.4.0 rc1,
> rc2, rc3 on the Fedora 19 (kernel 3.14.23) Atom N550.
>
> Results: 2.3.1 works like the packaged version, but that means
> control-c ends all checkpointed shells tested with (R, python, ocaml).
>
> rc2 and rc3 (just put up for download hours ago) still crash on every
> checkpointed shell at the same place shown with strace.
>
> However, *rc1* *works* each time I try on each of these shells, and
> handles control-C correctly with R and Python. However, with ocaml,
> when one hits control-C, ocaml prints its "interrupted" message (so
> the signal does get to it fine) but then the whole thing quits.
>
> Now, I noticed that ocaml is actually bytecode for the ocaml
> interpreter, run by the shell using a bang line. That could easily
> mess things up, I suppose. So I tried a checkpoint running the ocaml
> bytecode interpreter directly, passing ocaml into it. And... that
> seems to work. Tried it many times in a row. Runs okay (after
> loading in a huge ocaml program) and control-c works as it should.
>
> So hopefully this is some kind of clue. I assume rc1 works on this
> system. But, rc2 and rc3 don't. Without any understanding of what
> these programs truly do I don't know what other information I can
> provide but will try to provide more if told what I could gather.
> Maybe it is just something in the configure/make that differs?

_______________________________________________
Dmtcp-forum mailing list
Dmtcp-forum@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum