Hi Calvin,
    Thanks for this feedback.  Concerning rc3, there are still
some issues with a 32-bit O/S.  You wrote that you were testing
Fedora 19 on the Atom N550.  Is that running in 32-bit mode?

Also, just to confirm, I assume that you are interrupting a function
prior to checkpointing, and then checkpointing and restarting.
It's in this case that you see a segfault on restart.
Is this correct?

rc2 and rc3 have known bugs in terms of supporting 32-bit mode.
After rc1, we changed our restart algorithm a little.  In the
next few updates this week, we're hoping to fix the 32-bit mode.

Thank you for the further details.  We'll especially
look into why ocaml should be more sensitive than R/python.

Best wishes,
- Gene

On Sun, Apr 26, 2015 at 05:07:40AM -0400, Calvin Ostrum wrote:
> Latest results on my experimentation here.  Short form: rc1 seems to
> work on Atom N550, but still not rc2 or rc3.
> 
> For some reason, finally seem to have gotten a saved checkpoint that
> works all the time on the Fedora 20 (kernel 3.17.7) i3 540, with rc2.
> Ran it about 50 times in a row with no segfault.
> 
> Also have tried building and running each of 2.3.1 and 2.4.0 rc1,
> rc2, rc3 on the Fedora 19 (kernel 3.14.23) Atom N550.
> 
> Results: 2.3.1 works like the packaged version, but that means
> control-c ends all checkpointed shells tested with (R,python,ocaml).
> 
> rc2 and rc3 (just put up for download hours ago) still crash on every
> checkpointed shell at the same place shown with strace.
> 
> However, *rc1* *works* each time I try on each of these shells, and
> handles control-C correctly with R and Python.  However, with ocaml,
> when one hits control-C, ocaml prints its "interrupted" message (so
> the signal does get to it fine) but then the whole thing quits.
> 
> Now, I noticed that ocaml is actually bytecode for the ocaml
> interpreter, run by the shell using a bang line.  That could easily
> mess things up, I suppose. So I tried a checkpoint running the ocaml
> bytecode interpreter directly, passing ocaml into it.  And... that
> seems to work.   Tried it many times in a row.   Runs okay (after
> loading in a huge ocaml program) and control-c works as it should.
> 
> So hopefully this is some kind of clue.  I assume rc1 works on this
> system.  But, rc2 and rc3 don't.  Without any understanding of what
> these programs truly do I don't know what other information I can
> provide but will try to provide more if told what I could gather.
> Maybe it is just something in the configure/make that differs?

_______________________________________________
Dmtcp-forum mailing list
Dmtcp-forum@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
