#715: Parents probably not reclaimed due to too much caching
-------------------------------------------------------------------+--------
Reporter: robertwb | Owner: somebody
Type: defect | Status: needs_review
Priority: major | Milestone: sage-5.4
Component: coercion | Resolution:
Keywords: weak cache coercion Cernay2012 | Work issues:
Report Upstream: N/A | Reviewers: Jean-Pierre Flori, Simon King, Nils Bruin
Authors: Simon King, Jean-Pierre Flori | Merged in:
Dependencies: #9138, #11900, #11599, to be merged with #11521 | Stopgaps:
-------------------------------------------------------------------+--------
Comment (by nbruin):
Replying to [comment:256 SimonKing]:
> There is only one line of code between the executed and the not-executed
> print statements: The last line of `_subprocess`' "finally:" clause,
> namely
> {{{
> os._exit(0)
> }}}
> Question to the experts: What could possibly go wrong in `os._exit(0)`?
Oh dear. That sounds like `_subprocess` is not returning at all! Let's see
what the documentation says:
{{{
os._exit(n)
Exit the process with status n, without calling cleanup handlers,
flushing stdio buffers, etc.
}}}
Could it be we found a bug in the OSX kernel? A system call that doesn't
return?
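For the record, here is a minimal sketch of why a statement placed after `os._exit(0)` can never run. It is not the `_subprocess` code from Sage; the function name `run_child` and the pipe messages are made up for illustration, and it assumes a POSIX system where `os.fork` is available.

```python
import os


def run_child():
    """Fork a child that calls os._exit(0) inside a finally clause."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report progress through the pipe with unbuffered writes.
        os.close(read_fd)
        try:
            os.write(write_fd, b"before exit\n")
        finally:
            os._exit(0)  # the process ends right here
            os.write(write_fd, b"after exit\n")  # never reached
    # Parent: read until the child's end of the pipe is closed.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), status


if __name__ == "__main__":
    output, status = run_child()
    print(output, os.WIFEXITED(status), os.WEXITSTATUS(status))
```

The parent only ever sees the first message, and the child reports a clean exit status of 0.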
More seriously, it seems rather reassuring that the statement that comes
after you tell the process to quit doesn't get executed. It seems to me
you've just established that it's not the child that is SEGV-ing -- it's
the parent.
In fact, we could have known that. In the doctest of
`sage.parallel.decorate.Fork` there is an explicit test that shows a child
can segfault with no detrimental effect (If you instrument `sage-doctest`
to not hide stderr, it's scary to see the backtrace come by, but the
doctest passes without problem). The fact that the doctest framework can
get its hands on the "11" exit code shows it's the parent that generates
it. Why do you think this happens due to parallel at all? Under gdb, the
test does not segfault, so you're looking at different behaviour. I don't
think parallel is implicated in this at all.
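To illustrate the point that a child can die from a segfault without taking the parent down, here is a rough sketch. It is not the `sage.parallel.decorate.Fork` doctest; the function name `child_segfault_status` is hypothetical, the child simply sends itself `SIGSEGV` rather than dereferencing a bad pointer, and a POSIX system is assumed.

```python
import os
import signal


def child_segfault_status():
    """Fork a child that dies from SIGSEGV and return its wait status."""
    pid = os.fork()
    if pid == 0:
        # Child: restore the default action, then signal ourselves.
        signal.signal(signal.SIGSEGV, signal.SIG_DFL)
        os.kill(os.getpid(), signal.SIGSEGV)
        os._exit(1)  # fallback, only reached if the signal was not delivered
    # Parent: unaffected by the crash; it just collects the status.
    _, status = os.waitpid(pid, 0)
    return status


if __name__ == "__main__":
    status = child_segfault_status()
    if os.WIFSIGNALED(status):
        print("child killed by signal", os.WTERMSIG(status))
    print("parent is still running")
```

The parent sees that the child was killed by a signal (`signal.SIGSEGV`, which is 11 on Linux and OS X) and carries on normally.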
Really, ''strip away the doctesting layer''! If you read `sage-doctest`,
you'll see it produces a plain Python file that it then executes directly
with python, with all IO redirected. Get that file and run it
directly, without redirecting IO. Setting `verbose` doesn't just change
the IO redirection in `sage-doctest`. It also gets written into that file
and hence can influence behaviour there. So with `sage -t` and `sage -t
--verbose` you're really running different code. You want the code that
`sage -t` generates with the IO redirection that `sage -t --verbose` does.
At that point you might as well just get `sage -t` out of the way
completely.
If you want to help people in the future, patch `sage -t` to have a flag
`--keep`, to not throw away any of the temporary files it produces, so
that you can pick through the remainders.
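Something along these lines is what I have in mind for `--keep`. This is only a sketch, not the actual `sage-doctest` code: the function `run_generated`, its `keep` parameter, and the file names are all hypothetical. It writes the generated script to a temporary directory, runs it without redirecting IO, and only cleans up when not asked to keep the files.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_generated(source, keep=False):
    """Write `source` to a temporary script, run it, optionally keep it."""
    workdir = tempfile.mkdtemp(prefix="doctest_sketch_")
    script = os.path.join(workdir, "generated_test.py")
    with open(script, "w") as f:
        f.write(source)
    try:
        # No stdout/stderr arguments: the script inherits the caller's
        # streams, so nothing it prints is hidden or redirected.
        returncode = subprocess.call([sys.executable, script])
    finally:
        if not keep:
            shutil.rmtree(workdir)
    return returncode, (script if keep else None)


if __name__ == "__main__":
    code, path = run_generated("print('hello from the generated file')", keep=True)
    print("exit code:", code, "script kept at:", path)
```

With `keep=True` the returned path can be rerun by hand, under gdb or otherwise, with the test runner out of the way entirely.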
--------
Using `sys.exit` versus `os._exit`: I can see why one might have thought
`os._exit` was a good idea. We got what we came for (the function got
executed and the result is stored in an `.sobj` -- this should really be
communicated via a pipe to the parent, not via a temporary file), so why
risk fouling it up by doing more just to exit? However, if someone uses
this for side effects (say, writing to some shared file), the buffers
might not get flushed. On the other hand, code is executing in parallel
here (that's the point), so one would probably already run into problems.
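The buffer issue is easy to see in isolation. The sketch below is an illustration only, unrelated to the actual Sage code: the helper `run_with` and the child script are made up. It runs a small script whose stdout is a pipe (and hence block-buffered), once ending with `os._exit(0)` and once with `sys.exit(0)`.

```python
import os
import subprocess
import sys

# Child script: write to stdout without flushing, then exit one way or another.
CHILD_TEMPLATE = """
import os, sys
sys.stdout.write("result")
{exit_call}
"""


def run_with(exit_call):
    """Run the child script with the given exit statement; return its stdout."""
    # Drop PYTHONUNBUFFERED so the child's stdout really is buffered.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_TEMPLATE.format(exit_call=exit_call)],
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return proc.stdout


if __name__ == "__main__":
    print("os._exit(0):", run_with("os._exit(0)"))
    print("sys.exit(0):", run_with("sys.exit(0)"))
```

With `os._exit(0)` the buffered text is lost, whereas `sys.exit(0)` goes through normal interpreter shutdown and flushes it. An explicit `sys.stdout.flush()` before `os._exit` would be the middle road.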
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/715#comment:257>