On 04/09/2014 03:20 PM, roger riggs wrote:
Hi Peter,
On a related topic, the request to be able to destroy a Process and
all of its children
might also want to use the group pid to be able to identify all of
the children.
Hi Roger,
This would require each child spawned by Process API to be assigned its
own process group. The grandchildren would inherit this process group.
You could then send KILL/TERM signal to a process group in order to
destroy the child and all its descendants (that did not change the
process group in the meantime).
But we can only group processes for one purpose, since a process can
only belong to one group at a time. To send signals to a (sub-)tree of
processes, the child-parent relationship is more natural to follow, I
think, since no waiting/blocking is involved in sending the signals, so
enumerating and iterating is appropriate.
Waiting on children is another purpose where process group(s) could be
employed and I think they would be better spent this way.
I think I now have a picture of how this could work. See my reply to Martin.
Regards, Peter
Roger
On 4/9/2014 2:08 AM, Peter Levart wrote:
Hi Martin,
As you might have seen in my later reply to Roger, there's still hope
on that front: setpgid() + waitpid(-pgid, ...) might be the answer. I'm
exploring in that direction. Shells are doing it, so why can't JDK?
It's a little trickier for Process API, since I imagine that shells
form a group of processes from a pipeline which is known in advance
while Process API will have to add processes to the live group
dynamically. So some races will have to be resolved, but I think it's
doable.
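
A rough C sketch of the setpgid() + waitpid(-pgid, ...) combination,
under simplifying assumptions and not taken from any JDK patch. The
helper names spawn_into_group and reap_group, and the "gate" pipe that
keeps the children alive until the parent closes it, are invented for
this example. The gate sidesteps the races mentioned above rather than
resolving them.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child into process group pgid, where
   pgid == 0 means "start a new group led by this child". The child
   blocks on the gate pipe until the parent closes the write end. */
pid_t spawn_into_group(pid_t pgid, int gate_rd, int gate_wr) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char c;
        setpgid(0, pgid);
        close(gate_wr);                 /* keep only the read end */
        while (read(gate_rd, &c, 1) > 0) {
            /* nothing is ever written; read returns 0 at EOF */
        }
        _exit(0);
    }
    setpgid(pid, pgid);                 /* set from the parent side too */
    return pid;
}

/* Hypothetical helper: reap every child in group pgid and return how
   many were reaped. Children in other groups are left alone. */
int reap_group(pid_t pgid) {
    int reaped = 0;
    for (;;) {
        pid_t pid = waitpid(-pgid, NULL, 0);
        if (pid > 0) {
            reaped++;
            continue;
        }
        if (errno == EINTR)
            continue;                   /* interrupted, retry */
        break;                          /* ECHILD: group has no children left */
    }
    return reaped;
}
```

With a leader spawned via pgid 0, a second child added with the leader's
pid as pgid, and a third child in its own group, reap_group(leader) is
expected to collect only the first two once the gate is closed.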
Stay tuned.
Regards, Peter
On 04/08/2014 07:48 PM, Martin Buchholz wrote:
Peter, thank you very much for your deep analysis.
TIL and am horrified: signals on Unix are not queued, not even if
you specify SA_SIGINFO. Providing siginfo turns signals into proper
"messages" each with unique content, and it is unacceptable to
simply drop some (especially when proper queueing seems required for
so-called real-time signals), but at least the Linux kernel does so
very deliberately. 45 years later, we are still fighting with
unreliable Unix signals...
We can't call waitpid(WAIT_ANY, ) because we can only wait for
processes owned by the j.l.Process subsystem. We can't override
libc functions like waitpid because the JVM may be a "guest" in some
other process.
I don't know of any public examples, but it is reasonable to add a
JVM to a previously pure native code application, similarly to the
way Tcl or Lua is often used to provide a higher-level, safer
programming API to native code, and some programs at Google use this
strategy.
What problem are we actually trying to solve? The army of reaper
threads is ugly, but the inefficiency is greatly mitigated by the
use of small explicit stack sizes. Redoing the process code is
always risky, as we have already seen in this thread.
Maintaining a single child helper process which spawns all the
(grand)child processes seems reasonable, although it would create a
permanent intermediate entry in the process table (pstree?) which
might confuse some sysadmin scripts. Is it worth it?