Re: ProcessReaper: single thread reaper

Peter Levart Thu, 17 Apr 2014 07:24:06 -0700

On 04/16/2014 03:18 PM, roger riggs wrote:

Hi,


Another approach was suggested by a member of the Solaris team.

If you open /proc/pid O_RDONLY for any process you wish to monitor
and use poll(2), you can wait for a hangup event which indicates that
the process has exited.  You can then reap that process's status w/
waitpid.  You'll also want to wait on a pipe; as you fork additional
processes you can write their pid to the pipe and the monitoring thread
can wake up, read the pipe and add that fd to the pollfds.  This will
work for any version of Solaris you support, and it uses a fd per
process; using the pipe mechanism no locking is required.

I haven't had a chance to follow up, but it is interesting that monitoring /proc
can give an indication of Process termination, and some aspects should
work on Linux, not just Solaris.

Roger


Hi Roger,

An interesting idea. Another way to dynamically augment the list of fds the single thread is poll()ing is to use a signal to interrupt poll(), but using a pipe might be less hassle (thinking of setting up per-thread signal masks, ...).

So I think there's one more reason to create an internal API just for that purpose (waiting on child exits, reaping their exit statuses and dispatching events to Java threads) with multiple implementations selectable by system property. The API could be as simple as:



interface ProcessReaper {
    // called from Process API after spawning new child.
    // returns exit status future and registers the child
    // to be reaped.
    CompletableFuture<Integer> processStarted(int pid);

    // optional - returns exit status future of a child process
    // spawned by Process API which has not been reaped yet,
    // null otherwise
    CompletableFuture<Integer> getExitStatus(int pid);
}
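
To make the contract of the two methods concrete, here is a minimal sketch of just the bookkeeping half of one possible implementation. It is not the proposed JDK code: the class name RegistryReaper and the childReaped(...) hook are hypothetical, and the part that actually waits on children (waitpid, poll on /proc, ...) is deliberately left out and assumed to call the hook once it has reaped a pid.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// The interface as proposed in this message.
interface ProcessReaper {
    CompletableFuture<Integer> processStarted(int pid);
    CompletableFuture<Integer> getExitStatus(int pid);
}

// Hypothetical bookkeeping-only implementation (an assumption for illustration).
final class RegistryReaper implements ProcessReaper {
    // Children that were registered but have not been reaped yet.
    private final ConcurrentMap<Integer, CompletableFuture<Integer>> pending =
            new ConcurrentHashMap<>();

    @Override
    public CompletableFuture<Integer> processStarted(int pid) {
        CompletableFuture<Integer> future = new CompletableFuture<>();
        if (pending.putIfAbsent(pid, future) != null) {
            throw new IllegalStateException("pid already registered: " + pid);
        }
        return future;
    }

    @Override
    public CompletableFuture<Integer> getExitStatus(int pid) {
        // null once the child has been reaped (or was never registered).
        return pending.get(pid);
    }

    // Hypothetical hook: invoked by whatever strategy waits on children,
    // after it has reaped 'pid' and obtained its exit status.
    // Returns true if a registered future was completed by this call.
    boolean childReaped(int pid, int exitStatus) {
        CompletableFuture<Integer> future = pending.remove(pid);
        return future != null && future.complete(exitStatus);
    }
}
```

With this shape, the different waiting strategies would only differ in how and when they call childReaped(...); callers in the Process API see nothing but the future.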


Regards, Peter



On 4/14/2014 5:02 AM, Peter Levart wrote:

Hi Martin, Roger,

Just a thought. Would it be feasible to have two (or more) built-in strategies, selectable by system property? A backwards-compatible thread per child, using waitpid(pid, ...), a single reaper thread using waitpid(-1, ...), maybe also a single-threaded strategy accessible only on Linux/Solaris using waitid(-1, ..., WNOWAIT)... All packed nicely in a package-private interface (ProcessReaper) with multiple implementations?
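
A rough sketch of what the selection could look like follows. The property name "jdk.lang.Process.reaper" and the value strings are invented for illustration only; nothing in this thread fixes them. The default stays with the thread-per-child behaviour, matching the backwards-compatibility goal.

```java
// Hypothetical strategy selector; property name and values are assumptions.
enum ReaperStrategy {
    THREAD_PER_CHILD,        // one thread per child, waitpid(pid, ...)
    SINGLE_THREAD,           // one reaper thread, waitpid(-1, ...)
    SINGLE_THREAD_WNOWAIT;   // one reaper thread, waitid(..., WNOWAIT), Linux/Solaris only

    // Hypothetical system property name.
    static final String PROPERTY = "jdk.lang.Process.reaper";

    // Maps a property value to a strategy; null (property unset) keeps
    // the backwards-compatible default.
    static ReaperStrategy parse(String value) {
        if (value == null) {
            return THREAD_PER_CHILD;
        }
        switch (value) {
            case "threadPerChild":
                return THREAD_PER_CHILD;
            case "singleThread":
                return SINGLE_THREAD;
            case "singleThreadNoWait":
                return SINGLE_THREAD_WNOWAIT;
            default:
                throw new IllegalArgumentException(
                        "Unknown value for " + PROPERTY + ": " + value);
        }
    }

    static ReaperStrategy fromSystemProperty() {
        return parse(System.getProperty(PROPERTY));
    }
}
```

Each constant would then map to one ProcessReaper implementation, e.g. started with -Djdk.lang.Process.reaper=singleThread (again, a made-up name).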


Regards, Peter

On 04/12/2014 01:37 AM, Martin Buchholz wrote:

Let's step back again and try to check our goals...

We could try to optimize the one-reaper-thread-per-subprocess thing. But that is risky, and the cost of what we're doing today is not that high.

We could try to implement the feature of killing off an entire subprocess tree. But historically, any kind of behavior change like that has been vetoed. I have tried and failed to make less incompatible changes. We would have to add a new API.

The reality is that Java does not give you real access to the underlying OS, and unless there's a seriously heterodox attempt to provide OS-specific extensions, people will have to continue to either write native code or delegate to an OS-savvy subprocess like a perl script.

On Fri, Apr 11, 2014 at 7:52 AM, Peter Levart wrote:


    On 04/09/2014 07:02 PM, Martin Buchholz wrote:




    On Tue, Apr 8, 2014 at 11:08 PM, Peter Levart wrote:

        Hi Martin,

        As you might have seen in my later reply to Roger, there's
        still hope on that front: setpgid() + waitpid(-pgid, ...) might
        be the answer. I'm exploring in that direction. Shells are
        doing it, so why can't JDK?

        It's a little trickier for Process API, since I imagine that
        shells form a group of processes from a pipeline which is
        known in-advance while Process API will have to add
        processes to the live group dynamically. So some races will
        have to be resolved, but I think it's doable.


    This is a clever idea, and it's arguably better to design
    subprocesses so they live in separate process groups (emacs does
    that), but:
    Every time you create a process group, you change the effect of
    a user signal like Ctrl-C, since it's sent to only one group.
    Maybe propagate signals to the subprocess group?  It's starting
    to get complicated...


    Hi Martin,

    Yes, shells send Ctrl-C (SIGINT) and other signals initiated by
    terminal to a (foreground) process group. A process group is
    formed from a pipeline of interconnected processes. Each pipeline
    is considered to be a separate "job", hence shells call this
    feature "job-control". Child processes by default inherit the process
    group from their parent, so children born with Process API (and
    their children) inherit the process group from the JVM process.
    Considering the intentions of shell job-control, is propagating
    SIGTERM/SIGINT/SIGTSTP/SIGCONT signals to children spawned by
    Process API desirable? If so, then yes, handling those signals in
    JVM and propagating them to current process group that contains
    all children spawned by Process API and their descendants would
    have to be performed by JVM. That problem would certainly have to
    be addressed. But let's first see what I found out about
    sigaction(SIGCHLD, ...), setpgid(pid, pgid), waitpid(-pgid, ...),
    etc...

    waitpid(-pgid, ...) alone seems to not be enough for our task.
    Mainly because a process can re-assign its group and join some
    other group. I don't know if this is a situation that occurs in
    real world, but imagine if we have one live child process in a
    process group pgid1 and no unwaited exited children. If we issue:

        waitpid(-pgid1, &status, 0);

    Then this call blocks, because at the time it was given, there
    were >0 child processes in the pgid1 group and none of them has
    exited yet. Now if this one child process changes its process
    group with:

        setpgid(0, pgid2);

    Then the waitpid call in the parent does not return (maybe this
    is a bug in Linux?) although there are no more live child
    processes in the pgid1 group any more. Even when this child
    exits, the call to waitpid does not return, since this child is
    not in the group we are waiting for when it exits. If all our
    children "escape" the group in such a way, the thread doing the waiting
    will never unblock. To solve this, we can employ signal handlers.
    In a signal handler for SIGCHLD signal we can invoke:

        waitpid(-pgid1, &status, WNOHANG); // non-blocking call

    ...in a loop until it returns either (0), which means that there are
    no more unwaited exited children in the group at the moment, or
    (-1) with errno == ECHILD, which means that there are no more
    children in the queried group - the group does not exist
    any more. Since the signal handler is invoked with SIGCHLD being
    masked and there is one bit of pending signal state in the
    kernel, no child exit can be "skipped" this way. Unless the child
    "escapes" by changing its group. I don't know of a plausible
    reason for a program to change its process group. If a program
    executing as a JVM child wants to become a background daemon it
    usually behaves as follows:

    - fork()s a grand-child and then exit()s (so we get notified via
    signal and can successfully waitpid(-pgid, ...) for its exit status)
    - the grand-child then changes its session and group (becomes
    session and group leader), closes file descriptors, etc. The
    responsibility for waiting on the grand-child daemon is
    transferred to the init process (pid=1) since the grand-child
    becomes an orphan (has no parent).
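
The control flow of the WNOHANG drain loop described above can be modelled in plain Java, purely as an illustration. Nothing here touches the OS: the IntSupplier is a hypothetical stand-in for a native waitpid(-pgid1, &status, WNOHANG) call, and the two sentinel constants are assumptions that mirror the (0) and (-1 with ECHILD) outcomes; exit statuses are left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Models only the loop structure; all names are hypothetical.
final class GroupDrain {
    static final int NONE_READY = 0;    // stands for waitpid(...) == 0
    static final int NO_CHILDREN = -1;  // stands for waitpid(...) == -1, errno == ECHILD

    static final class Result {
        final List<Integer> reapedPids;  // pids reaped during this drain
        final boolean groupGone;         // true if the group has no children left

        Result(List<Integer> reapedPids, boolean groupGone) {
            this.reapedPids = reapedPids;
            this.groupGone = groupGone;
        }
    }

    // Calls the non-blocking wait repeatedly, as a SIGCHLD handler would,
    // until there is nothing more to reap right now or the group is empty.
    static Result drain(IntSupplier waitNoHang) {
        List<Integer> reaped = new ArrayList<>();
        while (true) {
            int r = waitNoHang.getAsInt();
            if (r == NONE_READY) {
                return new Result(reaped, false);
            }
            if (r == NO_CHILDREN) {
                return new Result(reaped, true);
            }
            reaped.add(r);  // a positive value is the pid of a reaped child
        }
    }
}
```

Looping until one of the two sentinels is seen is what makes a single pending SIGCHLD sufficient: several exits that coalesce into one signal are all picked up by the same drain.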

    Ignoring this still-unsolved problem of a possibly ill-behaved
    child program that changes its process group, I started
    constructing a proof-of-concept prototype. What I will do in the
    prototype is start throwing IllegalStateException from the
    methods of the Process API that pertain to such children. I think
    this is reasonable.

    Stay tuned,

    Peter
