[ 
https://issues.apache.org/jira/browse/MESOS-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880879#comment-13880879
 ] 

Nikita Vetoshkin commented on MESOS-943:
----------------------------------------

I think it would be a good thing to mention in comments that there are several 
ways to deal with subprocess management in a Linux (UNIX) environment, and to 
clarify which one was chosen and why.

1. Every component manages its subprocesses itself, i.e. {{fork()+exec()}} and 
calling {{wait()}} either in a blocking fashion or by looping on 
{{waitpid() + WNOHANG}}.
This approach is okay in a multithreaded environment, where each component uses 
a separate thread to track subprocess status. But it doesn't scale to a large 
number of children. Don't forget that someone must also read the stdout/stderr 
of each child.
2. An enhanced version of approach 1 - a reaper thread, shared between 
components, that watches only specific children by looping through the list 
with {{waitpid() + WNOHANG}}.
That is the one currently implemented in the reviews. It's a tradeoff - better 
scaling (one thread watches all children) and components are free to manage 
subprocesses themselves, but there's still a busy loop.
3. A global reaper, which can either use a blocking {{wait()}} or work via a 
{{SIGCHLD}} handler that writes to a pipe, while the reaper thread sleeps on 
the pipe and calls {{waitpid() + WNOHANG}} upon waking up.
It's a scalable way, good for an asynchronous environment. It's the way 
subprocess handling is implemented in Qt and (I think) in libev itself. But it 
has one major drawback - all components MUST manage subprocesses through this 
reaper.
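
To make approach 2 concrete, here is a minimal sketch in Python, whose {{os}} 
module maps directly onto the POSIX calls. It is not the code from the reviews. 
The names {{spawn}} and {{reap_watched}}, the polling interval and the timeout 
are all assumptions made for illustration.

```python
import os
import time


def spawn(exit_code):
    """Hypothetical helper: fork a child that exits at once with exit_code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit immediately, skipping Python cleanup
    return pid


def decode(status):
    """Turn a raw wait status into an exit code (negative means a signal)."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def reap_watched(pids, interval=0.01, timeout=5.0):
    """Poll only the given pids with waitpid(pid, WNOHANG).

    Returns {pid: exit_code} for every watched child that exited before the
    timeout. Children that are not in `pids` are never waited on, so their
    owners can still collect the exit status themselves.
    """
    pending = set(pids)
    statuses = {}
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        for pid in list(pending):
            reaped, status = os.waitpid(pid, os.WNOHANG)
            if reaped == 0:
                continue  # child is still running
            pending.discard(pid)
            statuses[pid] = decode(status)
        if pending:
            time.sleep(interval)  # this sleep/poll cycle is the busy loop
    return statuses
```

The {{interval}} sleep is where the tradeoff shows up: a shorter interval 
reports exits sooner but wakes the thread more often, while approach 3 would 
replace the sleep with a blocking read on the pipe.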

> Provide an abstraction for asynchronous launching of subprocesses.
> ------------------------------------------------------------------
>
>                 Key: MESOS-943
>                 URL: https://issues.apache.org/jira/browse/MESOS-943
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>
> This has come up during [~idownes] changes to add containerization.
> We would like to be able to run commands asynchronously like:
> {{curl -O http://foo.com/bigfile.zip}}
> Currently, there is not an easy way to do this while having:
> 1. A Future handle on the exit status of the subprocess.
> 2. The means to 'discard' the future and consequently kill the subprocess 
> (e.g. stalled hadoop command).
> 3. Handles to stdin, stdout, stderr of the subprocess.
> The first issue is that we need to re-work the Reaper to not reap _all_ 
> subprocesses. Rather, we need to allow other components to reap their own 
> forked subprocesses without the slave's Reaper "stealing" the exit status 
> information. I've proposed that we move the Reaper into libprocess initially 
> with the only change being to reap the desired pids. (We can optimize this 
> later using a per-pid blocking thread or SIGCHLD).
> One concern is that if we 'leak' child processes by accidentally not reaping, 
> we may fill the process table with zombie processes. However, we have tight 
> control over where our code performs forks, and can enforce proper reaping.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
