On Wed, Aug 5, 2009 at 12:12 PM, Adam Langley <a...@chromium.org> wrote:

>
> Our child process reaping is a little bit of a hack right now, which
> is my fault. I didn't anticipate how bad it would turn out.
>
> Currently, we use a bunch of hacks to make sure that we reap all the
> children that we need to, but don't reap children from another part of
> the code etc. If we need to make sure that a process dies within x
> seconds, then we fork off a thread etc. process_util_posix.cc is scary
> complex and bodgy.
>
> Here's my plan:
>
> We have a singleton object in base which handles all forking and reaping:
>
> class ChildProcessReaper {
>  public:
>  // Has the same semantics as fork() - i.e. this function returns twice. In
>  // the parent process, the resulting child will be reaped on exit and the
>  // termination state will be saved. It can be accessed using
>  // GetTerminationState() below.
>  pid_t ForkAndRetainTerminationState(uint64_t* child_id);
>
>  // Same as ForkAndRetainTerminationState, but the child's termination
>  // state will not be retained on exit.
>  pid_t ForkAndForget(uint64_t* child_id);


Why do you need a child_id if you're going to forget about it anyway?


>  // Get the termination state of a child started with
>  // ForkAndRetainTerminationState(). If the child has terminated, then this
>  // function will return true and *status is set to the value obtained from
>  // wait(2).
>  bool GetTerminationState(uint64_t child_id, int* status);
>
>  // Wait |seconds| seconds for the given child to terminate. If it doesn't
>  // terminate within that amount of time, send SIGKILL. Calling this
>  // function is a no-op if the child has already terminated. If the child
>  // was created with ForkAndRetainTerminationState, then calling this
>  // function means that the child's termination state will no longer be
>  // retained.
>  void EnsureChildTerminates(float seconds, uint64_t child_id);
> };
>
>
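For reference, the wait-then-SIGKILL behaviour described for EnsureChildTerminates could look roughly like the sketch below. It is a simplified, standalone illustration: WaitOrKill is a hypothetical name, it takes a raw pid and calls waitpid(2) itself, whereas in the proposal the singleton would own all reaping and look the child up by its uint64_t id. Polling every 10 ms is also just an assumption made to keep the example short.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <chrono>

// Hypothetical helper (not Chromium code). Polls |pid| until it exits or
// |timeout| elapses; on timeout it sends SIGKILL and then reaps the child.
// Returns true once the child has been reaped, with |*status| holding the
// value obtained from waitpid(2).
bool WaitOrKill(pid_t pid, std::chrono::milliseconds timeout, int* status) {
  assert(status != nullptr);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  for (;;) {
    const pid_t r = waitpid(pid, status, WNOHANG);
    if (r == pid) return true;                    // child already terminated
    if (r == -1 && errno != EINTR) return false;  // e.g. not our child
    if (std::chrono::steady_clock::now() >= deadline) break;
    const timespec pause_10ms = {0, 10 * 1000 * 1000};
    nanosleep(&pause_10ms, nullptr);  // poll again in roughly 10 ms
  }
  kill(pid, SIGKILL);  // SIGKILL cannot be caught or ignored
  pid_t r;
  do {
    r = waitpid(pid, status, 0);  // blocking wait collects the zombie
  } while (r == -1 && errno == EINTR);
  return r == pid;
}
```

A caller would fork, keep the returned pid, and later call something like WaitOrKill(pid, std::chrono::milliseconds(500), &status). A real implementation would more likely arm a timer on the reaper's thread than poll.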
> This object is created in the IO thread of the browser. On Linux it'll
> create a signalfd to get SIGCHLD. On OS X it'll spawn off a
> PlatformThread which sits in wait(2) for child notifications. We can
> then ensure that we'll never get zombies.
>
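A minimal sketch of the Linux half of this, assuming a signalfd that is read on the reaper's own thread. OpenSigchldFd and ReapOnce are hypothetical names used only for illustration, and the std::map of pid to status stands in for whatever bookkeeping the real singleton would keep. The sketch is single-threaded; in a multi-threaded browser process SIGCHLD would have to be blocked in every thread (for example by blocking it before any thread is created) for the signalfd to see it reliably.

```cpp
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <map>

// Hypothetical helper: blocks SIGCHLD for the calling thread and returns a
// signalfd that becomes readable whenever a child changes state, or -1.
int OpenSigchldFd() {
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGCHLD);
  if (sigprocmask(SIG_BLOCK, &mask, nullptr) == -1) return -1;
  return signalfd(-1, &mask, SFD_CLOEXEC);
}

// Hypothetical helper: blocks until one SIGCHLD notification arrives, then
// reaps every child that has exited so far and records pid -> wait status.
// Several exits can be coalesced into one notification, hence the loop.
bool ReapOnce(int sfd, std::map<pid_t, int>* statuses) {
  assert(statuses != nullptr);
  signalfd_siginfo info;
  ssize_t n;
  do {
    n = read(sfd, &info, sizeof(info));
  } while (n == -1 && errno == EINTR);
  if (n != static_cast<ssize_t>(sizeof(info))) return false;
  for (;;) {
    int status = 0;
    const pid_t pid = waitpid(-1, &status, WNOHANG);
    if (pid <= 0) break;  // no more exited children right now
    (*statuses)[pid] = status;
  }
  return true;
}
```

In the proposed design the descriptor would presumably be watched by the IO thread's message loop rather than read in a blocking call as it is here.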
> The only wrinkle is that we need to return both a pid_t and another id
> (uint64_t above). Consider the following situation:
>
> 1) Thread A forks off a child, pid $x
> 2) The child runs, terminates and is reaped
> 3) Thread B forks off a child. Because $x has been reaped the kernel
> is free to reuse that pid
>
> Now both thread A and thread B hold pid $x. If EnsureChildTerminates took
> a bare pid, thread A could call it with $x and kill thread B's child.
>
>
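The pid-reuse scenario above can be sketched with a small table keyed by an id that is never reused. ChildTable and its method names are hypothetical and only illustrate the idea; the sketch does no forking and no locking, so a real version shared between threads would need a mutex around the map.

```cpp
#include <sys/types.h>

#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical bookkeeping for the reaper: callers hold a unique 64-bit id
// instead of a pid, so a recycled pid can never be mistaken for an old child.
class ChildTable {
 public:
  // Records a freshly forked child and returns its never-reused id.
  uint64_t Add(pid_t pid) {
    assert(pid > 0);
    const uint64_t id = next_id_++;
    children_[id] = Entry{pid, false, 0};
    return id;
  }

  // Called after wait(2) reported |pid|. Only the entry that is still live
  // can own that pid, because the kernel reuses a pid only after reaping.
  void OnReaped(pid_t pid, int status) {
    for (auto& kv : children_) {
      Entry& e = kv.second;
      if (e.pid == pid && !e.terminated) {
        e.terminated = true;
        e.status = status;
        return;
      }
    }
  }

  // True (and |*pid| set) only while the child is alive, i.e. safe to signal.
  bool LivePid(uint64_t id, pid_t* pid) const {
    const auto it = children_.find(id);
    if (it == children_.end() || it->second.terminated) return false;
    *pid = it->second.pid;
    return true;
  }

  // True (and |*status| set) once the child has been reaped.
  bool GetTerminationState(uint64_t id, int* status) const {
    const auto it = children_.find(id);
    if (it == children_.end() || !it->second.terminated) return false;
    *status = it->second.status;
    return true;
  }

 private:
  struct Entry {
    pid_t pid;
    bool terminated;
    int status;
  };
  std::map<uint64_t, Entry> children_;
  uint64_t next_id_ = 1;
};
```

With this shape, thread A's id still resolves to "terminated" after the kernel hands pid $x to thread B's child, so a kill requested through A's id never reaches B's process.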
> Any thoughts?


Before Chrome I worked on an app that also forked off a LOT of processes and
had to keep track of their termination states.  We did the exact same thing
you did here and it worked well.  Only difference is that we did it in
Python.  :-)

J

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---