Re: worker graceful restart warning message

Greg Ames Mon, 15 Apr 2002 12:07:56 -0700

Brian Pane wrote:
> 
> I'm seeing a race condition in which the worker MPM logs the
> "long lost child came home!" warning message.  The test case
> is:
>   - run "ab -c5" to create a steady load on the httpd
>   - while it's running, do a graceful restart.
> This will sometimes yield the "long lost child" message.
> 
> I added a bit of diagnostic logging and found that the order
> of events looks like this:
>   - child process for scoreboard slot X finishes its work and exits
>   - parent process forks a new child process and assigns it
>     to scoreboard slot X
>   - parent process notices that the first child process has
>     exited, looks for its pid in scoreboard, and doesn't find it
> 
> 
> Is this a harmless (and expected) warning case, or cause for
> alarm?


I think it's harmless.  If there are no scoreboard slots available that are
completely empty, one new process is allowed to "squat" on a scoreboard slot for
a process that is quiescing and has some unused thread slots.  Look at how
perform_idle_server_maintenance manipulates the free_slots array, starting
around line 1417 in worker.c.

Assuming the "squatting" scenario happens, the new process could overwrite the
pid field(s) in the scoreboard.  Then when the SIGCHLD logic in the parent
kicks in, it may not be able to find the dying process's pid in the scoreboard.

Bumping up ServerLimit to be significantly bigger than the number of processes
allowed by MaxClients will reduce the likelihood of scoreboard squatting.
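
As a rough illustration of that headroom, a config fragment might look like
the following.  The numbers are assumptions picked only to show the ratio,
not recommended settings.

```
# MaxClients / ThreadsPerChild = 150 / 25 = 6 processes under normal load.
# ServerLimit 16 leaves spare process slots in the scoreboard, so a new
# child is less likely to have to squat on a quiescing child's slot.
ServerLimit      16
ThreadsPerChild  25
MaxClients      150
```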

Could we do something to make the messages go away?  This is software, so the
only safe answer is "maybe".  I don't remember enough of the details of how this
all works to say if it would be easy.

Greg
