> Let's say we just want to shut down. Assume one thread in one process
> is sitting in apr_poll, waiting for an accept() or the pipe of death.
> That process is pretty easy to deal with, as long as you remember to
> write to the pipe of death (which we aren't now with the CVS code,
> btw).
Not writing to the pipe_of_death is just a bug.
> But the other processes which are blocked in mutex land are trickier.
> The pipe of death doesn't do anything directly to unblock the mutexes,
> and neither does the signal. However, since the apr_poll process is
> going to be good, wake up, and start to go away, it will eventually
> release the accept mutex to another process. Before this patch, it was
> just looking at workers_may_exit, local to the process. So if the
> signal thread hasn't been dispatched yet, it will just blow past that
This is the problem with the current logic. Remember that Manoj and I
spent a LOT of time getting this right, and it worked for a very long
time, which is why I question whether the algorithm is really wrong. I
believe this is an implementation bug, not a flaw in the logic.
Can't this whole thing be solved by having the process that was woken up
by the pipe_of_death set workers_may_exit before releasing the poll lock?
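A minimal sketch of that idea, not the actual MPM code: threads in a
single process stand in for the workers, a pthread mutex stands in for
the accept mutex, and only the pipe is polled (no listening sockets).
The names run_shutdown_demo, polls_entered and MAX_WORKERS are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_WORKERS 16              /* hypothetical limit for the sketch */

static pthread_mutex_t accept_mutex = PTHREAD_MUTEX_INITIALIZER;
static int workers_may_exit;        /* process-local; guarded by accept_mutex here */
static int polls_entered;           /* hypothetical counter, for the demo only */
static int pipe_of_death[2];

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&accept_mutex);
        /* Checked while holding the lock, so a flag set by the previous
         * holder is always seen before this thread decides to poll. */
        if (workers_may_exit) {
            pthread_mutex_unlock(&accept_mutex);
            break;
        }
        polls_entered++;

        struct pollfd pfd = { .fd = pipe_of_death[0], .events = POLLIN };
        int rc;
        do {
            rc = poll(&pfd, 1, -1);
        } while (rc < 0 && errno == EINTR);

        if (rc > 0 && (pfd.revents & POLLIN)) {
            char c;
            /* Woken by the pipe of death: set the flag BEFORE the mutex
             * is released, so the next holder exits instead of polling. */
            if (read(pipe_of_death[0], &c, 1) == 1)
                workers_may_exit = 1;
        }
        pthread_mutex_unlock(&accept_mutex);
    }
    return NULL;
}

/* Starts nthreads workers, writes one byte to the pipe of death, waits
 * for every worker to exit, and returns how many times poll() was entered.
 * Returns -1 if the pipe cannot be created or written. */
int run_shutdown_demo(int nthreads)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0;
    int result;

    assert(nthreads >= 1 && nthreads <= MAX_WORKERS);
    workers_may_exit = 0;
    polls_entered = 0;
    if (pipe(pipe_of_death) != 0)
        return -1;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, NULL) == 0)
            started++;
    }

    /* The "parent" asks for shutdown with a single byte. */
    ssize_t written = write(pipe_of_death[1], "!", 1);
    if (written != 1) {
        /* Could not signal via the pipe; release the workers directly. */
        pthread_mutex_lock(&accept_mutex);
        workers_may_exit = 1;
        pthread_mutex_unlock(&accept_mutex);
        close(pipe_of_death[1]);    /* wakes a poller with POLLHUP */
        pipe_of_death[1] = -1;
    }

    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    result = (written == 1) ? polls_entered : -1;
    close(pipe_of_death[0]);
    if (pipe_of_death[1] >= 0)
        close(pipe_of_death[1]);
    return result;
}
```

In this sketch only the thread that reads the death byte ever enters
poll(); every thread that takes the mutex afterwards sees
workers_may_exit and leaves, with no new connections needed to wake it.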
BTW, if the parent is signalling a graceful shutdown, the child process's
signal handler should never be awoken until all the other threads are
gone.
> test and think it should keep on working, and fall into the apr_poll
> forever (unless Netscape/IE5/etc help us out by sending in new
> connections). Remember we had to do the pipe of death earlier to wake
> up the process which first owned the accept mutex.
>
> Having the parent set the new life_status field in the scoreboard, and
> using that as workers_may_exit eliminates the race and the dependency on
> enough browsers pounding the server in order for it to shut down.
The problem I have with this is that workers_may_exit is specifically a
process-local variable. The parent process should not be touching it. It
is up to each process to decide how it goes away. Making it the parent's
job to set workers_may_exit, as this patch does, is incorrect.
Ryan
_______________________________________________________________________________
Ryan Bloom
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------