On Tue, 24 Apr 2001, Greg Ames wrote:
> [EMAIL PROTECTED] wrote:

> > Not writing to the pipe_of_death is just a bug.
>
> Agreed. Design bug or code bug?  (not that it matters)

code.

> >   which is why I am questioning that the algorithm is really wrong, I
> > believe it is a bug, not a logic error.
>
> what's the difference?   does anybody care?

There is a big difference.  a bug is a problem with the implementation, a
logic error is a problem in the design.

> > Can't this whole thing be solved by having the process that was woken up
> > by the pipe_of_death set workers_may_exit before releasing the poll lock?
>
> I know this patch works, even if the parent only sent one character on
> the pipe of death.
>
> What I haven't pondered very much is the loop sending a character per
> process.  I don't know if that's absolutely air tight.  It may be.
>
> > BTW, if the parent is signalling a graceful shutdown, the child process's
> > signal handler should never be awoken until all the other threads are
> > gone.
>
> hmmm...how can you tell for sure when all the worker threads are gone,
> unless you are the signal thread and you do a pthread join?

The easiest way, is to have the process itself RAISE the signal, instead
of making the parent process send the signal.  Yes, this gets the signal
to the child process before all the threads are gone, but it does keep the
parent out of the loop.

> > The problem I have with this, is that workers_may_exit is specifically a
> > local variable.
>
> yeah, but we're programmers, and we have the technology to change that.

No, you miss my meaning.  I'm not saying that currently workers_may_exit
is a local variable, I am saying that it should ALWAYS be a local
variable.  The whole point of the variable is that the child process
controls how it shuts itself down.  By exposing that variable, you are
giving the parent control over how the child shuts down.

Think of it this way, there are two stages to shutting down the child:

1)  notifying the child that it should shutdown
2)  actually shutting down.

Once the parent does #1, through the pipe of death, it is up to the child
to determine how best to shutdown.

> >                  The parent process should not be touching it.  It is up
> > to the process how it is going to go away.
>
> Which process decides whether the user wants a graceful shutdown vs.
> quick -n- dirty shutdown vs. a restart?

That is a part of the notifcation.  Workers_may_exit is not a part of the
notification, it is a part of the how.  Once the child process is told to
die, either gracefully or immediately, the parent should be completely out
of the loop.

What this patch is basically saying, is that the a threaded child process
does not know how to take itself down cleanly.  That is bogus.  It means
that when MaxRequestsPerChild is hit, we have to do somersaults to
shutdown cleanly.

> >   The fact that this patch moves
> > it to the parent's job to set workers_may_exit is incorrect.
>
> I respect your opinion, but disagree.  This patch works, and it has a
> really nice side benefit, as well.
>
> I think it's great that those death codes are in the scoreboard for
> admin reasons as well as for sealing up the timing windows.  mod_status
> could display them, and the admin would have a better view of what
> Apache is up to internally in our long running download scenarios, for
> example.  The ability for admins to see this kind of information will be
> important for the success of any threaded mpm.

What this patch does, is introduce yet another path for the child
processes to shutdown.  It separates how a child process dies for
MaxRequestsPerChild from how it dies for a graceful restart.  In one case,
the parent process sets a value in shared memory, in the other the threads
determine that they should die.

What should happen, is that in both cases, the child process is notified
(either by a byte from a pipe, or an if condition) that it should die, the
child process sets a local variable to inform the rest of the threads, and
the threads start to die.

By introducing the parent into the how a child process is going to die,
you have made things more complex, and added to the amount of shared
memory required.  And, I still haven't seen a good explanation of why the
current design won't work.  I do know that Manoj spent about three weeks
removing the race conditions from the original code.

Ryan

_______________________________________________________________________________
Ryan Bloom                              [EMAIL PROTECTED]
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Reply via email to