[modwsgi] Graceful daemon process restarts.

Graham Dumpleton Thu, 15 Apr 2010 23:32:35 -0700

When mod_wsgi daemon processes are recycled due to maximum requests,
when the CPU time limit is exceeded, or when triggered through a
signal sent by a user, it can be quite brutal. This is because the
sequence is as follows:


  Stop accepting new requests.

  Start a shutdown timer. This defaults to 5 seconds.

  Wait for any active requests to complete.

  If active requests complete before shutdown timer expires, then exit
immediately.

  If shutdown timer expires, exit immediately anyway, interrupting the
active requests.
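
As a rough illustration of the sequence above, here is a small Python
model of it. This is not mod_wsgi code; the function name and its
arguments are made up for the example, and it only works out when the
process would exit given how long each active request still needs.

```python
def normal_shutdown(active_request_remaining, shutdown_timeout=5.0):
    """Model the normal shutdown sequence (illustrative only).

    active_request_remaining: seconds each active request still needs.
    Returns (seconds_until_exit, interrupted).
    """
    # No new requests are accepted, so only the active ones matter.
    longest = max(active_request_remaining, default=0.0)
    if longest <= shutdown_timeout:
        # All active requests finish before the timer expires.
        return longest, False
    # Timer expires first: exit anyway, interrupting active requests.
    return shutdown_timeout, True
```

With the default 5 second timer, a request needing 12 more seconds is
cut off at 5 seconds, while requests needing 1 and 2.5 seconds let the
process exit after 2.5 seconds.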

Important to note is that no new requests are accepted. This means
that if you only have a single process, even if multithreaded, and you
have a long running request active, the site might stall for 5 seconds
until the process is terminated anyway. This delay is why, if you use
the maximum-requests option to WSGIDaemonProcess, it is important to
run more than one process to limit the chance of the site stalling.
That is, while one process is being recycled, other processes in the
daemon process group would still be accepting requests, presuming of
course they aren't coincidentally restarting at the same time.

To lessen the consequences of this I have added a new feature to
mod_wsgi subversion trunk for 4.0 for people to experiment with and
provide feedback.

This new feature is the introduction of a grace period which will
occur prior to the shutdown sequence above for recycling of mod_wsgi
daemon processes.

What happens is that during the grace period, the process will still
be exited immediately if all active requests complete, but because new
requests are still accepted, the site will not appear to stall.

Obviously if the site is being hammered and is never idle, then it
will still not get a chance to shut down cleanly. Instead, after the
grace period expires the normal shutdown sequence will apply, with new
requests locked out and one final chance given for any active requests
to complete before the process is forcibly exited.

To enable this new feature, the graceful-timeout option should be
specified to WSGIDaemonProcess in conjunction with the
maximum-requests or cpu-time-limit options.

Because both of these options effectively represent voluntary process
recycling and exactly when it occurs is not too critical, you can
specify the grace period to be as long as you want. Obviously, make it
too long and you may exacerbate whatever condition you are trying to
limit through using those options in the first place, i.e., memory
leaks and/or code gone out of control in a tight loop.

An example may therefore be:

  WSGIDaemonProcess memory-leaking-application maximum-requests=10000 graceful-timeout=30

The argument to graceful-timeout is a number of seconds. Thus for this
example, after 10000 requests have been processed, the process will
enter a grace period of 30 seconds. Requests are still accepted, but
if the process becomes idle within that time it will shut down and a
new one will be started in its place. If the 30 seconds expire, then
new requests will no longer be accepted. If all active requests then
complete, the process again shuts down immediately, but if a further
5 seconds elapse (the default shutdown-timeout), the process is
forcibly exited.
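
To make the timeline concrete, here is a small Python model of the
grace period followed by the normal shutdown sequence. It is only a
sketch of the behaviour described in this message, not how mod_wsgi
implements it. The function name, the (arrival, duration) request
format and the phase labels are all invented for the example.

```python
def recycle_outcome(requests, graceful_timeout=30, shutdown_timeout=5):
    """Return (exit_time, phase) for a recycled process (illustrative).

    requests: list of (arrival, duration) in seconds, with arrival
    measured from the moment recycling was triggered. A negative
    arrival means the request was already active at that moment.
    """
    busy_until = 0
    for arrival, duration in sorted(requests):
        if arrival >= graceful_timeout:
            # Grace period is over: new requests are locked out.
            continue
        if arrival > busy_until:
            # Process went idle before this request arrived.
            return busy_until, "grace"
        busy_until = max(busy_until, arrival + duration)
    if busy_until <= graceful_timeout:
        return busy_until, "grace"
    deadline = graceful_timeout + shutdown_timeout
    if busy_until <= deadline:
        # Active requests finished before the shutdown timer expired.
        return busy_until, "shutdown"
    # Shutdown timer expired: process is forcibly exited.
    return deadline, "forced"
```

For the example configuration, recycle_outcome([(-1, 3)]) gives
(2, "grace"), whereas a site that is never idle, such as
[(-1, 20), (15, 20), (29, 10)], ends up with (35, "forced").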

As well as applying to restarts due to maximum-requests and
cpu-time-limit options, it is also now possible to send the Apache
graceful restart signal to the daemon processes. This signal is
usually SIGUSR1 on UNIX systems.

So, instead of sending a SIGINT or SIGTERM to daemon processes, which
still forces a normal shutdown per shutdown sequence listed above,
send a SIGUSR1 and it will apply this grace period first.
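
If you would rather script this than use kill by hand, something like
the following Python sketch would do it on a UNIX system. The helper
name is made up, and how you obtain the pids of the daemon processes
is left to you (it is an assumption here that you already have them).

```python
import os
import signal


def request_graceful_restart(pids, sig=signal.SIGUSR1):
    """Send sig to each pid; return the pids that could not be signalled."""
    failed = []
    for pid in pids:
        try:
            os.kill(pid, sig)
        except (ProcessLookupError, PermissionError):
            # Process has already gone, or is owned by another user.
            failed.append(pid)
    return failed
```

You will normally need to run this as root or as the user the daemon
processes run as, otherwise the pids will come back in the failed
list.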

These options therefore give you a way of restarting processes in a
way which minimally impacts on active requests being handled by the
daemon process.

Note that this feature, and in particular SIGUSR1, shouldn't be relied
upon as a good way of restarting daemon processes when code changes
have been made and you are running more than a single daemon process.

This is because you will not know when each process is restarted, and
you could end up with processes running at the same time which are
using different code. This could cause issues if requests for the same
user session are handled by different processes and return
incompatible responses.

As such, when making code changes and running with multiple processes,
you should still use the feature whereby the WSGI script file is
touched to force daemon processes to restart. This is because that
will guarantee that all new requests use the same code, by virtue of
the fact that it uses the normal shutdown sequence above. That does
however mean that a process could stall, and active requests are more
likely to be interrupted if they are long running, but that is the
trade off.
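
Touching the WSGI script file just means updating its modification
time, the same as the touch command does. From a Python deployment
script that could be sketched as below. The function name is invented,
and the path you pass in is whatever your own WSGI script file is.

```python
import os


def touch_wsgi_script(path):
    """Set the file's access and modification times to now.

    Returns the new modification time. The file must already exist.
    """
    os.utime(path, None)
    return os.stat(path).st_mtime
```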

Also note that the graceful shutdown doesn't apply when Apache itself
is being restarted, whether that be a normal Apache restart or a
graceful restart. It is simply not possible to do a graceful restart
of daemon processes in the latter case because of the way that Apache
treats child processes which aren't its own server child processes.
:-(

Now, if you have understood all this and are using mod_wsgi from
subversion trunk, then please experiment and let me know if you
experience problems. I still need to do more review of the code to
make sure I got it right, but on simple checks it seems to do what it
was intended to do.

Enjoy.

Graham
