How much work would it take to have true graceful restarts for the mod_wsgi
daemon processes?

Current behavior:
When "apache2ctl graceful" aka "httpd -k graceful" runs, the Apache parent
process sends a SIGTERM to each mod_wsgi daemon process, waits up to 3
seconds (hardcoded maximum), and sends a SIGKILL to any that are still
alive. After they're all dead, it spawns new wsgi processes. This is
mentioned in various issues like #383
<https://github.com/GrahamDumpleton/mod_wsgi/issues/383> and #124
<https://github.com/GrahamDumpleton/mod_wsgi/issues/124>, and in the
documentation of WSGIDaemonProcess shutdown-timeout
<https://modwsgi.readthedocs.io/en/master/configuration-directives/WSGIDaemonProcess.html#:~:text=shutdown%2Dtimeout>.
In contrast, Apache sends SIGUSR1 to its own worker processes, and whenever
one of them exits, Apache spawns a new one. So there should almost always
be enough processes ready to serve new connections
(https://httpd.apache.org/docs/2.4/stopping.html#graceful).

My wishlist for "true" graceful restarts would be:
1. Make the shutdown timeout configurable.
2. Don't wait until *all* old daemon processes exit. Either spawn 1 new
process whenever 1 old process exits, or spawn all N new processes
immediately and let the old processes exit when they want.
3. Add another signal between the SIGTERM and SIGKILL which throws a Python
exception, so that "finally:" blocks have a chance to run.
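
A minimal sketch of what item 3 could look like in plain Python, outside
mod_wsgi. It assumes SIGUSR1 as the intermediate signal and uses a
hypothetical ShutdownRequested exception; neither name comes from mod_wsgi,
and mod_wsgi's own signal handling would have to cooperate for this to work
inside a daemon process.

```python
import os
import signal
import threading
import time


class ShutdownRequested(Exception):
    """Hypothetical exception raised when the warning signal arrives."""


def raise_shutdown(signum, frame):
    # Python runs signal handlers in the main thread, so the exception
    # surfaces inside whatever the main thread is executing at that moment.
    raise ShutdownRequested(f"received signal {signum}")


def handle_request(log):
    """Stand-in for a slow request that must clean up after itself."""
    try:
        log.append("started")
        while True:
            time.sleep(0.05)  # pretend to work until interrupted
    finally:
        # After a SIGKILL this line would never run.
        log.append("cleaned up")


def demo():
    log = []
    previous = signal.signal(signal.SIGUSR1, raise_shutdown)
    # Simulate the parent process sending the warning signal after 0.2 s.
    timer = threading.Timer(0.2, os.kill, args=(os.getpid(), signal.SIGUSR1))
    timer.start()
    try:
        handle_request(log)
    except ShutdownRequested:
        log.append("interrupted")
    finally:
        timer.cancel()
        signal.signal(signal.SIGUSR1, previous)
    return log


if __name__ == "__main__":
    print(demo())
```

The exception only interrupts Python code running in the main thread, so
request handlers running in other threads would need a separate mechanism.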

Current code:
The linked GitHub issues mention that this behavior is hardcoded deep in
Apache and that there is nothing mod_wsgi can do, but I wanted to see it
for myself.
Actually, the logic is not anywhere in https://github.com/apache/httpd (in
particular, it's NOT server/mpm_unix.c
<https://github.com/apache/httpd/blob/trunk/server/mpm_unix.c>), but in
https://github.com/apache/apr. Specifically the SIGKILL is sent at
apr/memory/unix/apr_pools.c#L2810
<https://github.com/apache/apr/blob/39c271bca156adee03ff49f864dcce27ae6f5d73/memory/unix/apr_pools.c#L2810>
and the 3-second timeout is hardcoded at apr/memory/unix/apr_pools.c#L98
<https://github.com/apache/apr/blob/39c271bca156adee03ff49f864dcce27ae6f5d73/memory/unix/apr_pools.c#L98>.
Any subprocess registered with apr_pool_note_subprocess(...,
APR_KILL_AFTER_TIMEOUT) will use that timeout. mod_wsgi calls that function
at server/mod_wsgi.c#L10566
<https://github.com/GrahamDumpleton/mod_wsgi/blob/dabb377a29cba190c6c48659e3f81df685e47aad/src/server/mod_wsgi.c#L10566>.
The pool where the subprocesses are registered is the pconf pool given to
wsgi_hook_init. I guess they are probably killed when Apache
calls apr_pool_clear(process->pconf) in reset_process_pconf() in main.c,
but I haven't verified this.
The normal worker process logic is implemented in each MPM. For example, I
think prefork replaces dead children with new live children at
server/mpm/prefork/prefork.c#L1145
<https://github.com/apache/httpd/blob/6596870481dc1f0e28ac59c52455691fee9c8524/server/mpm/prefork/prefork.c#L1145>.

My thoughts: (please correct me if I'm wrong)
This seems pretty hard. I definitely see why it hasn't been done yet. And
maybe it's not worth the complexity even if it is possible.
Originally I hoped I could just write an Apache patch to replace the
hardcoded timeout value with a config file option. But the logic is in a
library (apr) so I can't read Apache config directly, and there might be
API/ABI concerns with extending apr_pool_note_subprocess(). And anyway,
*only* making the timeout configurable wouldn't be enough because the
server would just wait without any mod_wsgi process accepting new
connections.
I think the best chance of success would be to stop using apr_pool_t and
apr_pool_note_subprocess() for process management in mod_wsgi. After all,
it's not the only way: mod_wsgi could either use fork() etc. directly, like
the MPM modules, or at least keep apr_pool_t but use its own custom pool
rather than "pconf", most likely saved with ap_retained_data_get(). That
way mod_wsgi would have more control. When it learns the server is
gracefully restarting, it would spawn new daemon processes immediately with
a new socket name, and time out and kill the old processes later in the
background. When it learns the server is stopping, it would block until the
children have terminated.
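
The restart ordering described above can be sketched in Python to show the
control flow. This is only an illustration under stated assumptions, not
mod_wsgi code: the daemons are stand-in subprocesses that just sleep, the
helper names (spawn_daemons, reap, graceful_restart) are hypothetical, and
a real implementation would live in C inside the Apache parent process.

```python
import subprocess
import sys
import threading
import time

# Stand-in for a daemon process: it sleeps until it is signalled.
DAEMON_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]


def spawn_daemons(count):
    """Start `count` stand-in daemon processes."""
    return [subprocess.Popen(DAEMON_CMD) for _ in range(count)]


def reap(old, timeout):
    """SIGTERM every old process, then SIGKILL any that outlive `timeout`.

    Returns the number of processes that had to be killed.
    """
    for proc in old:
        proc.terminate()  # sends SIGTERM
    deadline = time.monotonic() + timeout
    killed = 0
    for proc in old:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            proc.kill()  # sends SIGKILL
            proc.wait()
            killed += 1
    return killed


def graceful_restart(old, timeout):
    """Start replacements first, then retire the old set in the background."""
    new = spawn_daemons(len(old))
    reaper = threading.Thread(target=reap, args=(old, timeout))
    reaper.start()
    return new, reaper


if __name__ == "__main__":
    old = spawn_daemons(2)
    new, reaper = graceful_restart(old, timeout=5.0)
    reaper.join()   # a full stop would block like this
    reap(new, 5.0)  # clean up the demo processes
```

The point of the ordering is that the replacement processes are already
running before any old process receives SIGTERM, so the timeout can be long
without leaving a window in which nothing accepts connections.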

Does this make sense? Are there any glaring issues I've overlooked?

If the strategy sounds sensible, and if I have enough time, I might try to
code this. Is it something you would be potentially interested in merging?
(not too much code review burden, maintenance burden, or risk of new bugs)

Just for completeness, the backstory of why I want this:
My Python app writes files to disk. Sadly, some requests take more than 3
seconds. If the daemon process is killed with SIGKILL mid-request, the
buffered file data is never written, leaving an empty or truncated file. A
later batch process fails
when it tries to read every file in the output directory. I know there are
many workarounds, such as using a temporary file and atomically renaming
it, but I became curious about the root cause.
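
For reference, the temporary-file workaround mentioned above might look
like the following sketch. The function name write_atomically and the
".tmp-" prefix are illustrative choices, not taken from the app in
question. The temporary file is created in the same directory as the target
so that the rename stays on one filesystem.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write `data` (bytes) to `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # os.replace() swaps the file into place atomically on POSIX.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    write_atomically("report.txt", b"complete contents\n")
```

A SIGKILL in the middle of the write can still leave a ".tmp-" file behind,
so the batch process would need to skip names with that prefix, or the
temporary files could go in a separate directory on the same filesystem.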
The server gracefully restarts every day because of log rotation, using
Ubuntu's default logrotate config. After reading #383
<https://github.com/GrahamDumpleton/mod_wsgi/issues/383> I also looked at
Apache's rotatelogs
<https://httpd.apache.org/docs/2.4/programs/rotatelogs.html>, but it
doesn't support compression, so I'd rather stay with logrotate.

Versions: Apache 2.4.41 with mpm_prefork, mod_wsgi 4.6.8 in daemon mode,
Python 3.8.10, Ubuntu 20.04. (old but I don't think this matters)

Tomi
