David Fraser wrote:
Jim Gallacher wrote:
David Fraser wrote:
Hi
I thought it would be good to take this across to python-dev. I've read
through
https://issues.apache.org/jira/browse/MODPYTHON-109?page=all
and the discussion in
http://www.modpython.org/pipermail/mod_python/2006-January/019865.html
http://www.modpython.org/pipermail/mod_python/2006-January/019866.html
and
http://www.modpython.org/pipermail/mod_python/2006-January/019870.html
again, and I'm just not sure about this.
Basically, Apache seems to provide some sort of mechanism for child
processes to clean themselves up, and for modules to clean up their
resources in a particular child.
The argument to remove the ability to clean up Python objects seems to
be that:
A) The finalize method was being called in an awkward place (from inside
a signal handler) and other code may be running and have the GIL, so it
may not be called at all, even in a graceful shutdown.
B) A normal restart will just send a TERM signal, which doesn't give
proper opportunity for cleanup
C) If the graceful shutdown doesn't work or respond quickly, Apache will
just kill the process anyway, so we may as well live with being killed
(talk about mixed metaphors...)
D) Since databases etc have to deal with the client process being
killed, they generally will handle this
I accept that problem A with the finalizing methods is a real problem,
but wonder if there are alternate solutions that can be provided to
allow cleanups to be attempted.
I don't think that B or C is a good argument - in that case why would
Apache be providing the hooks to clean up anyway? It feels like throwing
in the towel...
And D just seems impolite - if we can try and clean up we should.
Of course, if we can't manage to call finalize methods even in a
graceful shutdown none of this may be possible...
Trying to find relevant info on this from the Apache docs and other
module documentation:
http://httpd.apache.org/docs/2.2/stopping.html#gracefulstop
talks about advising children to exit after their current request. In
this case it would seem the cleanup methods should get called at the end
of the request processing, and thus shouldn't be in a signal handler
(and there should be no other Python code executing...)
Except that the parent "advises" its children by sending a signal,
doesn't it?
On Unix it does, but I'm not sure about Win32.
I'm not sure about Win32 either, since it doesn't have any child
processes...
Anyway, if the exit is
not actually done from the signal handler, but the signal handler is simply
flagging that an exit should be done after the current request, then the
cleanup could be done alongside the exit and outside of the signal
handler...
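
The "flag in the handler, clean up after the request" idea can be sketched in plain Python. This is a toy model under stated assumptions, not mod_python or Apache code: the Worker class, its method names and run_demo are all hypothetical, and the signal is simulated by calling the handler directly from inside a request.

```python
import signal


class Worker:
    """Toy request loop standing in for an Apache child (hypothetical)."""

    def __init__(self):
        self.exit_requested = False
        self.cleanups = []
        self.log = []

    def on_signal(self, signum, frame):
        # The handler does the bare minimum: it only records the request
        # to shut down. No cleanup code runs inside the handler.
        self.exit_requested = True

    def register_cleanup(self, func):
        self.cleanups.append(func)

    def serve(self, requests):
        for handle in requests:
            handle()  # the current request always runs to completion
            if self.exit_requested:
                break
        # Back in normal context: no handler is active here, so the
        # cleanups run as ordinary code, newest registration first.
        for func in reversed(self.cleanups):
            func()


def run_demo():
    worker = Worker()
    worker.register_cleanup(lambda: worker.log.append("cleanup"))

    def first_request():
        worker.log.append("request 1 start")
        # Simulate the parent's "please exit" signal arriving mid-request.
        worker.on_signal(signal.SIGTERM, None)
        worker.log.append("request 1 end")

    def second_request():
        worker.log.append("request 2")

    worker.serve([first_request, second_request])
    return worker.log
```

In a real process the handler would be installed with signal.signal(signal.SIGTERM, worker.on_signal) instead of being called by hand. In the demo the first request finishes, the second is never started, and the cleanup runs last.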
http://www.apachetutor.org/dev/pools
talks about using pools to allocate/deallocate resources other than
memory - could we provide a way to register Python objects that need
cleanup using this mechanism?
That *is* the mechanism that mod_python uses to register cleanups.
req.register_cleanup uses the request pool, and
apache.register_cleanup uses the server pool (child_init_pool).
Good then :-)
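
As a rough illustration of the two lifetimes described above, here is a small pure-Python model of pools with registered cleanups. The Pool class is hypothetical and only mimics the idea; the callable-plus-data shape of register_cleanup and the newest-first ordering are assumptions of this sketch, not statements about the mod_python or APR implementations.

```python
class Pool:
    """Toy model of a resource pool with registered cleanups (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self._cleanups = []

    def register_cleanup(self, func, data=None):
        # Remember what to call, and with which argument, at destroy time.
        self._cleanups.append((func, data))

    def destroy(self):
        # Run cleanups newest-first; each one runs exactly once.
        while self._cleanups:
            func, data = self._cleanups.pop()
            func(data)


events = []
server_pool = Pool("server")    # lives as long as the child process
request_pool = Pool("request")  # lives for a single request

server_pool.register_cleanup(events.append, "close db connection")
request_pool.register_cleanup(events.append, "remove temp file")

request_pool.destroy()  # end of the request
server_pool.destroy()   # child exit
```

The point of the model is that a cleanup's timing is decided by the pool it is registered on: the request-level cleanup fires at the end of the request, the server-level one only when the child goes away.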
Am I barking up the wrong tree or is this worth investigating further?
David
It's worth investigating. There may be a solution, but we just can't
see it. I don't think anyone would dispute that the current proposal to
drop the server cleanup is sub-optimal, but the current implementation
is worse than having no cleanup at all.
OK great that's reassuring. I forgot to mention in the above email the
mod_perl documentation that seems to indicate that mod_perl does this:
http://modperlbook.org/html/ch05_03.html
Interesting, in as much as it touches on the problem we are trying to
solve here. See section 5.3.2.
http://162.105.203.19/apache-doc/24.htm#BIN67
http://162.105.203.19/apache-doc/79.htm#BIN172
I've been reading this book, "Writing Apache Modules with Perl and C",
the last couple of days. :) It's a darn good yarn, even if I did figure
out who-done-it by the end of the first chapter. As I'm reading I keep
having a recurring fantasy... "wouldn't it be great to have this kind of
resource for mod_python"? I think I need to get out more. :)
What you need to realize is that mod_python is not doing anything
exotic. We are all playing in the same sandbox by the same rules imposed
by apache. Callbacks for things like child initialization and exit, or
any other phase get triggered the same way in any module. What we are
bumping into with this particular bug is a limitation of the python
interpreter, and the whole GIL problem.
Really though, isn't this whole discussion actually about database
connection pooling? Doesn't that cover 99% of the cases people care
about? If so maybe our energies would be better focused on what may be
required to support mod_dbd within mod_python.
Database connection pooling does seem a large amount of it, but we also
do other things from within Apache like launching separate index
processes or running things like Excel via COM. At the moment the
indexing process watches the parent process and exits when it does, but
it might be quite nice to be able to tell the child process it should
exit explicitly.
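
For readers unfamiliar with the "watch the parent" approach, one way it might look on Unix is sketched below. This is an assumption about the technique, not the code actually used: parent_is_alive and watch_parent are hypothetical names, and the check relies on an orphaned process being re-parented so that os.getppid() changes. That behaviour does not carry over to Win32.

```python
import os
import time


def parent_is_alive(original_ppid):
    # Assumption (Unix): once the parent exits, this process is adopted by
    # another process, so the reported parent PID no longer matches.
    return os.getppid() == original_ppid


def watch_parent(original_ppid, on_exit, poll_seconds=1.0, max_polls=None):
    """Poll until the parent goes away, then call on_exit.

    Returns True if on_exit was called, or False if max_polls was reached
    while the parent was still alive.
    """
    polls = 0
    while parent_is_alive(original_ppid):
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
        time.sleep(poll_seconds)
    on_exit()
    return True
```

A helper process would typically record os.getppid() at startup and call watch_parent from a background thread. Polling is what makes this indirect, which is why an explicit "please exit" message from the parent would be cleaner.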
http://httpd.apache.org/docs/2.2/mod/mod_dbd.html
Although it may be awkward to use mod_dbd with its limited set of
database drivers and functionality, when there is the Python DB-API ...
I've looked at the mod_dbd documentation before - how do you even
execute a SQL statement and retrieve the rows? Maybe I'm missing how it
works...
But if we could get mod_dbd to manage Python DB-API connections and pool
them, now that would be cool as it would require minimal changes to
existing Python code...
I've only taken a cursory glance at mod_dbd and the underlying apr_dbd,
and only in a few stolen moments during the day today, but my gut tells
me that it may not be as simple as I had hoped this morning. I have a
feeling that we might actually need to write a Python DB-API wrapper
around the apr_dbd_* calls. This would certainly be a non-trivial thing,
but would be kinda cool. Having this wrapper would allow us to add SQL
database functionality (I'm thinking about the often discussed
SQL-Session subclass) without worrying about any particular database
dependency, and would likely be a real boon for mod_python.
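
To make the shape of such a wrapper concrete, here is a minimal sketch of a DB-API 2.0 style front end layered over a driver object. Everything below is hypothetical: FakeDriver is an in-memory stand-in and does not reflect the real apr_dbd_* functions, whose bindings would have to be written in C. Only the Connection and Cursor method names follow the DB-API.

```python
apilevel = "2.0"
paramstyle = "qmark"


class FakeDriver:
    """Hypothetical stand-in for a C-level driver; NOT the apr_dbd API."""

    def __init__(self, rows):
        self._rows = rows
        self.closed = False

    def select(self, statement, params):
        # A real driver would send the statement to the database here.
        return list(self._rows)

    def close(self):
        self.closed = True


class Cursor:
    def __init__(self, driver):
        self._driver = driver
        self._results = []
        self.rowcount = -1

    def execute(self, operation, parameters=()):
        self._results = self._driver.select(operation, tuple(parameters))
        self.rowcount = len(self._results)

    def fetchone(self):
        # Return the next row, or None when the result set is exhausted.
        return self._results.pop(0) if self._results else None

    def fetchall(self):
        rows, self._results = self._results, []
        return rows

    def close(self):
        self._results = []


class Connection:
    def __init__(self, driver):
        self._driver = driver

    def cursor(self):
        return Cursor(self._driver)

    def commit(self):
        pass  # transaction handling is omitted from this sketch

    def rollback(self):
        pass  # transaction handling is omitted from this sketch

    def close(self):
        self._driver.close()


def connect(rows):
    # A real connect() would take connection parameters, not canned rows.
    return Connection(FakeDriver(rows))


conn = connect([(1, "alice"), (2, "bob")])
cur = conn.cursor()
cur.execute("SELECT id, name FROM users WHERE id > ?", (0,))
first = cur.fetchone()
rest = cur.fetchall()
conn.close()
```

Existing code written against the DB-API would only see connect(), cursor(), execute() and the fetch methods, so pooling and driver selection could live entirely behind the driver object.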
My guess is we won't see a Python DB-API wrapper magically appearing
from outside of the mod_python community, so if we want it... start
hacking. :) It might be something to consider for 3.4. It could also
make a nice Google Summer of Code project for next year if nothing
happens in the interim.
Jim