Nic Ferrier wrote:
Jim Gallacher <[EMAIL PROTECTED]> writes:
Here are some further things to consider if anyone wants to persue it.
Consider the following code:
import logging
from mod_python import apache
from proposed_mp_logging_module import ApacheLogHandler
def handler(req)
req.content_type = 'text/plain'
log = logging.getLogger('request')
hdlr = ApacheLogHandler(req)
log.addHandler(hdlr)
log.setLevel(logging.INFO)
msg = 'Its all good'
log.info('%s' % msg)
req.write('check the logs for "%s"' % msg)
apache.OK
All fine and dandy. But isn't logger a singleton? So each time a request
is processed we'll be adding a reference to the request object, which
will never get garbage collected and result in a memory leak.
There might be a fault with my design here. To help me understand what
you're saying can you confirm that when you say:
"But isn't logger a singleton"
do you mean:
"But isn't 'log' a singleton?" (ie: 'log' refers to the instance
variable in your example)
I meant log but perhaps should have said logging. All log handlers are
stored in the module level variable logging.root. Take a look at
logging.__init__.py.
Furthermore, you can't depend on the request object being valid once
the request processing has completed. At some point request_tp_clear (in
requestobject.c) will get called and request->server will be set to
NULL. (At least I think this is correct). My gut tells me you'll get
some random seg faults.
My gut was right. Your current version segfaults (using mpm-prefork). It
might work if you register a cleanup to remove the reference to your
logging handler (untested):
req.register_cleanup(log.removeHandler, hdlr)
The problem here is that we are depending on the user to add that line
to their code to avoid the segfault. I don't like the idea of including
code in mod_python which may result in a segfault, even if the user is
warned of the dangers in the docs.
There are only two ways this can happen I think:
1. the user saves the logger for later (maybe in a thread) and tries
to log after the request has finished
This would be user error and therefore acceptable IMHO.
I should point out that this is unlikely to happen since loggers
tend to be very local objects with small scopes.
Loggers have module level scope. When you add a handler you are adding
it to logging.root. In a given process, each call to
log.addHandler(hdlr) adds another ApacheLogHandler instance with it's
associated request object. Thus you would expect that when you try to
log an error you would see multiple messages in the apache log, but
mod_python actually segfaults instead, since most request objects
(except the current one) are likely invalid. Thus the need to remove the
handler at the end of the request as mentioned above.
2. the logging record doesn't get flushed before the request has
completed so that when it is flushed the correct state is no longer
available.
This is also possible, clearly it would be a bug because the
logging record should be flushed immediately (if that's the
implementation of the handler)
The problem is not with the current request. That will be ok as long as
you are logging your message prior to the end of the request processing
phase. The problem is with the references to old requests being held by
logging.root.
Also, is there not an assumption that the logger instance is in effect
global? So let's say you change the level in one request such as
log.setLevel(logging.CRITICAL). In mpm-prefork or mpm-worker models
changing the log level will not propagate to the other child processes.
I expect most users will find this confusing.
Not sure why you think that.
Which part? - That users will want to change the log level or that
they'll get confused? ;) Or the global part?
If mod_python included a module based on my design there would be a
clear understanding that it was specifically for request derived
logging (the most common case).
It's particularly clear from my design because it takes a request as
an argument. You couldn't create one of my log handlers globally for
example.
It wasn't clear to me. The request argument for
ApacheLogHandler._init__() is optional and when it's missing you create
an instance of ProxyLogThing. ProxyLogThing.log_error calls
apache.log_error which suggests to me this could be be used outside of a
request.
Also the natural thing to do with logging is to have one per unit of
code.
As I said before I don't think this is as trivial to implement as it
might seem at first blush.
I think it has all of the same problems as any other multi-programming
(threads or processes) code. It's true that a lot of users don't
understand those issues but that's not exclusive to logging.
Yep. It sure messes me up, and trying to do things inside of apache just
makes everything more complicated.
Just to be clear Nic, I'm not fighting you on the *idea* of including
your ApacheLogHandler - I may even be in favour of it. I'm just pointing
out some potential problems. I have a few more up my sleeve but I want
to test them before commenting.
Regards,
Jim