Nic Ferrier wrote:
Jim Gallacher <[EMAIL PROTECTED]> writes:
Here are some further things to consider if anyone wants to persue it. Consider the following code:

import logging
from mod_python import apache
from proposed_mp_logging_module import ApacheLogHandler

def handler(req)
    req.content_type = 'text/plain'
    log = logging.getLogger('request')
    hdlr = ApacheLogHandler(req)
    log.addHandler(hdlr)
    log.setLevel(logging.INFO)
    msg =  'Its all good'
    log.info('%s' % msg)
    req.write('check the logs for "%s"' % msg)
    apache.OK

All fine and dandy. But isn't logger a singleton? So each time a request is processed we'll be adding a reference to the request object, which will never get garbage collected and result in a memory leak.


There might be a fault with my design here. To help me understand what
you're saying can you confirm that when you say:

"But isn't logger a singleton"
do you mean:

  "But isn't 'log' a singleton?" (ie: 'log' refers to the instance
  variable in your example)

I meant log but perhaps should have said logging. All log handlers are stored in the module level variable logging.root. Take a look at logging.__init__.py.


Furthermore, you can't depend on the request object being valid once the request processing has completed. At some point request_tp_clear (in requestobject.c) will get called and request->server will be set to NULL. (At least I think this is correct). My gut tells me you'll get some random seg faults.

My gut was right. Your current version segfaults (using mpm-prefork). It might work if you register a cleanup to remove the reference to your logging handler (untested):

req.register_cleanup(log.removeHandler, hdlr)

The problem here is that we are depending on the user to add that line to their code to avoid the segfault. I don't like the idea of including code in mod_python which may result in a segfault, even if the user is warned of the dangers in the docs.


There are only two ways this can happen I think:

1. the user saves the logger for later (maybe in a thread) and tries
   to log after the request has finished

   This would be user error and therefore acceptable IMHO.

   I should point out that this is unlikely to happen since loggers
   tend to be very local objects with small scopes.

Loggers have module level scope. When you add a handler you are adding it to logging.root. In a given process, each call to log.addHandler(hdlr) adds another ApacheLogHandler instance with it's associated request object. Thus you would expect that when you try to log an error you would see multiple messages in the apache log, but mod_python actually segfaults instead, since most request objects (except the current one) are likely invalid. Thus the need to remove the handler at the end of the request as mentioned above.

2. the logging record doesn't get flushed before the request has
   completed so that when it is flushed the correct state is no longer
   available.

   This is also possible, clearly it would be a bug because the
   logging record should be flushed immediately (if that's the
   implementation of the handler)

The problem is not with the current request. That will be ok as long as you are logging your message prior to the end of the request processing phase. The problem is with the references to old requests being held by logging.root.



Also, is there not an assumption that the logger instance is in effect global? So let's say you change the level in one request such as log.setLevel(logging.CRITICAL). In mpm-prefork or mpm-worker models changing the log level will not propagate to the other child processes. I expect most users will find this confusing.


Not sure why you think that.

Which part? - That users will want to change the log level or that they'll get confused? ;) Or the global part?


If mod_python included a module based on my design there would be a
clear understanding that it was specifically for request derived
logging (the most common case).

It's particularly clear from my design because it takes a request as
an argument. You couldn't create one of my log handlers globally for
example.

It wasn't clear to me. The request argument for ApacheLogHandler._init__() is optional and when it's missing you create an instance of ProxyLogThing. ProxyLogThing.log_error calls apache.log_error which suggests to me this could be be used outside of a request.

Also the natural thing to do with logging is to have one per unit of
code.


As I said before I don't think this is as trivial to implement as it might seem at first blush.


I think it has all of the same problems as any other multi-programming
(threads or processes) code. It's true that a lot of users don't
understand those issues but that's not exclusive to logging.

Yep. It sure messes me up, and trying to do things inside of apache just makes everything more complicated.

Just to be clear Nic, I'm not fighting you on the *idea* of including your ApacheLogHandler - I may even be in favour of it. I'm just pointing out some potential problems. I have a few more up my sleeve but I want to test them before commenting.

Regards,
Jim

Reply via email to