On Jun 14, 2013, at 3:43 PM, Glyph <gl...@twistedmatrix.com> wrote:

> I think that we can satisfy all of these requirements by expanding the notion 
> of an "identifier" a bit.

In our last episode, I mentioned that an identifier should have several 
properties:

human readable (and potentially non-unique) for ease of writing messages and 
reading them to get a general sense of the behavior of a system
globally unique (and potentially hard to read) for ease of filtering and 
monitoring
long-term (and potentially unique to humans) for ease of writing tools to deal 
with particular messages, whose exact details may change over time

I think we can achieve all of these things by making there be multiple ways to 
identify a log message.  Here's how that could work:

When you are first writing a log message, just do this, as you usually would; 
toss off some crummy english message:

log.info("hey check it out I just {verb}ed a {noun}", verb=request.method, 
noun=request.uri)

Asking the author to write a log message with more detail and care than this is 
basically just a recipe for disappointment :).

Later, someone might want to instrument these messages.  We've all agreed that 
re.findall("hey check it out.*") is a crummy way to do this.  The obvious thing 
would be something like this:

tapEventsLike(myObserver, publisher, log_format="hey check it out I just 
{verb}ed a {noun}")

That's fine to start with, of course, but it's not very resilient to changes in 
the code; the developer might later change this line to read:

log.info("hey check it out I just {verb}ed a {noun} and got a {result}", 
verb=request.method, noun=request.uri, result=request.response.status)

Now the log tap is broken, and there's no reasonable way to fix it; the format 
string doesn't have a place that it could go.

But, the 'log' in the above example is a logger object associated with a class 
or module, and so has 3 persistent pieces of information it can use to build a 
long-term identifier that the event tap can use.  We could reflect that in an 
API like this:

from example.module import log
tapEventsLike(myObserver, publisher,
              log_id=log.idForOriginalFormat("hey check it out I just {verb}ed 
a {noun}"))

which could be computed offline, and then specified like this, to be even more 
resilient to API changes:

tapEventsLike(myObserver, publisher, 
log_id='604e45b5-2d41-5ba5-9ae1-7205163c3e3f')

That string, of course, is computed like this:

>>> import uuid
>>> from urllib import urlencode
>>> uri = "http://twistedmatrix.com/ns/log/?"; + urlencode(dict(format="hey 
>>> check it out I just {verb}ed a {noun}", module="example.module"))
>>> uuid.uuid5(uuid.NAMESPACE_URL, uri)
UUID('604e45b5-2d41-5ba5-9ae1-7205163c3e3f')

(Although if we decide we like this system, we could of course go with a 
helpful command-line tool to generate these for you.)

As you evolve your codebase to take logging more seriously, you could change 
the original 'log.info' call to instead look like this:

verbANoun = log.declare(
    "hey check it out I just {verb}ed a {noun} and got a {result}",
    log_id='604e45b5-2d41-5ba5-9ae1-7205163c3e3f',
)
# ...
verbANoun(verb=request.method, noun=request.uri, result=request.response.status)

and then monitoring can be accomplished via public API:

from example.module import verbANoun
verbANoun.tap(myObserver, theLogPublisher)

So, after some evolution, our message has 3 identifiers:

the original format string, which was easy to come up with on the spot, and 
need not be unique
a UUID, that was automatically generated but can be carried around to allow for 
cleaning up the codebase and the message without breaking any code doing 
monitoring
an API name that the logger and the logee can use to agree on the semantics of 
a particular message

Best of all, you can get this even for the plethora of existing log messages 
within Twisted that aren't even using structured values, as we evolve them to 
have more metadata.  You also now have the ability to localize the format 
string, if you want to start presenting these log messages in contexts other 
than log files for system administrators.

Thoughts?

-glyph

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

Reply via email to