Re: Solicitation for logging/debugging requirements

2010-03-29 Thread Patrick Hunt

Take a look at the logging page in the docs:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperInternals.html#sc_logging

Some good guidelines in there. Basically we log things at INFO level
that are interesting/informational but not logged so frequently that
they fill the log. WARN is for things that are bad but that we can
handle (like network connectivity failure). ERROR is generally for
things we don't expect and are unlikely to be able to handle. FATAL
means really bad: we shut down the server. Many end users log only at
WARN level or higher in production, so we typically err on the side of
WARN for issues (so that we have a shot at debugging after the fact).
Over time, as we gain confidence in production environments, we've been
pushing more things that were WARN down to INFO.
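
As a rough illustration of why the WARN threshold matters, here is a
minimal sketch. It uses Python's standard logging module as a stand-in
for log4j; the logger name, the messages, and the helper names
(ListHandler, run_with_threshold) are all made up for the example, and
Python calls the FATAL level CRITICAL.

```python
import logging


class ListHandler(logging.Handler):
    """Keeps emitted records in memory so the threshold's effect is visible."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(f"{record.levelname} {record.getMessage()}")


def run_with_threshold(threshold):
    """Emit one message per level and return what survives the threshold."""
    logger = logging.getLogger(f"zk.sketch.{threshold}")  # hypothetical name
    logger.setLevel(threshold)
    logger.propagate = False
    logger.handlers.clear()
    handler = ListHandler()
    logger.addHandler(handler)

    # Hypothetical messages, one per category described above.
    logger.info("session established")                          # informational
    logger.warning("connection to peer lost, retrying")         # bad, handled
    logger.error("unexpected exception processing request")     # unexpected
    logger.critical("unrecoverable state, shutting down")       # FATAL
    return handler.lines


production = run_with_threshold(logging.WARNING)  # typical production setting
verbose = run_with_threshold(logging.INFO)
```

With the WARNING threshold the INFO message is dropped, so anything
needed for after-the-fact debugging has to be logged at WARN or above.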


I fixed a number of JIRAs for 3.3 related to logging. In particular I
cleaned up the client session logging significantly. The most fertile
area right now for cleaning up logging is the quorum code. That code in
particular has trouble providing sufficient information to debug
error conditions. You can easily see this by starting an ensemble of
more than one machine and killing one or more of the servers. There
are many places where the logging is insufficient (e.g. "got vote",
which doesn't say what the vote was or what the effect of such a vote
is). Having improved logging in this area would really help.
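
To make the "got vote" point concrete, here is a minimal sketch of a
message that says what the vote was and what it changed. Everything in
it is hypothetical: the Vote fields, the describe_vote helper, and the
comparison rule are simplified placeholders, not the actual quorum code
or election algorithm.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    # Hypothetical fields, chosen only for illustration.
    sender_id: int
    proposed_leader: int
    proposed_zxid: int


def describe_vote(vote, current_leader, current_zxid):
    """Build a log message stating the vote's content and its effect."""
    # Simplified placeholder rule: prefer the higher (zxid, leader id) pair.
    if (vote.proposed_zxid, vote.proposed_leader) > (current_zxid, current_leader):
        effect = f"adopting leader {vote.proposed_leader}"
    else:
        effect = f"keeping current proposal {current_leader}"
    return (
        f"Got vote from server {vote.sender_id}: "
        f"leader={vote.proposed_leader}, zxid=0x{vote.proposed_zxid:x}; {effect}"
    )


message = describe_vote(Vote(sender_id=2, proposed_leader=2, proposed_zxid=0x10),
                        current_leader=1, current_zxid=0x5)
```

The point is only that a single line carries the sender, the content of
the vote, and the resulting decision, so the log can be read without the
source open.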


Try searching the JIRA
https://issues.apache.org/jira/browse/ZOOKEEPER
for open/closed issues mentioning "log4j", "logging", or "log" for further insight.

Patrick

Re: Solicitation for logging/debugging requirements

2010-03-29 Thread Benjamin Reed
awesome! that would be great ivan. i'm sure pat has some more concrete 
suggestions, but one simple thing to do is to run the unit tests and 
look at the log messages that get output. there are a couple of 
categories of things that need to be fixed (this is in no way exhaustive):


1) messages that have useful information, but only if you look in the
code to figure out what they mean. there are some leader election
messages that fall into this category. it would be nice to clarify them.

2) error messages that really aren't errors. when shutting
down there are a bunch of errors that are expected, but still logged,
for example.

3) misclassified error levels
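
one way to go hunting for category 2 in the unit test output could look
like the sketch below. it is only a sketch: the line format, the sample
lines, and the "shutting down" marker are assumptions made up for the
example, so adjust them to whatever the real logs contain.

```python
import re
from collections import Counter

# Assumes the level appears as a standalone word on each log line.
LEVEL_RE = re.compile(r"\b(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\b")


def summarize(lines):
    """Count lines per level and collect ERROR/FATAL lines seen after shutdown starts."""
    counts = Counter()
    errors_after_shutdown = []
    shutting_down = False
    for line in lines:
        match = LEVEL_RE.search(line)
        if not match:
            continue  # e.g. stack trace continuation lines
        level = match.group(1)
        counts[level] += 1
        if "shutting down" in line.lower():  # assumed shutdown marker
            shutting_down = True
        elif shutting_down and level in ("ERROR", "FATAL"):
            errors_after_shutdown.append(line)
    return counts, errors_after_shutdown


# Made-up sample lines, not real ZooKeeper output.
sample = [
    "2010-03-29 10:07:01,123 - INFO  [main:Example] - starting",
    "2010-03-29 10:07:02,456 - WARN  [main:Example] - connection refused",
    "2010-03-29 10:07:03,000 - INFO  [main:Example] - shutting down",
    "2010-03-29 10:07:03,100 - ERROR [main:Example] - socket closed",
]
counts, suspects = summarize(sample)
```

the lines collected in suspects are candidates for errors that are
expected during shutdown and probably should not be logged as errors.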

welcome aboard!

ben

On 03/29/2010 10:07 AM, Ivan Kelly wrote:

Hi,

I'm going to be using ZooKeeper quite extensively for a project in a
few weeks, but development hasn't kicked off yet. This means I have
some time on my hands and I'd like to get familiar with zookeeper
beforehand by perhaps writing some tools to make debugging problems
with it easier so as to save myself some time in the future. Problem
is I haven't had to debug many zookeeper problems yet, so I don't know
where the pain points are.

So, without further ado,
- Are there any places where logging is deficient and sorely needs
improvement?
- Could current logs be improved any amount or presented in a more
readable fashion?
- Would some form of log visualisation be useful (for example in
something approximating a sequence diagram)?

Feel free to suggest anything which the list above doesn't allude to
which you think would be helpful.

Cheers,
Ivan