Ceki Gülcü <[EMAIL PROTECTED]> wrote on 01/24/2005 11:06:21 AM:

> Richard Sitze wrote on 2005-01-21 20:07:49:
> 
>  >The purpose of logging is to capture enough state of a system, so that
>  >when you identify *where* the error occurs in the logs, you have
>  >information that can help you identify the cause of the error.
>  >Class/method 'boundaries' are convenient for many reasons, not least of
>  >which is that in a well designed [dare I say OO?] modularized software
>  >component the methods are [typically] relatively short and
>  >"functionally defined" by their parameters.  Stack traces for
>  >exceptions [if you HAVE an exception] can identify a method, but don't
>  >capture the state of the system in the same way that an entry log
>  >entry [eeks] would.  Entry/exit help you identify the state of the
>  >system at the time a fault occurs, and in many cases can identify the
>  >fault itself.
> 
> In other words, you are advocating the use of logging as a sort of
> persistent debugger. Clearly, a debugger can provide a perfectly
> detailed view of the system, down to the register level. However, no
> one usually cares about that much detail. Not only is that level of
> detail distracting, it will noticeably slow down the application and
> hog megabytes of disk space. Is indiscriminate logging IBM's secret
> justification for ever bigger and faster machines?

1. It's not indiscriminate logging.  That's why we call it tracing, and 
that's why the logging impls [such as Log4J] allow us to turn it on/off 
[a small sketch of that follows, after point 3].

2. The fact of the matter is that in many corporate operational 
environments, production-level [that's *not* development] systems are 
guarded carefully.  Installing additional components [debuggers], never 
MIND running them, is not an option.  When one fails to duplicate a 
problem in a development environment, customers *expect* us to be able to 
glean information from production systems that can demonstrate the 
problem.

3. Are your pokes/jabs at me for being an IBM employee necessary?  I've 
spent 2/5 of my career at IBM.  When I exceed 1/2, you come on back and 
I'll listen to you more seriously.
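
Back to point 1: here is a rough sketch of what "turn it on/off" looks 
like in code.  The class and package names are mine, purely illustrative, 
not from any real product.  With JCL over Log4J, the trace calls are 
guarded, so when tracing is disabled in the configuration the cost is a 
boolean check.

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class OrderProcessor {                        // hypothetical component
    private static final Log log = LogFactory.getLog(OrderProcessor.class);

    public void process(String orderId) {
        if (log.isTraceEnabled()) {                  // near-zero cost when tracing is off
            log.trace("ENTRY process(orderId=" + orderId + ")");
        }
        // ... real work ...
        if (log.isTraceEnabled()) {
            log.trace("EXIT process(orderId=" + orderId + ")");
        }
    }
}

Whether those statements produce output is a deployment decision, made in 
the Log4J [or other impl] configuration, not in the code.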

My goal is to be an advocate of the technologies and techniques I see that 
can benefit the community.  I *will* argue for what I believe to be in the 
best interest of the development community.  And when the decisions are 
made, I will support them.  That's the stance I take with my employer: 
defending open-source decisions that I may not agree with.

That I am tainted by my experience, environment, etc. is a fact of life, 
and part of the process.  It's the strength of open source development.  
It's why we have developers from many backgrounds making up our community.  
Why do you want to make that a negative?

> 
>  >Likewise, errors can be caused by earlier events... the entry/exit
>  >flow information can help identify the factors related to such, again
>  >based on convenient boundaries in the code.
> 
> Theoretically speaking, there is no denying that as the quantity of
> information about the system increases, so does the probability of
> finding the cause of a given error. But where do you stop?

I claim that entry/exit at class/method boundaries is reasonable, along 
with other best practices I've described in other notes.

It's easily identifiable, it's convenient, it's easy to automate [AspectJ 
or other tooling], and more importantly it's *useful*.  Should it stop 
there?  No.  Is it a reasonable, and easily understood, starting point for 
instrumenting code with logging?  Yes.

>  >I do agree that with some serious planning [thought] and effort, there
>  >may be more helpful bits of information to log in some cases, but if
>  >experience shows anything it is that these are typically identified
>  >from hindsight in many components.  We don't always *have* the luxury
>  >of hindsight.  In these cases, the more information logged the better.
> 
> Let me present this from a different angle.
> 
> Logging is about fixing the next run, not the current one. The current
> one failed, remember? So, in some sense, you always have the luxury
> of hindsight, almost by construction.
> 
> You can rely on log files with tons and tons of noise, or you can look
> at less detailed logs which still provide a view of the big
> picture. Armed with the big picture you can try to reproduce the
> problem on your side. Once you are able to reproduce the error, you can
> fix it.

The more you understand about the macro picture, and where a problem lies, 
the more you want information on the micro picture in that area.

I've personally been in situations where the errors could not be 
duplicated "in the lab".  Differences in hardware, and in the particular 
case I have in mind, even different revisions of firmware on otherwise 
similar systems, introduce variables that are not always easily 
understood.

>  >Entry/exit logging is a best practice that is *easily* understood by
>  >developers; it provides a minimal set of information that can offer a
>  >significant improvement over sporadic logging within the code.
>  >
>  >Now... if you feel that such methods are not appropriate for the
>  >'general' case... while I disagree, I won't argue *too* strongly... but
>  >I would point out that we have requested these API's under the guise
>  >of 'Enterprise Logging' services ;-).
> 
> Now that you mention it, one of the reasons which got me working on log4j
> and its filtering mechanisms was the thousands of lines of garbage
> generated by the logs of an IBM product, which will go unnamed
> here. You seem to approach logging from a different perspective
> altogether:
> 
>    Dump as much information as you can, worry about sorting through
>    the mess later.
> 
> Is that a fair description?

We are both skilled at lacing our comments with words designed to 
accentuate the negative. :-)

The key is filtering, being able to turn on/off the information you 
believe you need to identify a problem.  The notion of being able to 
"drill down" by turning on combinations of levels and categories, 
introduced by Log4J, is excellent.  It's one of the reasons many 
developers chose to use Log4J as the impl. under JCL.
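
As a rough example of what I mean by drilling down [the category names 
here are hypothetical], a Log4J configuration along these lines keeps the 
system quiet except for the one area under investigation:

# log4j.properties -- quiet by default, verbose only where we are looking
log4j.rootLogger=WARN, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=server.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d [%t] %-5p %c - %m%n

# drill down: full detail for the suspect area only
log4j.logger.com.example.orders=DEBUG

Turning that last line on or off [or adding others like it] is how you 
widen or narrow the view without touching the code.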

That said, filtering doesn't do any good unless there is something to 
expose.  Conversely, I don't care *how* little you expose: given a 
sufficiently large system and a sufficiently long runtime, a log will 
become "a mess".

By having entry/exit points identified, it is easy to capture flow 
through a system that is *in sync* with the other details presented [what 
I'm referring to is that yes, there are profiling tools that can capture 
flow, but it's very convenient to have that interlaced with the log... and 
no, timestamps are not always sufficient to sort properly].  This allows 
you to walk the flow, and focus on the more detailed information [as long 
as it's in the logs].
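
To make the "walk the flow" point concrete [again hypothetical names, 
building on the OrderProcessor sketch earlier in this note]: the detail a 
method logs sits between its own ENTRY/EXIT markers and those of its 
caller, so the per-thread ordering of the log stream gives you the call 
flow even when timestamps collide.

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class OrderService {                          // hypothetical caller
    private static final Log log = LogFactory.getLog(OrderService.class);
    private final OrderProcessor processor = new OrderProcessor();

    public void submit(String orderId) {
        if (log.isTraceEnabled()) {
            log.trace("ENTRY submit(orderId=" + orderId + ")");
        }
        log.debug("validating order " + orderId);    // detail, interlaced with the flow
        processor.process(orderId);                  // its ENTRY/EXIT nest inside ours
        if (log.isTraceEnabled()) {
            log.trace("EXIT submit(orderId=" + orderId + ")");
        }
    }
}

Walking the nested ENTRY/EXIT pairs for one thread reconstructs the flow; 
the detailed messages then hang off the right spot in that flow.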

With respect to the "big mess", that's a technical problem.  Tooling [grep 
is GREAT] helps manage this.

In addition, IBM has previewed [on alphaWorks] Eclipse-based tooling for 
log file analysis.  This is a reasonable example of one way to "manage 
the mess" of log files.

     http://www.alphaworks.ibm.com/tech/logandtrace

I'm sure there are other tools and techniques out there.

My Point: It's important to have the information, and it's important to 
have the tools.

> 
> -- 
> Ceki Gülcü
> 
>    The complete log4j manual: http://www.qos.ch/log4j/

*******************************************
Richard A. Sitze
IBM WebSphere WebServices Development
