Ceki Gülcü <[EMAIL PROTECTED]> wrote on 01/24/2005 11:06:21 AM:

> Richard Sitze wrote on 2005-01-21 20:07:49:
>
> >The purpose of logging is to capture enough state of a system, so that
> >when you identify *where* the error occurs in the logs, you have
> >information that can help you identify the cause of the error.
> >Class/method 'boundaries' are convenient for many reasons, not least of
> >which is that in a well designed [dare I say OO?] modularized software
> >component the methods are [typically] relatively short and "functionally
> >defined" by their parameters. Stack traces for exceptions [if you HAVE an
> >exception] can identify a method, but don't capture the state of the
> >system in the same way that an entry log entry [eeks] would. Entry/exit
> >help you identify the state of the system at the time a fault occurs, and
> >in many cases can identify the fault itself.
>
> In other words, you are advocating the use of logging as a sort of
> persistent debugger. Clearly, a debugger can provide a perfectly
> detailed view of the system, down to the register level. However, no
> one usually cares about that much detail. Not only is that level of
> detail distracting, it will noticeably slow down the application and
> hog megabytes of disk space. Is indiscriminate logging IBM's secret
> justification for ever bigger and faster machines?

1. It's not indiscriminate logging. That's why we call it tracing, and that's why the logging impls [such as Log4J] allow us to turn it on/off.

2. The fact of the matter is that in many corporate operational environments, production-level [that's *not* development] systems are guarded carefully. Installing additional components [debuggers], never MIND running them, is not an option. When one fails to duplicate a problem in a development environment, customers *expect* us to be able to glean information from production systems that can demonstrate the problems.

3. Are your pokes/jabs at me for being an IBM employee necessary? I've spent 2/5 of my career at IBM. When I exceed 1/2, you come on back and I'll listen to you more seriously.

My goal is to be an advocate of the technologies and techniques I see that can benefit the community. I *will* argue for what I believe to be in the best interest of the development community. And when the decisions are made, I will support them. That's the stance I take with my employer: defending open-source decisions that I may not agree with.

That I am tainted by my experience, environment, etc., is a fact of life.. and part of the process. It's the strength of open source development. It's why we have developers from many backgrounds make up our community. Why do you want to make that a negative?

> >Likewise, errors can be caused by earlier events....the entry/exit flow
> >information can help identify the factors related to such.. again based on
> >convenient boundaries in the code.
>
> Theoretically speaking, there is no denying that as the quantity of
> information about the system increases, so does the probability of
> finding the cause of a given error. But where do you stop?

I claim that entry/exit at class/method is reasonable.. along with other best-practices I've described in other notes. It's easily identifiable, it's convenient, it's easy to automate [AspectJ or other tooling], and more importantly it's *useful*.
Should it stop there? No. Is it a reasonable, and easily understood, starting point for instrumenting code with logging? Yes.

> >I do agree that with some serious planning [thought], and effort that
> >there may be more helpful bits of information to log in some cases, but if
> >experience shows anything it is that these are typically identified from
> >hindsight in many components. We don't always *have* the luxury of
> >hindsight. In these cases, the more information logged the better.
>
> Let me present this from a different angle.
>
> Logging is about fixing the next run, not the current one. The current
> one failed, remember? So, in some sense, you always have the luxury
> of hindsight, almost by construction.
>
> You can rely on log files with tons and tons of noise, or you can look
> at less detailed logs but which still provide a view of the big
> picture. Armed with the big picture you can try to reproduce the
> problem on your side. Once you are able to reproduce the error, you can
> fix it.

The more you understand about the macro picture, and where a problem lies, the more you want information on the micro picture in that area.

I've personally been in situations where the errors could not be duplicated "in the lab". Differences in hardware, and in the particular case in my mind, even to the extent of different revisions of firmware on otherwise similar systems, introduce variables that are not always easily comprehended.

> >entry/exit logging is a best-practice that is *easily* understood by
> >developers, it provides a minimal set of information that can offer a
> >significant improvement over sporadic logging within the code.
> >
> >Now... if you feel that such methods are not appropriate for the 'general'
> >case... while I disagree, I won't argue *too* strongly.. but I would point
> >out that we have requested these API's under the guise of 'Enterprise
> >Logging' services ;-).
>
> Now that you mention it, one of the reasons which got me working on log4j
> and its filtering mechanisms, was the thousands of lines of garbage
> generated by the logs of an IBM product, which will go unnamed
> here. You seem to approach logging from a different perspective
> altogether:
>
>   Dump as much information as you can, worry about sorting through
>   the mess later.
>
> Is that a fair description?

We are both skilled at lacing our comments with words designed to accentuate the negative. :-)

The key is filtering, being able to turn on/off the information you believe you need to identify a problem. The notion of being able to "drill down" by turning on combinations of levels and categories, introduced by Log4J, is excellent. It's one of the reasons many developers chose to use Log4J as the impl. under JCL.

That said, filtering doesn't do any good unless there is something to expose. Conversely, I don't care *how* little you expose, given a sufficiently large system and a sufficiently long runtime, a log will become "a mess".

By having enter/exit methods identified, it is easy to capture flow through a system that is *in-sync* with the other details presented [what I'm referring to is that yes, there are profiling tools that can capture flow, but it's very convenient to have that interlaced with the log... and no, timestamps are not always sufficient to sort properly]. This allows you to walk the flow, and focus on the more detailed information [as long as it's in the logs].

With respect to the "big mess", that's a technical problem. Tooling [grep is GREAT] helps manage this. In addition, IBM has previewed [alphaworks] Eclipse based tooling for log file analysis. This is a reasonable example of one way to "manage the mess" of log files.

http://www.alphaworks.ibm.com/tech/logandtrace

I'm sure there are other tools and techniques out there.

My Point: It's important to have the information, and it's important to have the tools.
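
To make the entry/exit and filtering point concrete, here is a minimal sketch, not a definitive implementation. It assumes java.util.logging from the JDK rather than Log4J or JCL, only so that it runs without extra jars; the class TraceDemo, the method divide, and the logger name "demo.trace" are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo class: entry/exit tracing that can be switched on and off.
class TraceDemo {

    // A short method "functionally defined" by its parameters,
    // instrumented at its entry and exit boundaries.
    static int divide(Logger log, int a, int b) {
        log.entering("TraceDemo", "divide", new Object[] {a, b});
        int result = a / b;
        log.exiting("TraceDemo", "divide", result);
        return result;
    }

    // Calls divide() with the logger set to the given level and returns
    // the trace records that actually made it through the filter.
    static List<String> run(Level level) {
        Logger log = Logger.getLogger("demo.trace"); // assumed logger name
        log.setUseParentHandlers(false);             // keep the console quiet
        List<String> captured = new ArrayList<>();
        Handler handler = new Handler() {
            @Override public void publish(LogRecord r) {
                if (isLoggable(r)) {
                    captured.add(r.getSourceMethodName() + " " + r.getMessage());
                }
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        handler.setLevel(Level.ALL);
        log.addHandler(handler);
        log.setLevel(level); // the on/off switch for tracing
        try {
            divide(log, 6, 3);
        } finally {
            log.removeHandler(handler);
        }
        return captured;
    }

    public static void main(String[] args) {
        System.out.println("tracing off: " + run(Level.INFO));
        System.out.println("tracing on:  " + run(Level.FINER));
    }
}
```

With the logger at INFO nothing is recorded; at FINER the same call yields one entry record and one exit record, in order, interleaved with whatever else that logger writes. Log4J's levels and categories offer the same kind of switch per category.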
>
> --
> Ceki Gülcü
>
>   The complete log4j manual: http://www.qos.ch/log4j/

*******************************************
Richard A. Sitze
IBM WebSphere WebServices Development