Re: Logging JEP

Fredrik Arvidsson Sun, 18 May 2014 14:52:20 -0700

Hi Kirk

Please see my comments in-lined below.


This starts to become a wall of text. TL; DR; warning...

On 2014-05-18 19:57, Kirk Pepperdine wrote:

Hi Fredrik,

comments below…

Regards,
Kirk

My primary objection with the current specification is that the JEP is both 
over-reaching and restrictive. It suggest taxonomy a closed hierarchical 
taxonomy instead of a more open and flexible arrangement. The JEP overreaches 
in that it defines or suggests extremely limited levels which are all 
completely devoid of context or context that allow one to pull out certain 
pieces of information in exclusion to other pieces of information. It also 
allows for developers to make very arbitrary decisions as to what to log at 
each “levels”. Where as PrintAdapativeSizingPolicy gives a very clear 
indication as what will be logged, G1.logging=WARNING is… well... I don’t know 
what is and is not suppose to be included. In addition, maybe 
PrintAdaptiveSizingPolicy is not interesting but PrintTenuringDistribution is. 
And lets say they are both at WARNING log level. Even better, lets say that 
with CMS you get PrintTenuringDistribution at FINE and with G1 you get it at 
FINEST. The point of all this is the logging framework is complicating 
questions that are none of it’s concern. So I would suggest that the logging 
framework get out of the business of trying to predefine categories of logging, 
especially hierarchical categories of logging.

With the hierarchical logger design proposed in the JEP I cant see why you 
could not get the behaviour you're asking for. By targeting sub/child loggers 
it is possible to enable/disable arbitrary levels in the tree. And at the same 
time you could enable logging for lets say all GC related activity by enable 
the top most logger (root). It is possible to create sparse configurations 
including/excluding single loggers.

I think that pub/sub is a much simpler model that is easier to understand than 
what you’ve described.

I don't think the difference is that big. Enabling logging on a loggeris essentially the same as subscribing to a topic.

The logger hierarchy does not need to be 'component oriented' even if I think 
it makes the most sense.
I think you put far too much emphasis on log levels.

Funny but I think the spec puts far too much emphasis on log levels.

Loggers and Adapters are the most important things in the JEP. Loglevels are just to a way to select the verbosity of the output.

Log levels should not be used to categorise information.

In practice they do categorize information because there is no other practical 
way to categorize it. I set verbosity to debug because I expect all debug 
messages to show up.

It should be used to select the verbosity of the log data output.

I guess to understand this further I’d need to see an example of how it would 
be used. For example, this is a single G1GC event

1.047: [Full GC (System.gc()) 17M->3728K(13M), 0.0212570 secs]
    [Eden: 18.0M(24.0M)->0.0B(5120.0K) Survivors: 0.0B->0.0B Heap: 
17.2M(256.0M)->3728.7K(13.0M)]

How would you suggest that the JVM use the logging framework to log this 
message?

This information should probably be logged in the 'info' level since thepurpose of the information is to provide information. Maybe 'debug'level could be used to log even more information about how the numberswere computed.

We have only five levels of verbosity in the JEP (plus disabled). I think that 
is a well balanced number.

I don’t know how to respond to this without risking sounding inflammatory and 
I’d really like not like to be inflammatory. ;-)

Well, you just did ;)

By pre defining all available loggers(categories) in the source code we 
actually give the user the possibility to enumerate all those, and see what 
different type of information that are in there.

Why do you need to know levels prior to iterating? Pub/Sub does not come with 
this requirement.

If you are not to get a lot of noise/unwanted information you would haveto know what topics to subscribe to. I can't really see the differencehere. its like saying that you want news, and then start subscribing toall newspapers. You will get news, but a lot of it you're not interested in.

On top of that we get compile time checks of that its valid.
Your proposal would require any module implementer to carefully document all 
'tags' used in some place in some way. The latter would not make it easier for 
users, it would make it even worse than it is today.

What I read here, and maybe I’m reading it wrong, is that the log levels are 
used to categorize messages and that simply using your predefined tags 
alleviates developers from carefully documenting the information that is 
included with each tag/level. This is exactly one of the bigger problems that I 
see in the field. Developers drop things into a log level and they don’t 
document what comes with each level. You can’t relieve developers of the 
responsibility of documenting what will happen when I turn the logging level to 
debug (or some other level).

Log levels aren't used to categorize anything They are used to selectverbosity of the output. Loggers are what is used to categorize theinformation/messages. Most of the GC logging discussed would as I see itend up in the 'info' log level in a couple of loggers, maybecorrsponding todays -XX:+PrintXXX. Debug logging and even more verboseinformation would be put in the debug and trace level. The latter two isnothing normally used to get GC data/instrumentation into log files.

Instead I believe it is more useful for the framework to support more of a 
pub/sub model where each of the frameworks or components in the JVM are free to 
devise a logging scheme or taxonomy that makes sense for that component.

This would not be something targeted by this JEP. The JEP title is: *Unified* 
JVM Logging. what you describe is similar to what we have today. All modules 
have their own way of doing logging. This whole discussion is about addressing 
that problem, not making it a rule. I have gotten heaps of feedback from 
developers asking me to bring an end to this arbitrary 'log as you like' 
situation they live in today.

I’m not advocating what is in the JVM today. Unifying the logging scheme makes 
sense. It allows for better standardization of tooling, just like we have with 
JMS client technology.

Well if we leave it up to each module to devise a logging scheme and thetaxonomy for its logs, in what way would this make the logging more unified?

For example, Twitter provides a communication platform. By convention we all 
use @ and # to specify targets and subjects. Imagine if Twitter limited all # 
tags to be Fine, Finer and Finest. It’s a thought that is so crazy you wouldn’t 
even consider it. Twitter does *not* try to define what those targets and 
subjects are as there is not reasonable way in which it can. So it’s not an 
accident that these decisions are left up to the users to decide. I’m not 
suggesting the JVM logging framework copy Twitter. I’m merely  using it as an 
example of an open and flexible logging system.

Again I think you are confusing log levels (verbosity) with loggers. Log levels 
are not depicting information topics, or the actual source of the information, 
log levels should be used to filter the amount of information about one topic.

Again, rightly or wrongly this is the way that loggers get used because 
categorization is not part of the design.

If we leave the log levels out for a moment. I cant see the realdifference between using loggers and @/# to categorise the loggedinformation. I only see positive things ordering the information in ahierarchy where possible and desirable.

I don't think Twitter is a good analogy in this case. If I got as much unwanted 
and unrelated noise/crap in my log output as on Twitter I would be a very 
unhappy user.

No analogy works perfectly but there are aspects of the analogy that I do 
believe are valid. If I subscribe to listen anything you pump out, I see it. If 
I don’t want to see it any more, I can unsubscribe. Twitter hash tags are 
interesting in that twitter is not predefining them. Somehow I have to document 
and publish the fact that I’m publishing information out on an hashtag. One 
other aspect that I do like is limiting to 140 characters (which would do 
wonders to improve logging) but that is an aside and I don’t really want to get 
bogged down in this analogy if it doesn’t happen to work for you.

140 characters per line is an exciting idea. I agree it would improvethe logging very much.

I spoke to two people about logging while at GeeCon last week. The first was 
Peter Lawrey, the author of JChronicle. JChronicle is a messaging framework but 
Peter has written a Log4J adaptor for it. I’ve been begging him *not* to 
release it as I think it would comply obscure the excellent work that put into 
JChronicle in making it a truly high throughput low latency messaging. I think 
he’s been thinking about a better interface for it.

It would be very interesting to have feedback directly from Peter. I don't know 
the JChronicle product so I cant really comment. What I do know is that Log4J 
is a very liked and well used framework, much too big, dynamic and complex for 
our use. But there was probably a reason for Peter trying to provide his users 
with that functionality. People are using it, and they love it.

I know Log4J is very well liked until you get bit in a$$ by it.. and I run into 
projects on a regular basis that have gotten bitten very hard. I include an 
entire section of my performance tuning course to the subject and I have to say 
that a number of organizations have changed how they log as a result. BTW, 
being in the majority just means you’re in the majority, it doesn’t mean the 
majority is correct. I shan’t bore you with historical evidence to that point 
;-)

Sure, you're probably right about some cases suffer from this design.Question is if the JVM logging would be better off with somethingradically different from the design I present in this JEP? We are notaiming to be non-intrusive with this framework. There are other moreviable solutions to that problem.

I also spoke to Michael Feathers who gave the keynote where he focused on no 
error handling in an application. We both agreed on the egregious state of 
application logging. It’s so bad that I’m constantly running into applications 
where the primary bottleneck is in how logging has been implemented. One of the 
key problems is the use of this hierarchal log levels with no guidelines as to 
what should appear at each level. Thus to get any information out of the 
application, one must eat an entire elephant. It’s a deplorable situation.. and 
while one can blame developers I could also say that frameworks should guide 
developers to good decisions, not pull them away from them.

As I mentioned before I think we are addressing the 'Elephant' issue by using 
sub loggers. Again, and I'm not trying to be repetitive here, I think you 
confuse log levels as in verbosity with loggers(categories).
I am a bit confused about your comment that you would like to let each module 
decide how and what to log, and then in the above text you say 'log levels with 
no guidelines as to what should appear at each level'. How would your first 
proposal make this easier in any way? Putting the descision on how and what to 
log in the hands of tens (in worst case hundreds) of individual developers 
would not make logging better I think. It would make it even more sprawling and 
hard to use. We would get as many logging standards as there are developers 
multiplied with the number of modules using logging.

Indeed but if you take on that problem this framework will only serve to make 
the problem worse. Log4J and ilk all use ideas that are old and I think 
disproven by a number. Unfortunately they currently have momentum and momentum 
is a difficult thing to reverse. Also re-implenting the past isn’t really 
driving the platform forward.

Old ideas are not necessarily bad. If it works, why try to mend it?

Did I mention no, support for or mention of binary in the spec. I’ve nothing 
against text based logging.. except the fact that it’s often information 
sparse. You certainly should be promoting formats that are as information dense 
as it make sense for your environment/requirements and there is often nothing 
more information dense than a binary format.

This is an interesting note. I have not gotten this requirement from any of the 
many hotspot developers I interviewed during my information gathering.
Personally I am not sure this type of information dense logging should be a 
part of this JEP.

Agreed, I don’t believe this JEP should be in the business of specifying 
payload..  but the text in the JEP would seem to prohibit it.

That should be fixed in the JEP text. It should say something like thatin a first step human readable data is the goal.

There are other mechanisms already implemented that does that kind of stuff way 
better, like JEP-167 :)

Certainly it would be interesting to see how to tie this JEP with 167 as I 
think there is a lot of synergy.

Regards,
Kirk

Cheers
/Fredrik

Re: Logging JEP

Reply via email to