Re: [slf4j-dev] Benefits of Message objects?

Joern Huxhorn Fri, 09 Sep 2011 13:52:55 -0700

On 09.09.2011, at 19:11, John Vasileff wrote:

> 
> On Sep 9, 2011, at 6:23 AM, Joern Huxhorn wrote:
> 
>> 
>> On 08.09.2011, at 21:11, John Vasileff wrote:
>> 
>>> I put together the following examples to help net out benefits of 
>>> supporting "Map supplementalData" arguments in Logger methods in addition 
>>> to Message object support.
>>> 
>>> Support for supplementalData does not replace Message objects, but is 
>>> perhaps more of a shortcut for common usage scenarios.  In the end, a 
>>> Message object is always created by either the application writer or SLF4J.
>>> 
>>> 
>>> Scenario #1: Working with a domain object "posting" that implements 
>>> java.util.Map.  A 1.6.2 error log may look like:
>>> 
>>> logger.error("error parsing markdown '{}'", badString, e);
>>> 
>>> but we want to add all details of the "posting" domain object without 
>>> including them in the message.  Using the supplementalData log methods:
>>> 
>>> // just add the data
>>> Logger logger = LoggerFactory.getLogger(MyClass.class);
>>> logger.error(posting, "error parsing markdown '{}'", badString, e);
>>> 
>>> // add the data, but also allow named parameter formatting
>>> Logger logger = 
>>> LoggerFactory.getLogger(MyClass.class).formatWith(NamedParamFormatter.getInstance());
>>> logger.error(posting, "error parsing markdown for posting {id} '{}'", 
>>> badString, e);
>>> 
>>> // just add the data, and format using java.util.formatter
>>> Logger logger = 
>>> LoggerFactory.getLogger(MyClass.class).formatWith(JavaUtilFormatter.getInstance());
>>> logger.error(posting, "error parsing markdown '%s'", badString, e);
>> 
>> One huge problem with this approach is that Logger, for a given name/class, 
>> has to be a singleton. You absolutely must not change its state, otherwise 
>> chaos will happen.
>> 
>> I'll give you just one example:
>> Suppose ThreadA is changing the Logger with formatWith. Then, before ThreadA 
>> executes its log statement, ThreadB is changing the same instance with a 
>> different formatWith call. This will cause ThreadA to log with a wrong 
>> formatting. A mutable Logger would automatically make it non-threadsafe.
>> 
>> I firmly believe that default formatting must not be configurable. Doing so 
>> would just introduce a big pile of problems.
>> 
> 
> I agree 100% that formatting should never be changed once the logger is 
> setup.  And I also see that the same instance of Logger is returned for a 
> given name in, for example, Log4jLoggerFactory.  But if in the end the idea 
> of formatWith() makes for a nice application facing API, I would hate to rule 
> it out prematurely based on what may be implementation details.


I'm not saying that this would be impossible to implement. It just comes with 
an enormous cost that is actually threefold:
- on the SLF4J implementation side, getLogger() would have to return a new 
instance for every call. That instance would need to contain an optional 
formatter that must be configurable on a per Thread base (there's a nasty 
classloader memory leak lurking in the dark at this point, btw). And what 
should happen in case of multiple calls to formatWith()? An Exception?
- all of the above would have a serious impact on the performance.
- you'd need to educate the users about all this. This task is actually 
entirely futile. ;)

> 
> Having said that, I don't really have a strong need for formatWith(...) - 
> java.util.Formatter seems unnecessary to me, so it would just be an added 
> convenience for NamedParameter style formatting.  But I do feel that whether 
> in 2.0, 3.0, or whenever, the ability to add this type of feature without 
> breaking binary compatibility is very valuable.  So I think it is worth 
> netting out, even if the consensus is that the particular feature of 
> overriding the formatter isn't necessary or desired for 2.0.
> 
>>> 
>>> If we used Message objects:
>>> 
>>> Logger logger = LoggerFactory.getLogger(MyClass.class);
>>> 
>>> // just add the data
>>> logger.error(new DefaultMessage(posting, "error parsing markdown '{}'", 
>>> badString, e));
>>> 
>>> // add the data, but also allow named parameter formatting
>>> logger.error(new NamedParamFormattingMessage(posting, "error parsing 
>>> markdown for posting {id} {}", badString, e));
>>> 
>>> // just add the data, and format using java.util.formatter
>>> logger.error(new JavaUtilFormatterMessage(posting, "error parsing markdown 
>>> for posting %s", badString, e));
>>> 
>>> 
>>> Scenario #2: Support for StructuredData.  The StructuredData object may 
>>> augment a log message, or both augment the message and provide the message 
>>> with toString().
>>> 
>>> Using SupplementalData methods, where SD implements Map
>>> 
>>> StructuredData sd = new StructuredData(...);
>>> logger.error(sd);
>>> logger.error(sd, "some log message"); 
>>> 
>>> Using Message object methods, where SD implements Message
>>> 
>>> StructuredData sd = new StructuredDataMessage(...);
>>> logger.error(sd);
>>> StructuredData sd = new StructuredDataMessage(..., "some log message");
>>> logger.error(sd);
>>> 
>> 
>> The idea was to put both the internal structure and the formatting in the 
>> respective Message implementation, with ParameterizedMessage being the 
>> implementation for the default SLF4J formatting. A shortcut exists for this 
>> type of messages, namely the foo(String pattern, Object... args) methods, 
>> since this is the message type encouraged by the logging framework.
>> 
>> It is encouraged since it is a perfect mix of features (placeholder 
>> replacement) and speed (it is *much* faster than JUL and likely faster than 
>> most other implementation). It is also the type of formatting already 
>> documented by the API and accepted by its users.
>> 
>> The 99% of the users will just keep on using the "normal" logging methods 
>> and ignore the newly defined ones. This is the main target we are developing 
>> for.
>> 
>> The foo(Message message) type of methods, however, enable the 1% "pro users" 
>> to implement specific features they need - and the logging framework will be 
>> able to (optionally, you could filter on the Message class) cope with those 
>> arbitrary Message types in a default way or implement specialized handling 
>> if that is required.
>> 
>> Take the StructuredDataMessage ( 
>> http://bugzilla.slf4j.org/show_bug.cgi?id=148 ) as an example. It does not 
>> just define a Message with key-value pairs in a Map<String, String>. 
>> Instead, it is more specific. It is used to create messages that conform to 
>> ( http://tools.ietf.org/html/rfc5424 ). This involves making sure that the 
>> keys must have a maximum length of 32 characters. Such a requirement can't 
>> be met by the abstract logging interface. It is an implementation detail of 
>> RFC5424 Syslog messages.
>> In addition to the Map<String, String>, such a StructuredDataMessage could 
>> implement the whole range of the specification, e.g. it could bypass the 
>> normal logging levels and make use of the Syslog severities instead.
>> 
> 
> In the 99% group, there are some (me!) that would really like to keep 
> everything as is, but also be able to quickly and conveniently add some 
> key/value pair data to the log entry.

But you already have this ability using the MDC. What Ralph requested in #148 
was specific support for the Syslog format as specified by RFC5424

> 
> I was trying to show that this also supports bug 148.  StructuredData could 
> implement Map<String,Object>.  Loggers that understand only 
> Map<String,Object> would at least handle the parts that are exposed by the 
> Map interface.  Loggers that understand StructuredData could provide 100% 
> custom handling, such as creating a Log4j2 StructuredDataMessage from the 
> StructuredData object.  In either case, key length validation, etc. would be 
> handled by the StructuredData class.

But when would it be handled and what happens if the key is too long? Would the 
Logger throw an exception in case of Log4J2 but work in case of Logback? Or 
would the key-length be truncated to 32 chars? This could result in duplicates 
that overwrite each other.
Using Syslog with RFC5424 is a rather specific requirement. I never heard of it 
before Ralph issued that bug report. I have serious doubt that lots of users 
know about it. And what happens if the RFC is superseded by a new one that 
suddenly allows 64 chars? How would you implement this supporting both the 32 
and the 64 version?
If all of this is logic is contained in specific Message implementations then 
it would be a no-brainer to support either one. The documentation of the 
requirements could reside in the Javadoc of the respective Message and everyone 
is happy.
It wouldn't be a very good idea to directly implement the Map interface as this 
interface does not impose any of those restrictions. It would be technically 
possible but semantically not entirely correct.

> I think it would be valuable to introduce supplementalData/namedParameters to 
> the masses, while also providing things like StructuredData to advanced users.

Having messages with placeholders like {variableName} come at a cost.
Would you really like to use it regularly even if it imposes worse performance 
and also forces you to write bigger logging statements? Are you really sure? ;)
I know that it *sounds* nice, initially, but is there really a use-case beside 
"Oh, that would be neat."?

The statement
logger.info("User {}", user);
would (minimally, using an absolutely awfully hacky interface (we'd never 
implement it that way)) look like this:
logger.info("User {user}", "user", user);

A more reasonable implementation would even look like this:
Map<String, Object> parameters = new HashMap<String, Object>();
parameters.add("user", user);
logger.info("User {user}", parameters);

It just creates more noise in the code, IMHO.

> To be clear: I am not suggesting we avoid Message/Entry in favor of 
> supplementalData/namedParameters.  I think the Message concept is great and 
> think we should have both.  Use supplementalData/namedParameters as a quick 
> way to add data to an entry, or use Message/Entry to "take over" all data and 
> formatting of the log entry.
> 
>> One should not forget that someone implementing his own message type could 
>> also define a wrapper around the SLF4J Logger to support efficient creation 
>> of his handcrafted Messages. A special SyslogLogger implementation could 
>> automatically match the Syslog severity to a proper SLF4J level, for 
>> example, while still using the whole logging backend without any 
>> wheel-reinvention.
>> 
>> Another example for specific needs would be end-user-facing localized 
>> messages. Why should I reinvent my own logging layer if I could just use 
>> SLF4J/<insert backend here> for that?
>> 
>> Audit logging is another use case.
>> See http://audit.qos.ch/manual/index.html and 
>> http://audit.qos.ch/apidocs/ch/qos/logback/audit/AuditEvent.html 
>> This could be implemented using a AuditMessage instead while still being 
>> able to use the full range of Logback-Classic appenders but with the ability 
>> to add specific appenders handcrafted for specific tasks like writing them 
>> into a special audit database.
>> 
>> One small, additional thing is the possibility of immutable Message 
>> instances that could be kept and reused if there are no variables involved.
>> 
> 
> I think all of these would be supported - I am trying to build on your 
> proposal.
> 
> For localized messages specifically, there may be a really nice way to handle 
> this with "formatWith(...)" - I'll have to give it some thought.  But in any 
> case, Message/Entry objects could always be used.  In any case, I agree 100% 
> that SLF4J should handle localization, otherwise we would be breaking 
> pluggability of back end loggers.

It wouldn't have to, at least not initially. Putting that logic into other 
Message implementations would mean that we don't have to implement it in SLF4J, 
at least not initially.
This could be added conveniently in, say, SLF4J 2.1 or 2.2. Whenever we have an 
implementation ready that we deem good, stable and mature enough to add it to 
the general API.

>> Concerning "COMPATIBILITY #1" in your previous message:
>> It is absolutely crucial that the original interface is left unchanged so 
>> that both use and implementations of that interface will keep on working as 
>> expected. It also needs to stay 1.4 compatible and logging frameworks that 
>> do not require 1.5 should still just implement the old SLF4J interface. The 
>> new interfaces will be available automatically via wrapper in that case.
>> 
>> If the logging implementation requires Java >= 1.5 anyway (e.g. Logback) 
>> then it would be a good idea to implement the new interface instead, 
>> providing backwards-compatibility by means of a wrapper in the other 
>> direction.
> 
> I think we are in agreement here, and we definitely cannot break < 2.0 Logger 
> implementations.  In 2.0, I think we can enhance org.slf4j.Logger, maintain 
> compatibility, and even provide many of the new features to 2.0 api users 
> that choose to use a < 2.0 logger.

But you can not enhance org.slf4j.Logger without breaking compatibility in at 
least the implementations of that interface.
Further, if you enhance org.slf4j.Logger with JDK 1.5 features then it would be 
impossible to use in applications that are still using something else (not only 
JDK 1.4 but probably also Android, I'm not sure).

Leaving the original interface untouched and redefining a new interface in a 
different package is the only way that is really safe, allowing both 
implementations to exist side by side.

Apache enhanced their commons-lang module the same way. See 
http://java.dzone.com/articles/first-look-at-commons-lang-3
This is the only chance of guaranteeing compatibility in both directions. And 
this is crucial if you take into consideration the amount of third-party 
libraries that depend on the current SLF4J. They would simply eat us alive if 
we fail on this.
(both scenarios are already tested in my redesign-branch)

>  If I am right, I think this is a huge advantage, both in improving the 
> feature set for older Logger users, and in sticking with only one interface 
> rather than making users move to a different one (and for new users, avoid 
> confusion about two Loggers, etc.)

Yes, a new package is an annoyance.
People are accustomed to this, though. JUnit did it. commons-lang did it, lots 
of others did it, too.
In contrast to some of the mentioned frameworks, our change could *really* be 
done by a global search & destroy...err....replace over the whole source tree. 
The old API calls will translate perfectly to the new ones. This is not binary 
compatibility *but* semi-source-compatibility (i.e. the package must be 
replaced).

> I'll provide more detail on this soon.
> 
>> 
>> Concerning your Entry interface:
>> The main idea of the very minimal Message interface is the absolute freedom 
>> it gives to the developer. A message does not have to have a pattern and 
>> parameters. Why should it? And why should I create an abstract base-class 
>> that returns null for all methods that are not applicable to that specific 
>> message/entry type? Isn't that enough evidence that the interface would be 
>> too specific?
>> I did consider to add Throwable to the Message interface but ultimately 
>> decided against it for exactly that reason. It just doesn't make sense for 
>> every type of message imaginable.
> 
> My idea is to encapsulate all data and formatting for a log entry, and 
> simplify all interfaces.  The Entry interface basically defines the 
> fundamental data components we are providing to Logger implementations.  
> There aren't that many methods, and having them all here provides the ability 
> for Entry classes to handle all data aspects of an entry, and clarity that 
> this can be done.  I'll add more thoughts on this later.

There simply isn't necessarily anything in common for such entries. An 
AccessLogger is logging an entirely different type of event than a classic 
LoggingEvent. The same is the case for AuditLogging.

> 
>> I'm also not a big fan of including either MDC or NDC in the Message since 
>> those belong to the LoggingEvent, similarly to the call-stack or the 
>> timestamp of the event. Neither of them are visible at the SLF4J interface 
>> level.
>> 
> 
> I haven't spent much time looking at MDC/NDC, but if (somewhere) there is 
> support for named parameters, it might be nice to do:
> 
> ("user {mdcUserId} had a problem with {}", problemString)
> 
> A problem I have had with NDC before is that I have disabled logging NDC info 
> to cut down on noise.  This would provide a mechanism to force a particular 
> value into a message when it is relevant.  Just a thought.

Yes. It would be kind of nice, I agree.

But it would also result in a significantly slower message formatting simply 
because the formatter would have to be a lot more complex. The default 
formatter is deliberately "dumb". It does not support reordering of arguments 
as JUL does, for example. Neither are localized messages supported. This 
results in a way better performance - and we are not talking about 1 ns.
(Perhaps Ceki can share some  numbers on this, if he is still with us ;))

We don't want to degrade the performance of the whole system for everyone 
simply because there is a use-case for messages like this.
If someone wants this functionality then he can implement a Message supporting 
this style of formatting.

We could (and probably should) implement a Message that provides this 
functionality in SLF4J at some point. But we should do so with great care.
I'd imagine an implementation with a fluent interface, i.e. something like
new MappedMessage("User {user} just entered {method}.").add("user", 
"Foouser").add("method","Barmethod")

Keep in mind that those messages won't be localizable, either.

Adding localized messages is not trivial at all if you want to do it properly.
Simply using the system locale would not be sufficient, for example, since 
different locales for different users of a web-application isn't a very 
far-fetched scenario.

We actually really considered building a special (Logback) appender that puts 
logging events (e.g. the last ten) into the session for consumption in the 
view. Right now those would all have the same locale but a special Message 
implementation that would also contain the requested output locale could 
provide such a functionality. This implementation would use the 
JUL-Messagestyle that is referring to arguments using {0} {1:format} and so on 
since the functionality is already available in the JDK and translations must 
be able to reorder and replace arguments..

>> To finish this wall of text:
>> I can not stress enough that the Message interface would enable us to 
>> *filter* on the specific type of Message in the logging configuration and/or 
>> appender implementation. Granted, a better place for this argument would be 
>> the Logback mailing list but it IS a major feature and does indeed open a 
>> huge door of possibilities for future enhancements, all without touching the 
>> logging interfaces again.
>> 
>> Joern
>> _______________________________________________
>> slf4j-dev mailing list
>> [email protected]
>> http://qos.ch/mailman/listinfo/slf4j-dev
> 
> Additional thoughts on StructuredData:  mapping to Syslog severities raises 
> an interesting question.  Perhaps Event should also encapsulate Level and 
> Marker.  I know you are suggesting a more simple Message object, and this 
> pulls in the opposite direction, but it would really provide a clean design 
> whereby an application could specify _only_ the syslog severity, and the 
> StructuredData class could handle mapping to trace/debug/etc. levels.  The 
> interface would simply be:
> 
> public log(Entry entry);
> 
> Entry would provide full power and encapsulation over individual log events.  
> If I had time I would try to rework some of my above comments with this in 
> mind, but we are just sharing ideas, right?

Yes, I'm indeed pulling into the opposite direction. ;)
The LoggingEvent is indeed as much implementation detail as it can get in a 
logging framework.
And they indeed differ significantly in the different implementations (JUL vs. 
Log4J vs. Logback vs. Lilith that is supporting a union of all information 
contained in any of them).

The purpose of SLF4J is to shield the user from all of this by means of a 
common interface that is as simple as possible but not simpler.
This is also one of the reasons why I defined a BaseLogger interface that is 
even simpler than the Logger interface we are accustomed to. One could work 
with just that simple interface, but at the cost of convenience.

You could indeed implement a separate SyslogLogger interface that would 
specifically support exactly this. But the implementation of that interface 
should wrap a SLF4J-Logger, not implement its interface. Otherwise you'd still 
be able to create Syslog-Messages that are logged with the wrong level 
(compared to the severity) by mistake/user-error. And if we can prevent 
user-errors, we should.



I really hope that all this does not come across arrogantly or that you feel 
bashed by me.

I appreciate this discussion very much!

It's just that I have given that whole issue a rather big amount of thought in 
the last years. I stumbled into several dead-ends in the process (extending the 
original Logger interface was one of those) but I'm also pretty confident that 
my latest proposal was pretty win-win concerning the requirements & features. 
The fact that I still like it more than a year later is also a good sign. :D

Cheers,
Joern.
_______________________________________________
slf4j-dev mailing list
[email protected]
http://qos.ch/mailman/listinfo/slf4j-dev

Re: [slf4j-dev] Benefits of Message objects?

Reply via email to