Have some log messages designed for machine parsing, either real-time or 
post-mortem
------------------------------------------------------------------------------------

                 Key: HADOOP-6107
                 URL: https://issues.apache.org/jira/browse/HADOOP-6107
             Project: Hadoop Common
          Issue Type: Improvement
    Affects Versions: 0.21.0
            Reporter: Steve Loughran


Many programs take the log output of bits of Hadoop and try to parse it. Some 
may also put their own back end behind commons-logging, to capture the input 
without going via Log4J, so as to keep the output more machine-readable.

These programs need log messages that
# are easy to parse with a regexp or other simple string parsing (consider quoting 
values, etc.)
# push out the full exception chain rather than stringify() bits of it
# stay stable across versions
# log the things the tools need to analyse: events, data volumes, errors
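
To make the first two requirements concrete, here is a minimal sketch of one possible line format: quoted key="value" pairs that a single regexp can read back, plus a helper that renders the whole exception cause chain. The class name, method names and field names are all hypothetical, chosen for illustration; nothing here is an existing Hadoop API. It assumes keys are plain word characters and values contain no newlines.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class MachineLogFormat {

    // Matches key="value", where the value may contain backslash-escaped characters.
    private static final Pattern PAIR =
        Pattern.compile("(\\w+)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Formats an event as: event="name" key="value" ... in map iteration order. */
    static String format(String event, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("event=\"").append(escape(event)).append('"');
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.toString();
    }

    /** Parses a line produced by format() back into an ordered map. */
    static Map<String, String> parse(String line) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            out.put(m.group(1), unescape(m.group(2)));
        }
        return out;
    }

    /** Renders every exception in the cause chain, outermost first. */
    static String causeChain(Throwable t) {
        StringBuilder sb = new StringBuilder();
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (sb.length() > 0) {
                sb.append(" <- ");
            }
            sb.append(c.getClass().getName()).append(": ").append(c.getMessage());
        }
        return sb.toString();
    }

    // Escape backslashes first, then quotes, so the two steps do not interfere.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Drop the backslash in front of any escaped character.
    private static String unescape(String s) {
        return s.replaceAll("\\\\(.)", "$1");
    }
}
```

Quoting every value means a parser never has to guess where a value containing spaces ends, and keeping the key names fixed is what would let such a line stay stable across versions even if the human-readable text around it changes.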

For these logging tools, ease of parsing, retention of data and stability over 
time take the edge over readability. In HADOOP-5073, Jiaqi Tan proposed marking 
some of the existing log events as evolving towards stability. As someone who 
regularly patches log messages to improve diagnostics, I find this creates a 
conflict of interest. For me, good logs are ones that help people debug their problems 
without anyone else helping, and if that means improving the text, so be it. 
Tools like Chukwa have a different need. 

What to do? Some options:
 # Have some messages that are designed purely for other programs to handle
 # Have some logs specifically for machines, to which we log alongside the 
human-centric messages
 # Fix many of the common messages, then leave them alone.
 # Mark log messages to be left alone (somehow)
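
As a rough illustration of the second option, the sketch below logs each event twice: once as free-form text for people, and once to a separate, stable logger for tools. It uses java.util.logging only so that it runs without extra jars; the logger names, the event name and the in-memory handler are assumptions made up for the example, not anything Hadoop defines, and a real back end would sit behind commons-logging instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch: human-centric and machine-centric logs side by side. */
final class DualLogging {

    // Logger names are illustrative assumptions, not real Hadoop logger names.
    static final Logger HUMAN = Logger.getLogger("example.datanode");
    static final Logger MACHINE = Logger.getLogger("machine.example.datanode");

    /** In-memory sink standing in for a file or a collector; not thread-safe. */
    static final class CapturingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            lines.add(record.getMessage());
        }

        @Override
        public void flush() {
        }

        @Override
        public void close() {
        }
    }

    /** Routes the machine logger to its own sink and keeps it off the console. */
    static CapturingHandler attachMachineSink() {
        CapturingHandler handler = new CapturingHandler();
        MACHINE.setUseParentHandlers(false);
        MACHINE.setLevel(Level.INFO);
        MACHINE.addHandler(handler);
        return handler;
    }

    /** Logs one event twice: free text for people, fixed fields for tools. */
    static void blockWritten(String blockId, long bytes) {
        // This text can be reworded freely to improve diagnostics.
        HUMAN.info("Finished writing block " + blockId + " (" + bytes + " bytes)");
        // This line is the part that would need to stay stable across versions.
        MACHINE.info("event=\"block_written\" block=\"" + blockId
            + "\" bytes=\"" + bytes + "\"");
    }
}
```

The appeal of this split is that the human message can keep improving without breaking any parser, at the cost of every instrumented call site having to log twice.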
