[
http://issues.apache.org/jira/browse/HADOOP-211?page=comments#action_12413170 ]
Sanjay Dahiya commented on HADOOP-211:
--------------------------------------
On the default Log4J configuration format, some points:
1. Logging caller information such as the originating method, line number and
file name is known to be expensive. It may also not be supported by all JVMs.
Do we want these in the default config?
2. For the namenode I added the option %X{client}. It enables logging the
client identification along with the message, but the code needs to supply
that information using MDC.put("client", clientName).
3. We could do a separate logger per client, but that's a bad idea as there
will be too many loggers.
4. We can have a logger hierarchy based on packages/classes and/or some logical
hierarchy like namenode.block.allocation, namenode.block.removal etc. Any
comments on what type of log hierarchy you would like to see for the modules
you worked on?
5. We can use MDC for categorizing logs on some criterion other than clients,
like blocks or file system. Anything you would like to see here?
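To make point 2 concrete: the sketch below is a JDK-only stand-in, not Log4J
code. It mimics what MDC.put("client", clientName) together with %X{client}
does, assuming a per-thread map that the layout reads at format time.
ClientContext, its methods and the client name are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// JDK-only stand-in for a mapped diagnostic context; ClientContext is a hypothetical name.
final class ClientContext {
    // One map per thread, so concurrent request handlers do not see each other's values.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static void remove(String key) {
        CONTEXT.get().remove(key);
    }

    // Mimics a "%X{client} message" layout; in this sketch a missing key renders as "".
    static String format(String message) {
        String client = CONTEXT.get().getOrDefault("client", "");
        return client + " " + message;
    }

    public static void main(String[] args) {
        put("client", "client-42");   // hypothetical client name
        try {
            System.out.println(format("allocated block"));
        } finally {
            remove("client");         // clear it so a pooled thread does not leak the value
        }
    }
}
```

With Log4J itself the calling code would presumably take the same try/finally
shape, using org.apache.log4j.MDC.put and MDC.remove around the request.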
A Log4J format to start with: the default is a RollingFile appender (based on
size; it can be made time-based as well). Please let me know your requirements,
if any, and I will keep updating this file. Once this is done, the codebase can
be migrated to Log4J.
-------------------------------------------------
# remove DEBUG if not needed.
log4j.rootLogger=DEBUG, RFA
log4j.threshold=ALL
# Rolling File appender configuration
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${HADOOP_HOME}/hadoop.log
# change file size
log4j.appender.RFA.MaxFileSize=1MB
log4j.appender.RFA.MaxBackupIndex=10
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
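# logically separated logger configurations
# a logger value is "level, appender"; NNRFA and namenode.log are placeholder names
log4j.logger.namenode=DEBUG, NNRFA
log4j.additivity.namenode=false
log4j.appender.NNRFA=org.apache.log4j.RollingFileAppender
log4j.appender.NNRFA.File=${HADOOP_HOME}/namenode.log
log4j.appender.NNRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.NNRFA.layout.ConversionPattern=%d{ISO8601} %-5p %X{client} %c{2} (%F:%M(%L)) - %m%n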
> logging improvements for Hadoop
> -------------------------------
>
> Key: HADOOP-211
> URL: http://issues.apache.org/jira/browse/HADOOP-211
> Project: Hadoop
> Type: Improvement
> Versions: 0.2
> Reporter: Sameer Paranjpye
> Assignee: Sameer Paranjpye
> Priority: Minor
> Fix For: 0.3
> Attachments: commons_logging_patch
>
> Here's a proposal for some improvements to the way Hadoop does logging. It
> advocates 3 broad changes to the way logging is currently done, these being:
> - The use of a uniform logging format by all Hadoop subsystems
> - The use of Apache commons logging as a facade above an underlying logging
> framework
> - The use of Log4J as the underlying logging framework instead of
> java.util.logging
> This is largely polishing work, but it seems like it would make log analysis
> and debugging easier in the short term. In the long term, it would future
> proof logging to the extent of allowing the logging framework used to change
> while requiring minimal code change. The proposed changes are motivated by
> the following requirements which we think Hadoop's logging should meet:
> - Hadoop's logs should be amenable to analysis by tools like grep, sed, awk
> etc.
> - Log entries should be clearly annotated with a timestamp and a logging level
> - Log entries should be traceable to the subsystem from which they originated
> - The logging implementation should allow log entries to be annotated with
> source code location information like classname, methodname, file and line
> number, without requiring code changes
> - It should be possible to change the logging implementation used without
> having to change thousands of lines of code
> - The mapping of loggers to destinations (files, directories, servers etc.)
> should be specified and modifiable via configuration
> Uniform logging format:
> All Hadoop logs should have the following structure.
> <Header>\n
> <LogEntry>\n [<Exception>\n]
> .
> .
> .
> where the header line specifies the format of each log entry. The header line
> has the format:
> '# <Fieldname> <Fieldname>...\n'.
> The default format of each log entry is: '# Timestamp Level LoggerName
> Message', where:
> - Timestamp is a date and time in the format MM/DD/YYYY:HH:MM:SS
> - Level is the logging level (FATAL, WARN, DEBUG, TRACE, etc.)
> - LoggerName is the short name of the logging subsystem from which the
> message originated e.g.
> fs.FSNamesystem, dfs.Datanode etc.
> - Message is the log message produced
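A rough sketch of the entry layout described above, using only the JDK. It
assumes single-space field separators and that the level and logger name
contain no spaces, which the proposal does not spell out. UniformLogLine and
its methods are hypothetical names.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Sketch of the proposed entry layout; UniformLogLine is a hypothetical name.
final class UniformLogLine {
    static final String HEADER = "# Timestamp Level LoggerName Message";
    // MM/DD/YYYY:HH:MM:SS from the proposal, written in SimpleDateFormat notation.
    private static final String TIMESTAMP_PATTERN = "MM/dd/yyyy:HH:mm:ss";

    static String format(Date when, String level, String logger, String message) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat fmt = new SimpleDateFormat(TIMESTAMP_PATTERN, Locale.ROOT);
        return fmt.format(when) + " " + level + " " + logger + " " + message;
    }

    // Splits an entry back into its four fields; only the message may contain spaces.
    static String[] parse(String line) {
        return line.split(" ", 4);
    }

    public static void main(String[] args) {
        System.out.println(HEADER);
        String entry = format(new Date(), "WARN", "dfs.Datanode", "disk is nearly full");
        System.out.println(entry);
        System.out.println(parse(entry)[2]);   // the LoggerName field
    }
}
```

Because the timestamp carries no embedded space, the same split works for
awk-style tools that cut on whitespace.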
> Why Apache commons logging and Log4J?
> Apache commons logging is a facade meant to be used as a wrapper around an
> underlying logging implementation. Bridges from Apache commons logging to
> popular logging implementations (Java logging, Log4J, Avalon etc.) are
> implemented and available as part of the commons logging distribution.
> Implementing a bridge to an unsupported implementation is fairly
> straightforward and involves implementing the commons logging Log interface
> and, where needed, a LogFactory subclass. Using Apache commons logging and
> making all logging calls through it enables us to move to a different logging
> implementation by simply changing configuration in the best case.
> Even otherwise, it incurs minimal code churn overhead.
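The facade idea can be sketched in miniature with only the JDK. This is not
the commons logging API: MiniLog, MiniLogFactory, both backends and the
"minilog.backend" property are hypothetical names chosen for illustration. It
only shows why calling code stays untouched when the backend is switched
through configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical factory: picks the backend from a system property, so callers never change.
final class MiniLogFactory {
    static MiniLog getLog(String name) {
        String backend = System.getProperty("minilog.backend", "jul");
        return "memory".equals(backend) ? new MemoryMiniLog(name) : new JulMiniLog(name);
    }

    public static void main(String[] args) {
        MiniLog log = getLog("dfs.Datanode");   // calling code is identical for every backend
        log.info("datanode started");
    }
}

// Hypothetical facade interface; application code depends only on this.
interface MiniLog {
    void info(String message);
}

// Backend 1: delegates to java.util.logging.
final class JulMiniLog implements MiniLog {
    private final Logger delegate;

    JulMiniLog(String name) {
        delegate = Logger.getLogger(name);
    }

    public void info(String message) {
        delegate.log(Level.INFO, message);
    }
}

// Backend 2: keeps entries in memory, standing in for any other framework.
final class MemoryMiniLog implements MiniLog {
    final List<String> lines = new ArrayList<>();
    private final String name;

    MemoryMiniLog(String name) {
        this.name = name;
    }

    public void info(String message) {
        lines.add("INFO " + name + " " + message);
    }
}
```

Running with -Dminilog.backend=memory swaps the implementation without any
change to the code that logs.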
> Log4J offers a few benefits over java.util.logging that make it a more
> desirable choice for the logging back end.
> - Configuration Flexibility: The mapping of loggers to destinations (files,
> sockets etc.) can be completely specified in configuration. It is possible to
> do this with Java logging as well, however, configuration is a lot more
> restrictive. For instance, with Java logging all log files must have names
> derived from the same pattern. For the namenode, log files could be named
> with the pattern "%h/namenode%u.log" which would put log files in the
> user.home directory with names like namenode0.log etc. With Log4J it would be
> possible to configure the namenode to emit log files with different names,
> say heartbeats.log, namespace.log, clients.log etc. Configuration variables
> in Log4J can also have the values of system properties embedded in them.
> - Takes wrappers into account: Log4J takes into account the possibility that
> an application may be invoking it via a wrapper, such as Apache commons
> logging. This is important because logging event objects must be able to
> infer the context of the logging call such as classname, methodname etc.
> Inferring context is a relatively expensive operation that involves creating
> an exception and examining the stack trace to find the frame just before the
> first frame of the logging framework. It is therefore done lazily only when
> this information actually needs to be logged. Log4J can be instructed to look
> for the frame corresponding to the wrapper class, Java logging cannot. In the
> case of Java logging this means that a) the bridge from Apache commons
> logging is responsible for inferring the calling context and setting it in
> the logging event and b) this inference has to be done on every logging call
> regardless of whether or not it is needed.
> - More handy features: Log4J has some handy features that Java logging
> doesn't. A couple
> of examples of these:
> a) Date based rolling of log files
> b) Format control through configuration. Log4J has a PatternLayout class that
> can be
> configured to generate logs with a user specified pattern. The logging format
> described
> above can be described as "%d{MM/dd/yyyy:HH:mm:ss} %c{2} %p %m". The format
> specifiers
> indicate that each log line should have the date and time followed by the
> logger name followed
> by the logging level or priority followed by the application generated
> message.
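The caller inference described under "Takes wrappers into account" can be
sketched with plain JDK calls. This is a simplified illustration, not Log4J's
actual implementation: it creates a Throwable, walks the stack trace, and
returns the first frame past the wrapper class. CallerLocator, WrapperLog and
NameNodeDemo are hypothetical names.

```java
// Hypothetical application class that logs through a wrapper.
final class NameNodeDemo {
    static String allocate() {
        return WrapperLog.info("block allocated");
    }

    public static void main(String[] args) {
        System.out.println(allocate());
    }
}

// Stands in for a facade such as a commons logging bridge.
final class WrapperLog {
    static String info(String message) {
        // The wrapper passes its own class name so the locator can skip its frames.
        StackTraceElement caller = CallerLocator.findCaller(WrapperLog.class.getName());
        return caller.getClassName() + "." + caller.getMethodName() + " - " + message;
    }
}

// Sketch of caller inference from a stack trace.
final class CallerLocator {
    // Returns the first frame found after the frames belonging to wrapperClassName.
    static StackTraceElement findCaller(String wrapperClassName) {
        // Capturing the stack trace is the expensive step the text refers to.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        boolean seenWrapper = false;
        for (StackTraceElement frame : frames) {
            boolean isWrapper = frame.getClassName().equals(wrapperClassName);
            if (seenWrapper && !isWrapper) {
                return frame;   // first frame past the wrapper: the application call site
            }
            seenWrapper = seenWrapper || isWrapper;
        }
        return null;            // the wrapper class was not on the stack
    }
}
```

A real framework would defer the findCaller call until a layout actually asks
for location fields, which is the lazy behaviour the description mentions.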