Re: [DISCUSS] Switch to log4j 2
On Fri, Aug 15, 2014 at 8:50 AM, Aaron T. Myers wrote:
> Not necessarily opposed to switching logging frameworks, but I believe we
> can actually support async logging with today's logging system if we wanted
> to, e.g. as was done for the HDFS audit logger in this JIRA:
>
> https://issues.apache.org/jira/browse/HDFS-5241

Yes, this is a great example of making something async without switching
logging frameworks. +1 for doing that where it is appropriate.

> On Fri, Aug 15, 2014 at 5:44 AM, Steve Loughran wrote:
>
>> moving to SLF4J as an API is independent - it's just a better API for
>> logging than commons-logging, was already a dependency and doesn't force
>> anyone to switch to a new log back end.

Interesting idea. Did anyone do a performance comparison and/or API
comparison with SLF4j on Hadoop?

>> On 15 August 2014 03:34, Tsuyoshi OZAWA wrote:
>>
>> > Steve has started discussion titled "use SLF4J APIs in new modules?"
>> > as a related topic.
>> >
>> > http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201404.mbox/%3cca+4kjvv_9cmmtdqzcgzy-chslyb1wkgdunxs7wrheslwbuh...@mail.gmail.com%3E
>> >
>> > It sounds good to me to use asynchronous logging when we log INFO. One
>> > concern is that asynchronous logging makes debugging difficult - I
>> > don't know log4j 2 well, but I suspect that ordering of logging can be
>> > changed even if WARN or FATAL are logged with synchronous logger.

-1. Async logging for everything will make a lot of failures
un-debuggable. Just to give one example, what if you get a JVM out of
memory crash? You'll lose the last few log messages which could have told
you what was going on. Even if the JVM doesn't terminate, log messages will
be out of order, which is annoying, and will make debugging harder.

The kernel already buffers the log files in memory. Not every log message
generates a disk seek. But on the other hand, if the JVM process crashes,
you've got everything. In other words, we've already got as much buffering
and asynchronicity as we need!

If the problem is that the noisy logs are overloading the disk bandwidth,
that problem can't be solved by adding Java-level async. You need more
bandwidth. A simple way of doing this is putting the log partition on
/dev/shm. We could also look into stripping some of the boilerplate from
log messages; there are a lot of super-long log messages that could be much
more concise.

Other Java logging frameworks might have less overhead (I'm not an expert
on this, but maybe someone could post some numbers?)

best,
Colin
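To make the crash concern above concrete, here is a minimal sketch of why an application-level queue can lose the last messages. It is a toy, not log4j 2's implementation: QueueingLogger and AsyncLossDemo are hypothetical names, the in-memory list stands in for the log file, and the writer step is called by hand (rather than from a background thread) so that the behaviour is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical toy logger: callers enqueue, a separate step writes to the sink. */
class QueueingLogger {
    private final BlockingQueue<String> pending;
    private final List<String> sink = new ArrayList<>(); // stands in for the log file

    QueueingLogger(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns as soon as the message is queued; nothing has been written yet. */
    boolean log(String message) {
        return pending.offer(message); // false means the queue was full and the message was dropped
    }

    /** What a background writer thread would do; invoked manually in this sketch. */
    int flush() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        sink.addAll(batch);
        return batch.size();
    }

    int pendingCount() {
        return pending.size();
    }

    List<String> written() {
        return sink;
    }
}

class AsyncLossDemo {
    public static void main(String[] args) {
        QueueingLogger log = new QueueingLogger(16);
        log.log("starting block report");
        log.flush();
        log.log("allocating large buffer");
        log.log("allocation failed");
        // If the JVM died at this point, the two queued messages would never reach the sink.
        System.out.println("written=" + log.written().size()
                + " pending=" + log.pendingCount());
    }
}
```

With a synchronous appender the equivalent of flush() happens inside log(), so whatever the caller logged before the crash has at least been handed to the kernel.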
Re: [DISCUSS] Switch to log4j 2
The block state change logs are indeed too noisy at INFO and I've not found
them useful when troubleshooting. Just filed HDFS-6860 to fix that.

This is orthogonal to the SLF4J migration; however, moving to SLF4J would
help ease the transition to Log4j 2.

Thanks for the pointer to HDFS-5241, Aaron. Looking into it.

On Sat, Aug 16, 2014 at 5:07 AM, Steve Loughran wrote:
> On 15 August 2014 17:20, Karthik Kambatla wrote:
>
> > However, IMO we already log too much at INFO level (particularly YARN).
> > Logging more at DEBUG level and lowering the overhead of enabling DEBUG
> > logging is preferable.
>
> +1
>
> This is the log4j properties file I've adopted for minicluster debugging,
> HDFS is pretty noisy these days too. BlockStateChange, for example. Then
> there's Zookeeper
>
> log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
>
> log4j.logger.org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner=WARN
> log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement=WARN
> log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=WARN
> log4j.logger.org.apache.hadoop.hdfs=WARN
> log4j.logger.BlockStateChange=WARN
>
> log4j.logger.org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor=WARN
> log4j.logger.org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl=WARN
> log4j.logger.org.apache.zookeeper=WARN
> log4j.logger.org.apache.zookeeper.ClientCnxn=FATAL
>
> log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.security=WARN
> log4j.logger.org.apache.hadoop.metrics2=ERROR
> log4j.logger.org.apache.hadoop.util.HostsFileReader=WARN
> log4j.logger.org.apache.hadoop.yarn.event.AsyncDispatcher=WARN
> log4j.logger.org.apache.hadoop.security.token.delegation=WARN
> log4j.logger.org.apache.hadoop.yarn.util.AbstractLivelinessMonitor=WARN
> log4j.logger.org.apache.hadoop.yarn.server.nodemanager.security=WARN
> log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMNMInfo=WARN
Re: [DISCUSS] Switch to log4j 2
On 15 August 2014 17:20, Karthik Kambatla wrote:
> However, IMO we already log too much at INFO level (particularly YARN).
> Logging more at DEBUG level and lowering the overhead of enabling DEBUG
> logging is preferable.

+1

This is the log4j properties file I've adopted for minicluster debugging.
HDFS is pretty noisy these days too: BlockStateChange, for example. Then
there's ZooKeeper.

log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR

log4j.logger.org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner=WARN
log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement=WARN
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=WARN
log4j.logger.org.apache.hadoop.hdfs=WARN
log4j.logger.BlockStateChange=WARN

log4j.logger.org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor=WARN
log4j.logger.org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl=WARN
log4j.logger.org.apache.zookeeper=WARN
log4j.logger.org.apache.zookeeper.ClientCnxn=FATAL

log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.security=WARN
log4j.logger.org.apache.hadoop.metrics2=ERROR
log4j.logger.org.apache.hadoop.util.HostsFileReader=WARN
log4j.logger.org.apache.hadoop.yarn.event.AsyncDispatcher=WARN
log4j.logger.org.apache.hadoop.security.token.delegation=WARN
log4j.logger.org.apache.hadoop.yarn.util.AbstractLivelinessMonitor=WARN
log4j.logger.org.apache.hadoop.yarn.server.nodemanager.security=WARN
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMNMInfo=WARN
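The per-logger settings above depend on hierarchical logger names: a level set on org.apache.zookeeper applies to every logger beneath it unless a more specific entry, such as the ClientCnxn one, overrides it. A rough sketch of that inheritance follows. It uses java.util.logging instead of log4j purely so it runs on a bare JDK, which is an assumption made for illustration; JUL has no FATAL level, so SEVERE stands in for it, and the class name LoggerHierarchyDemo and the "server" child logger are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LoggerHierarchyDemo {
    // Keep strong references: java.util.logging may garbage-collect loggers nobody holds.
    static final Logger ZK = Logger.getLogger("org.apache.zookeeper");
    static final Logger ZK_CLIENT = Logger.getLogger("org.apache.zookeeper.ClientCnxn");
    static final Logger ZK_SERVER = Logger.getLogger("org.apache.zookeeper.server");

    static void configure() {
        ZK.setLevel(Level.WARNING);       // analogous to log4j.logger.org.apache.zookeeper=WARN
        ZK_CLIENT.setLevel(Level.SEVERE); // closest JUL level to ...ClientCnxn=FATAL
    }

    public static void main(String[] args) {
        configure();
        // ZK_SERVER has no level of its own, so it inherits WARNING from its parent.
        System.out.println("server INFO enabled:    " + ZK_SERVER.isLoggable(Level.INFO));
        System.out.println("server WARNING enabled: " + ZK_SERVER.isLoggable(Level.WARNING));
        // ZK_CLIENT carries its own, stricter override.
        System.out.println("client WARNING enabled: " + ZK_CLIENT.isLoggable(Level.WARNING));
    }
}
```

The same reasoning suggests that, in the file above, the single org.apache.hadoop.hdfs=WARN entry already covers the more specific HDFS packages listed alongside it.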
Re: [DISCUSS] Switch to log4j 2
Using asynchronous loggers for improved performance sounds reasonable.
However, IMO we already log too much at INFO level (particularly YARN).
Logging more at DEBUG level and lowering the overhead of enabling DEBUG
logging is preferable.

One concern is the defaults. Based on what I read on the log4j2 page shared,
we might want to keep our audit logging synchronous and make all other
logging asynchronous. Is there an easy way to configure it this way? If not,
what is the dev cost we are looking at?

On Wed, Aug 13, 2014 at 2:44 PM, Arpit Agarwal wrote:
> I don't recall whether this was discussed before.
>
> I often find our INFO logging to be too sparse for useful diagnosis. A high
> performance logging framework will encourage us to log more. Specifically,
> Asynchronous Loggers look interesting.
> https://logging.apache.org/log4j/2.x/manual/async.html#Performance
>
> What does the community think of switching to log4j 2 in a Hadoop 2.x
> release?
Re: [DISCUSS] Switch to log4j 2
Not necessarily opposed to switching logging frameworks, but I believe we
can actually support async logging with today's logging system if we wanted
to, e.g. as was done for the HDFS audit logger in this JIRA:

https://issues.apache.org/jira/browse/HDFS-5241

--
Aaron T. Myers
Software Engineer, Cloudera

On Fri, Aug 15, 2014 at 5:44 AM, Steve Loughran wrote:
> moving to SLF4J as an API is independent - it's just a better API for
> logging than commons-logging, was already a dependency and doesn't force
> anyone to switch to a new log back end.
Re: [DISCUSS] Switch to log4j 2
moving to SLF4J as an API is independent - it's just a better API for
logging than commons-logging, was already a dependency and doesn't force
anyone to switch to a new log back end.

On 15 August 2014 03:34, Tsuyoshi OZAWA wrote:
> Hi,
>
> Steve has started discussion titled "use SLF4J APIs in new modules?"
> as a related topic.
>
> http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201404.mbox/%3cca+4kjvv_9cmmtdqzcgzy-chslyb1wkgdunxs7wrheslwbuh...@mail.gmail.com%3E
>
> It sounds good to me to use asynchronous logging when we log INFO. One
> concern is that asynchronous logging makes debugging difficult - I
> don't know log4j 2 well, but I suspect that ordering of logging can be
> changed even if WARN or FATAL are logged with synchronous logger.
>
> Thanks,
> - Tsuyoshi
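One commonly cited reason SLF4J is considered a nicer API than commons-logging is its parameterized messages, e.g. logger.debug("processed {} blocks", n), which defer message formatting until the level is known to be enabled; commons-logging callers usually wrap the call in an isDebugEnabled() check instead. The sketch below illustrates the deferral idea only. It is an assumption-laden stand-in: it uses java.util.logging's Supplier overload rather than SLF4J so that it runs without extra dependencies, FINE plays the role of DEBUG, and LazyDebugDemo and describe() are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

class LazyDebugDemo {
    static final Logger LOG = Logger.getLogger("demo.lazy");
    static final AtomicInteger BUILDS = new AtomicInteger();

    /** Stands in for an expensive message, e.g. dumping a large block list. */
    static String describe(int blocks) {
        BUILDS.incrementAndGet();
        return "processed " + blocks + " blocks";
    }

    /** Logs twice at a disabled level and reports how often the message was built. */
    static int countBuilds() {
        BUILDS.set(0);
        LOG.setLevel(Level.INFO); // FINE (roughly DEBUG) is disabled

        // Eager: the string is built even though the message is then discarded.
        LOG.fine(describe(42));

        // Deferred: the supplier is only invoked if FINE is enabled.
        LOG.fine(() -> describe(42));

        return BUILDS.get();
    }

    public static void main(String[] args) {
        System.out.println("message built " + countBuilds() + " time(s)");
    }
}
```

Note that SLF4J's placeholders defer only the formatting step, not the evaluation of the arguments themselves, whereas the supplier form here defers both.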
Re: [DISCUSS] Switch to log4j 2
Hi,

Steve has started discussion titled "use SLF4J APIs in new modules?"
as a related topic.

http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201404.mbox/%3cca+4kjvv_9cmmtdqzcgzy-chslyb1wkgdunxs7wrheslwbuh...@mail.gmail.com%3E

It sounds good to me to use asynchronous logging when we log INFO. One
concern is that asynchronous logging makes debugging difficult - I
don't know log4j 2 well, but I suspect that ordering of logging can be
changed even if WARN or FATAL are logged with synchronous logger.

Thanks,
- Tsuyoshi

On Thu, Aug 14, 2014 at 6:44 AM, Arpit Agarwal wrote:
> I don't recall whether this was discussed before.
>
> I often find our INFO logging to be too sparse for useful diagnosis. A high
> performance logging framework will encourage us to log more. Specifically,
> Asynchronous Loggers look interesting.
> https://logging.apache.org/log4j/2.x/manual/async.html#Performance
>
> What does the community think of switching to log4j 2 in a Hadoop 2.x
> release?
[DISCUSS] Switch to log4j 2
I don't recall whether this was discussed before.

I often find our INFO logging to be too sparse for useful diagnosis. A high
performance logging framework will encourage us to log more. Specifically,
Asynchronous Loggers look interesting.
https://logging.apache.org/log4j/2.x/manual/async.html#Performance

What does the community think of switching to log4j 2 in a Hadoop 2.x
release?

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.