[ 
https://issues.apache.org/jira/browse/HADOOP-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582814#action_12582814
 ] 

Raghu Angadi commented on HADOOP-3110:
--------------------------------------

> When there are no DFS clients, it's lightning fast, but add a few Map/Red 
> jobs and the thing really, really slows down. 

Are you implying this was traced to log messages on the NameNode? Of course, 
there are a lot of improvements to the NameNode and other parts all the time, 
but I thought _this_ jira was about log messages. There is so much low-hanging 
fruit in Hadoop/HDFS w.r.t. performance :)

Could you try an experiment where log4j is configured not to write anywhere 
and see if there is any noticeable improvement?
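
A rough way to get a feel for this kind of experiment outside a real cluster 
is sketched below. It is only an illustration under stated assumptions: it 
uses java.util.logging as a stand-in for log4j so that it runs without extra 
jars, and the class name, logger name, thread count, and call count are all 
hypothetical. It calls a synchronized method that logs once per call from 
several threads, first with a handler that formats and writes each record and 
then with a handler that drops everything. It says nothing about actual 
NameNode numbers.

```java
import java.io.ByteArrayOutputStream;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

final class LoggingCostSketch {
    // Hypothetical logger name; held statically so it is not garbage collected.
    private static final Logger LOG = Logger.getLogger("sketch.namenode");

    private long blocksStored = 0;

    /** Handler that writes nowhere, mimicking a logger configured to discard output. */
    static final class DiscardingHandler extends Handler {
        @Override public void publish(LogRecord record) { /* drop the record */ }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Stand-in for a synchronized method that logs once per stored block. */
    synchronized void addStoredBlock(long blockId) {
        blocksStored++;
        LOG.info("addStoredBlock: block " + blockId + " stored");
    }

    synchronized long getBlocksStored() {
        return blocksStored;
    }

    /** Runs the workload with the given handler and returns elapsed nanoseconds. */
    static long run(Handler handler, int threads, int callsPerThread) {
        // Route records only to the handler under test.
        LOG.setUseParentHandlers(false);
        for (Handler h : LOG.getHandlers()) {
            LOG.removeHandler(h);
        }
        LOG.addHandler(handler);

        LoggingCostSketch ns = new LoggingCostSketch();
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    ns.addStoredBlock(i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long elapsed = System.nanoTime() - start;
        if (ns.getBlocksStored() != (long) threads * callsPerThread) {
            throw new IllegalStateException("lost updates");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Handler that formats every record and writes it to an in-memory buffer.
        Handler writing = new StreamHandler(new ByteArrayOutputStream(), new SimpleFormatter());
        long withOutput = run(writing, 4, 20_000);
        long discarded = run(new DiscardingHandler(), 4, 20_000);
        System.out.printf("format + write: %d ms%n", withOutput / 1_000_000);
        System.out.printf("discarded:      %d ms%n", discarded / 1_000_000);
    }
}
```

Timings from a toy like this vary from run to run and with JIT warm-up, so 
the real test is still the one above: run the NameNode under load with and 
without log output and compare.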

Log messages are included because they are often the only way to debug a 
problem. We obviously need to strike a balance between complexity, 
maintainability, and benefits. So the question here is: how much does this 
save?


> NameNode does logging in critical sections just about everywhere
> ----------------------------------------------------------------
>
>                 Key: HADOOP-3110
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3110
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.14.3, 0.14.4, 0.15.0, 0.15.1, 
> 0.15.2, 0.15.3, 0.16.0, 0.16.1
>         Environment: All
>            Reporter: Pete Wyckoff
>
> e.g., FSNameSystem.addStoredBlock (but almost every method has logging in its 
> critical sections)
> This method is synchronized and it's spitting something out to Log.info every 
> block stored. Normally not a big deal, but since this is in the name node and 
> these are critical sections...
> We shouldn't even do any logging at all in critical sections, so even the 
> info and warn are bad.  But, in many places in the code, it would be hard to 
> tease these out (although eventually they really should be), but the system 
> could start using something like an AsyncAppender and see how it improves 
> performance. 
> Even though the log may have a buffer, the writing and doing the formatting 
> and stuff cause a drag on performance with 100s/1000s of machines trying to 
> talk to the name node.
> At a minimum, the most frequently triggered Log.info calls could be changed 
> to Log.debug.
> for reference: 
> http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/AsyncAppender.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
