[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853014#comment-13853014
 ] 

Jason Lowe commented on MAPREDUCE-5672:
---------------------------------------

Thanks, Gera! This looks like a nice addition and a practical workaround for 
the lack of log limiting/monitoring at the YARN level.

The patch looks quite good overall, but I'd really like to see the rolling log 
and log syncing changes as separate JIRAs/patches.  They're unrelated except 
that both involve logging, and it would be cleaner to review and integrate them 
separately.

Other comments:

- YarnChild has a now unused import of LogManager
- TaskLog has a duplicate StringUtils import
- Should it be yarn.app.mapreduce.container.log.backups or something like 
yarn.app.mapreduce.task.log.backups or 
yarn.app.mapreduce.task.container.log.backups?  The AM often runs in a 
container and therefore the former may be misinterpreted to also cover the AM's 
container.


> Provide optional RollingFileAppender for container log4j (syslog)
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-5672
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5672
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>    Affects Versions: 2.2.0
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: MAPREDUCE-5672.v01.patch, MAPREDUCE-5672.v02.patch, 
> MAPREDUCE-5672.v03.patch, MAPREDUCE-5672.v04.patch, Screen Shot 2013-12-05 at 
> 3.21.02 PM.png, Screen Shot 2013-12-05 at 3.23.33 PM.png
>
>
> This JIRA is an alternative take on YARN-1130.
> We propose providing an option of using a RollingFileAppender(RFA)-based 
> implementation of container log appender as a means of log size control via 
> mapreduce.task.userlog.limit.kb. 
> The idea is to use mapreduce.task.userlog.limit.kb as maximumFileSize of RFA. 
> In addition yarn.app.mapreduce.container.log.backups (task attempt 
> containers) and yarn.app.mapreduce.am.log.backups (MR-AM) are passed as 
> maxBackupIndex.
> Both the current ContainerLogAppender (CLA) and the new 
> ContainerRollingLogAppender (CRLA) co-exist. CLA is the default. CRLA is 
> chosen when mapreduce.task.userlog.limit.kb > 0 && *.backups > 0.
> Pros: 
> 1) CRLA output is visible in UI right away. CLA output with 
> mapreduce.task.userlog.limit.kb > 0 is not visible until the task attempt 
> finishes, which prevents timely diagnostics. 
> 2) Even with excessive logging and a large mapreduce.task.userlog.limit.kb, 
> no space is taken from the JVM heap.
> 3) No UI impact, since YARN is already designed to deal with any log name 
> beyond stderr/out, syslog, debug.out, profile.out
> Cons:
> 1) If the logging is excessive there will be more local filesystem metadata 
> I/O due to rolling. That should be negligible in the grand scheme.
> Furthermore, to improve log consistency and completeness in the case of JVM 
> crashes and SIGTERMing by NM, we propose to restore the MRv1 behavior of 
> periodic log syncing (every 5s) and having log sync as part of a shutdown 
> hook.
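
The selection rule in the description above (CLA by default, CRLA only when both the size limit and the backup count are positive) can be sketched as follows. This is a minimal illustration, not code from the patch: the class, enum, and method names are hypothetical, and the KB-to-bytes conversion and the disk-usage estimate are assumptions about how the values would be handed to RFA.

```java
public class AppenderChoice {
    /** Hypothetical labels for the two appenders named in the description. */
    public enum Kind { CLA, CRLA }

    /**
     * CRLA is chosen only when both values are positive; otherwise the
     * default CLA is kept.
     *
     * @param limitKb value of mapreduce.task.userlog.limit.kb
     * @param backups value of the relevant *.backups property
     */
    public static Kind choose(long limitKb, int backups) {
        return (limitKb > 0 && backups > 0) ? Kind.CRLA : Kind.CLA;
    }

    /** Assumption: the KB limit is converted to bytes for RFA's maximumFileSize. */
    public static long maxFileSizeBytes(long limitKb) {
        return limitKb * 1024L;
    }

    /**
     * Rough upper bound on disk usage per log: the active file plus
     * maxBackupIndex rolled files. Approximate, since a file can slightly
     * exceed the limit before it rolls.
     */
    public static long approxMaxDiskBytes(long limitKb, int backups) {
        return maxFileSizeBytes(limitKb) * (backups + 1L);
    }

    public static void main(String[] args) {
        System.out.println(choose(0, 3));      // limit disabled
        System.out.println(choose(1024, 0));   // no backups configured
        System.out.println(choose(1024, 3));   // both positive
        System.out.println(approxMaxDiskBytes(1024, 3));
    }
}
```

With a 1024 KB limit and 3 backups, this estimate gives roughly 4 MB of syslog per container.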
>  
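
The log syncing part of the proposal (a periodic sync plus a sync from a shutdown hook) could look roughly like the sketch below. It uses only the JDK and a generic Flushable as a stand-in for the appender; the class and method names are hypothetical and this is not the MRv1 or patch implementation. Note that a shutdown hook runs on SIGTERM but not on SIGKILL or a hard JVM crash, which is why the periodic sync is still needed to bound how much output can be lost.

```java
import java.io.Flushable;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicLogSync {
    private final Flushable target;
    private final ScheduledExecutorService scheduler;

    private PeriodicLogSync(Flushable target, ScheduledExecutorService scheduler) {
        this.target = target;
        this.scheduler = scheduler;
    }

    /** Starts periodic flushing and registers a flush on JVM shutdown. */
    public static PeriodicLogSync start(Flushable target, long periodMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "log-sync");
            t.setDaemon(true); // must not keep the task JVM alive
            return t;
        });
        PeriodicLogSync sync = new PeriodicLogSync(target, ses);
        ses.scheduleWithFixedDelay(sync::syncQuietly, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
        Runtime.getRuntime().addShutdownHook(
                new Thread(sync::syncQuietly, "log-sync-shutdown"));
        return sync;
    }

    /** Flushes the target; errors are swallowed because logging is already failing. */
    public void syncQuietly() {
        try {
            target.flush();
        } catch (IOException e) {
            // Nowhere safe to report this; ignore.
        }
    }

    /** Stops the periodic task (the shutdown hook stays registered). */
    public void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        // 5000 ms mirrors the "every 5s" interval mentioned in the description.
        PeriodicLogSync sync = start(System.out, 5000);
        System.out.println("working...");
        Thread.sleep(100);
        sync.stop();
    }
}
```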



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
