[jira] [Updated] (MAPREDUCE-5672) Provide optional RollingFileAppender for container log4j (syslog)

Jason Lowe (JIRA) Thu, 09 Jan 2014 14:34:57 -0800

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jason Lowe updated MAPREDUCE-5672:
----------------------------------

    Attachment: MAPREDUCE-5672.v07.patch

Thanks for the explanation, Gera.  Originally I was thinking this would be used 
to set relatively large limits that keep the entirety of most reasonable logs 
but prevent runaway logging from filling disks.  However, I can see a lot of 
utility in tuning task logs to something relatively short (e.g., to reduce log 
aggregation storage pressure or retain more non-aggregated logs on nodes) while 
still wanting to see the entirety (or at least a lot more) of the AM log in 
case there were issues early on in the job that only manifested much later.

You've convinced me that these really should be separately tunable.  I took the 
liberty of updating the latest patch to put back in the AM-specific config, 
using the proposed name from your earlier comment.  I also fixed a bug in the 
previous patch where the code expected the property to be 
yarn.app.mapreduce.container.log.backups but mapred-default.xml documented 
yarn.app.mapreduce.task.container.log.backups.  The code now uses the latter 
for tasks and yarn.app.mapreduce.am.container.log.backups for the AM.
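
To illustrate the separate tuning, here is a minimal sketch (not the patch 
code) of how the appender choice could be made per container type.  The 
selection rule is the one from the issue description quoted below (size limit 
> 0 and backups > 0).  The class name, the helper, the plain Map standing in 
for the job configuration, and the default of 0 for unset properties are all 
assumptions made for illustration.  The sketch also reuses 
mapreduce.task.userlog.limit.kb for both container types, as the description 
does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the patch.
class AppenderChoice {
    // Property names as given in the comment above.
    static final String TASK_BACKUPS = "yarn.app.mapreduce.task.container.log.backups";
    static final String AM_BACKUPS = "yarn.app.mapreduce.am.container.log.backups";
    static final String LIMIT_KB = "mapreduce.task.userlog.limit.kb";

    // Reads a numeric property; treating an unset property as 0 is an assumption.
    static long getLong(Map<String, String> conf, String key) {
        String value = conf.get(key);
        return value == null ? 0L : Long.parseLong(value.trim());
    }

    // Returns "CRLA" (rolling appender) or "CLA" (default appender).
    static String chooseAppender(Map<String, String> conf, boolean isAppMaster) {
        long limitKb = getLong(conf, LIMIT_KB);
        long backups = getLong(conf, isAppMaster ? AM_BACKUPS : TASK_BACKUPS);
        return (limitKb > 0 && backups > 0) ? "CRLA" : "CLA";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(LIMIT_KB, "1024");
        conf.put(TASK_BACKUPS, "0");  // tasks keep the default appender
        conf.put(AM_BACKUPS, "10");   // AM rolls and keeps more history
        System.out.println("task: " + chooseAppender(conf, false));
        System.out.println("AM:   " + chooseAppender(conf, true));
    }
}
```

With the two backup counts kept as separate keys, tasks and the AM can land on 
different appenders under the same size limit.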

> Provide optional RollingFileAppender for container log4j (syslog)
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-5672
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5672
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>    Affects Versions: 2.2.0
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: MAPREDUCE-5672.v01.patch, MAPREDUCE-5672.v02.patch, 
> MAPREDUCE-5672.v03.patch, MAPREDUCE-5672.v04.patch, MAPREDUCE-5672.v05.patch, 
> MAPREDUCE-5672.v06.patch, MAPREDUCE-5672.v07.patch, Screen Shot 2013-12-05 at 
> 3.21.02 PM.png, Screen Shot 2013-12-05 at 3.23.33 PM.png
>
>
> This JIRA is an alternative take on YARN-1130.
> We propose providing an option of using a RollingFileAppender(RFA)-based 
> implementation of container log appender as a means of log size control via 
> mapreduce.task.userlog.limit.kb. 
> The idea is to use mapreduce.task.userlog.limit.kb as maximumFileSize of RFA. 
> In addition yarn.app.mapreduce.container.log.backups (task attempt 
> containers) and yarn.app.mapreduce.am.log.backups (MR-AM) are passed as 
> maxBackupIndex.
> Both current ContainerLogAppender (CLA) and new ContainerRollingLogAppender 
> (CRLA) co-exist. CLA is the default. CRLA is chosen when  
> mapreduce.task.userlog.limit.kb > 0 && *.backups > 0.
> Pros: 
> 1) CRLA output is visible in UI right away. CLA output with 
> mapreduce.task.userlog.limit.kb > 0 is not visible until the task attempt 
> finishes, which prevents timely diagnostics. 
> 2) Even with excessive logging and a large mapreduce.task.userlog.limit.kb, 
> no space is taken from the JVM heap.
> 3) No UI impact, since YARN is already designed to deal with any log name 
> beyond stderr/out, syslog, debug.out, profile.out
> Cons:
> 1) If the logging is excessive there will be more local filesystem metadata 
> I/O due to rolling. That should be negligible in the grand scheme.
> Furthermore, to improve log consistency and completeness in the case of JVM 
> crashes and SIGTERMing by NM, we propose to restore the MRv1 behavior of 
> periodic log syncing (every 5s) and having log sync as part of a shutdown 
> hook.
>  
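
The periodic sync and shutdown-hook behavior proposed at the end of the 
description can be sketched roughly as follows.  This is an illustration under 
stated assumptions, not the MRv1 or patch implementation: the class name and 
the use of java.io.Flushable as the sync target are hypothetical, and a real 
appender would flush its own underlying stream.

```java
import java.io.Flushable;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of "sync every N seconds and again at shutdown".
class PeriodicLogSync {
    private final Flushable target;
    private final ScheduledExecutorService scheduler;

    PeriodicLogSync(Flushable target) {
        this.target = target;
        // Daemon thread so the syncer never keeps the JVM alive by itself.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "log-sync");
            t.setDaemon(true);
            return t;
        });
    }

    // Best-effort flush; a failed sync must not take down the container.
    void syncNow() {
        try {
            target.flush();
        } catch (IOException e) {
            // Ignored: nothing useful can be done with a logging failure here.
        }
    }

    // Schedules the periodic sync and registers a final sync at JVM shutdown.
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(
            this::syncNow, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        Runtime.getRuntime().addShutdownHook(
            new Thread(this::syncNow, "log-sync-shutdown"));
    }

    void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        PeriodicLogSync sync = new PeriodicLogSync(System.out);
        sync.start(5);  // 5s period, matching the interval in the description
        System.out.println("log line that the shutdown hook will flush");
    }
}
```

JVM shutdown hooks run on SIGTERM but not on SIGKILL or a hard JVM crash, so 
in those cases only the periodic sync bounds how much log output can be lost.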



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
