Gera Shegalov created MAPREDUCE-5672:
----------------------------------------
Summary: Provide optional RollingFileAppender for container log4j (syslog)
Key: MAPREDUCE-5672
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5672
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mr-am, mrv2
Affects Versions: 2.2.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
This JIRA is an alternative take on YARN-1130.
We propose an option to use a RollingFileAppender (RFA)-based implementation
of the container log appender as a means of controlling log size via
mapreduce.task.userlog.limit.kb.
The idea is to use mapreduce.task.userlog.limit.kb as the maximumFileSize of
the RFA. In addition, yarn.app.mapreduce.container.log.backups (task attempt
containers) and yarn.app.mapreduce.am.log.backups (MR-AM) are passed as
maxBackupIndex.
The current ContainerLogAppender (CLA) and the new ContainerRollingLogAppender
(CRLA) co-exist. CLA is the default. CRLA is chosen when
mapreduce.task.userlog.limit.kb > 0 && *.backups > 0.
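The selection rule and the mapping onto the RFA settings can be sketched as
follows. This is only an illustration, not the patch itself: the class and
method names are hypothetical, 1 KB is assumed to be 1024 bytes, and the disk
bound is approximate because a rolling appender may overshoot the limit
slightly before it rolls.

```java
/** Hypothetical sketch of the proposed appender selection; not the actual patch. */
final class ContainerAppenderChoice {
    enum Kind { CLA, CRLA }

    /** CRLA is chosen only when both the size limit and the backup count are positive. */
    static Kind choose(long userlogLimitKb, int backups) {
        return (userlogLimitKb > 0 && backups > 0) ? Kind.CRLA : Kind.CLA;
    }

    /** Value that would be handed to the RFA as its maximum file size (assumes 1 KB = 1024 bytes). */
    static long maxFileSizeBytes(long userlogLimitKb) {
        return userlogLimitKb * 1024L;
    }

    /** Approximate disk bound: the active file plus up to 'backups' rolled files. */
    static long approxDiskBoundBytes(long userlogLimitKb, int backups) {
        return maxFileSizeBytes(userlogLimitKb) * (backups + 1L);
    }
}
```

For example, with a limit of 1024 KB and 3 backups, the syslog of one
container would be bounded by roughly 4 MB on local disk.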
Pros:
1) CRLA output is visible in the UI right away. CLA output with
mapreduce.task.userlog.limit.kb > 0 is not visible until the task attempt
finishes, which prevents timely diagnostics.
2) Even with excessive logging and a large mapreduce.task.userlog.limit.kb, no
space is taken from the JVM heap.
3) No UI impact, since YARN is already designed to deal with any log name
beyond stdout, stderr, syslog, debug.out, and profile.out.
Cons:
1) If logging is excessive, there will be more local filesystem metadata I/O
due to rolling. That should be negligible in the grand scheme of things.
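To make the metadata I/O concrete, here is a rough sketch of the file
operations one roll involves, modelled on how the log4j 1.x
RollingFileAppender is understood to rotate files (delete the oldest backup,
shift the rest up by one, then rename the active file). The class and method
names are hypothetical, and the list is the worst case in which every backup
already exists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the metadata operations in one roll. */
final class RollSketch {
    /** Lists worst-case operations for one roll; requires maxBackupIndex > 0, as CRLA does. */
    static List<String> rollOps(String base, int maxBackupIndex) {
        if (maxBackupIndex <= 0) {
            throw new IllegalArgumentException("maxBackupIndex must be > 0");
        }
        List<String> ops = new ArrayList<>();
        // The oldest backup is dropped to make room.
        ops.add("delete " + base + "." + maxBackupIndex);
        // Remaining backups shift up by one, highest index first.
        for (int i = maxBackupIndex - 1; i >= 1; i--) {
            ops.add("rename " + base + "." + i + " -> " + base + "." + (i + 1));
        }
        // The active file becomes the newest backup.
        ops.add("rename " + base + " -> " + base + ".1");
        return ops;
    }
}
```

Under these assumptions a roll costs maxBackupIndex + 1 metadata operations,
and it happens only once per mapreduce.task.userlog.limit.kb of output.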
Furthermore, to improve log consistency and completeness in the case of JVM
crashes and SIGTERM from the NM, we propose to restore the MRv1 behavior of
periodic log syncing (every 5s) and to run a log sync as part of a shutdown
hook.
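A minimal sketch of the syncing idea, using only the JDK: a daemon thread
flushes the log target at a fixed interval, and a shutdown hook performs one
last flush. The class name and the use of java.io.Flushable as the sync target
are assumptions for illustration; the actual patch may hook into the appenders
differently.

```java
import java.io.Flushable;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical periodic log syncer with a final sync on JVM shutdown. */
final class PeriodicLogSync {
    private final Flushable target;
    private final ScheduledExecutorService scheduler;

    private PeriodicLogSync(Flushable target) {
        this.target = target;
        // Daemon thread, so the syncer never keeps the JVM alive by itself.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "log-sync");
            t.setDaemon(true);
            return t;
        });
    }

    /** Starts periodic syncing and registers a shutdown hook for the last sync. */
    static PeriodicLogSync start(Flushable target, long period, TimeUnit unit) {
        PeriodicLogSync sync = new PeriodicLogSync(target);
        sync.scheduler.scheduleWithFixedDelay(sync::syncQuietly, period, period, unit);
        Runtime.getRuntime().addShutdownHook(new Thread(sync::syncQuietly, "log-sync-shutdown"));
        return sync;
    }

    /** Flushes the target; I/O errors are swallowed so syncing never kills the task. */
    void syncQuietly() {
        try {
            target.flush();
        } catch (IOException e) {
            // Best effort only: nothing useful can be done at this point.
        }
    }

    /** Stops the periodic task; the shutdown hook stays registered. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

A caller would use something like PeriodicLogSync.start(stream, 5,
TimeUnit.SECONDS). Shutdown hooks run on SIGTERM but not when the JVM crashes
or is killed with SIGKILL, so the periodic sync is what bounds the loss in
those cases to roughly one interval.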
--
This message was sent by Atlassian JIRA
(v6.1#6144)