[
https://issues.apache.org/jira/browse/TEZ-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715597#comment-14715597
]
Siddharth Seth commented on TEZ-2744:
-------------------------------------
h6. AM cleanup (remove / move to debug / consolidate into fewer logs)
Suppress RackResolver lines (this shows up multiple times for the same host
which may indicate that caching is not being used)
Task state transitions
TaskAttempt state transitions
TaskAttemptImpl - logging of TaskSpec in AM
TaskSchedulerEventHandler - Processing the event EventType:
S_TA_LAUNCH_REQUEST, and similar lines
AMContainerImpl - extremely chatty. Multiple messages per container
(launchRequest, assignment, etc).
ContainerLauncherImpl - Multiple lines for each container launch, instead of
just failures
TaskAttempt:attempt_1439860407967_0104_1_00_000011_0 sent events: (0-1) ... |
This is a useful message. Can be consolidated to log less often. Possibly
timed, or on every n events sent.
PerSourceNodeTracker - logging each node being added.
YarnTaskSchedulerService: Logs all task allocation requests. | Log stats and
pending requests occasionally including per priority info.
YarnTaskSchedulerService: Logs all assignments, with a long log line | Should
likely remain. Links tasks to containers, and is generally useful for
debugging. Could be shortened.
ContainerReporter asking for new task - is logged far too often. Every 2
seconds. That's 150 repeated log lines for a 5 minute idle session. Log only
once, or if containers can be informed that a session has completed, this can
log only while a dag is executing.
Too much info logged about each event that is being fetched - at least 2 lines
per source event, which can lead to large logs on wide jobs. This could be
consolidated.
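
The "timed, or on every n events" idea above could be sketched roughly as follows. This is a minimal illustration and not Tez code: the class name `LogThrottle` and its parameters are hypothetical, and the caller is assumed to pass in the current time and to do the actual logging.

```java
/**
 * Hypothetical helper (not part of Tez): decides whether a repetitive
 * message should be logged now, based on an event count or elapsed time.
 */
final class LogThrottle {
  private final long everyNEvents;      // log at least once per this many events
  private final long minIntervalMillis; // or once this much time has passed
  private long eventsSinceLastLog = 0;
  private long lastLogMillis = 0;
  private boolean loggedOnce = false;

  LogThrottle(long everyNEvents, long minIntervalMillis) {
    this.everyNEvents = everyNEvents;
    this.minIntervalMillis = minIntervalMillis;
  }

  /** Records one event and returns true if the caller should log now. */
  synchronized boolean shouldLog(long nowMillis) {
    eventsSinceLastLog++;
    boolean countReached = eventsSinceLastLog >= everyNEvents;
    boolean intervalElapsed = nowMillis - lastLogMillis >= minIntervalMillis;
    // Always log the first event so the message appears at least once.
    if (!loggedOnce || countReached || intervalElapsed) {
      loggedOnce = true;
      eventsSinceLastLog = 0;
      lastLogMillis = nowMillis;
      return true;
    }
    return false;
  }
}
```

A caller would wrap the chatty statement, for example `if (throttle.shouldLog(System.currentTimeMillis())) { LOG.info(...); }`, where `LOG` stands for whatever logger the component already uses.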
h6. AM improvements
Log scheduler stats occasionally, including the pending allocations, available
resources, containers being used, idle containers
Log Vertex specific TaskSpec once. Customized bits per task are in task logs -
including full CLC log line if possible
More to be added ...
h6. Task cleanup (remove / move to debug / consolidate)
MemoryDistributor logs quite a bit. Consolidate into fewer lines.
Far too many log lines stating Initializing Task, Processor, Input, with Spec |
Initialized | Creating | ... Consolidate
TaskSpec - consolidate to only include whatever has changed compared to the
Vertex level spec | This consolidation could be skipped, since it's a useful log line
TaskReporter logging of #events received in heartbeats - consolidate to be
timed / at an interval | or move to DEBUG
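
As a rough illustration of logging only what differs from the Vertex level spec: the sketch below models a spec as a plain `Map<String, String>`, which is an assumption made for this example and not how Tez represents a TaskSpec. The class `SpecDiff` and its method are hypothetical.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical helper: a spec is modelled as a simple key/value map. */
final class SpecDiff {
  /**
   * Returns the task-level entries that are new or differ from the
   * vertex-level spec, sorted by key for stable log output.
   */
  static Map<String, String> changedEntries(Map<String, String> vertexSpec,
                                            Map<String, String> taskSpec) {
    Map<String, String> changed = new TreeMap<>();
    for (Map.Entry<String, String> e : taskSpec.entrySet()) {
      String base = vertexSpec.get(e.getKey());
      if (!Objects.equals(base, e.getValue())) {
        changed.put(e.getKey(), e.getValue());
      }
    }
    return changed;
  }
}
```

The vertex-level map would be logged once, and each task would then log only the result of `changedEntries`.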
h6. Task improvements
Components to log significant configs, if they're not already doing this
MRPartitioner should log actual partitioner class being used
Include relevant Input / Output name in log lines (as part of thread which
exists for a lot of cases, or directly). e.g. input.MRInput: Using New
mapreduce API: false, split information via event: true
Localizing additional resources - only if there are new resources
Similarly for credentials
h6. General
To make parsing easier. Avoid characters used in the default log4j config line
in thread names, etc. e.g. ":" which is used in time, "|", "[", "]"
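
One way to keep those characters out of thread names is to sanitize the name before it is assigned. The sketch below is only an illustration: `ThreadNames.sanitize` is a hypothetical helper, and replacing with "_" is an arbitrary choice.

```java
/** Hypothetical helper: strips log-layout delimiters from thread names. */
final class ThreadNames {
  /** Replaces ':', '|', '[' and ']' with '_' and leaves other characters alone. */
  static String sanitize(String name) {
    StringBuilder sb = new StringBuilder(name.length());
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      switch (c) {
        case ':':
        case '|':
        case '[':
        case ']':
          sb.append('_');
          break;
        default:
          sb.append(c);
      }
    }
    return sb.toString();
  }
}
```

It could be applied wherever a thread is named, e.g. `thread.setName(ThreadNames.sanitize(rawName))`.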
h6. Other possible improvements
Support for changing the log level of the AM at runtime - via an external call
(webservice?)
Support dumping out a log for a running AM - which collects and logs
information which may otherwise not be logged.
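
The core of a runtime log level change is small; the external call (webservice or otherwise) would just need to reach something like the sketch below. To stay self-contained it uses `java.util.logging` from the JDK as a stand-in, so the level names are `java.util.logging` names (e.g. "FINE"), not log4j ones, and `RuntimeLogLevel` is a hypothetical class, not an existing Tez API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch using java.util.logging as a stand-in for the AM's logging. */
final class RuntimeLogLevel {
  /**
   * Sets the level of the named logger and returns it. Callers should keep
   * the returned reference, since the JDK holds loggers only weakly.
   */
  static Logger setLevel(String loggerName, String levelName) {
    Logger logger = Logger.getLogger(loggerName);
    logger.setLevel(Level.parse(levelName)); // throws IllegalArgumentException on unknown names
    return logger;
  }
}
```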
Comments?
> Logging cleanup and enhancements
> --------------------------------
>
> Key: TEZ-2744
> URL: https://issues.apache.org/jira/browse/TEZ-2744
> Project: Apache Tez
> Issue Type: Improvement
> Affects Versions: 0.5.0
> Reporter: Siddharth Seth
>
> There's far too much logging, some of which is redundant information - which
> leads to large log files for jobs.
> Also, some logging can be enhanced to include additional context.