[ 
https://issues.apache.org/jira/browse/TEZ-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715597#comment-14715597
 ] 

Siddharth Seth edited comment on TEZ-2744 at 8/26/15 10:00 PM:
---------------------------------------------------------------

h6. AM cleanup (remove / move to debug / consolidate into fewer logs)
Suppress RackResolver lines (this shows up multiple times for the same host 
which may indicate that caching is not being used)
Task state transitions
TaskAttempt state transitions
TaskAttemptImpl - logging of TaskSpec in AM
TaskSchedulerEventHandler - Processing the event EventType: 
S_TA_LAUNCH_REQUEST, and similar lines
AMContainerImpl - extremely chatty. Multiple messages per container 
(launchRequest, assignment, etc).
ContainerLauncherImpl - Multiple lines for each container launch, instead of 
just failures
TaskAttempt:attempt_1439860407967_0104_1_00_000011_0 sent events: (0-1) ... | 
This is a useful message. Can be consolidated to log less often. Possibly 
timed, or on every n events sent.
PerSourceNodeTracker - logging each node being added.
YarnTaskSchedulerService: Logs all task allocation requests. | Log stats and 
pending requests occasionally including per priority info.
YarnTaskSchedulerService: Logs all assignments, with a long log line | Should 
likely remain. Links tasks to containers, and is generally useful for 
debugging. Could be shortened.
ContainerReporter asking for new task - is logged far too often. Every 2 
seconds. That's 150 repeated log lines for a 5 minute idle session. Log only 
once, or if containers can be informed that a session has completed, this can 
log only while a dag is executing.
Too much info logged about each event that is being fetched - at least 2 lines 
per source event, which can lead to large logs on wide jobs. This could be 
consolidated.
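
As a rough illustration of the "timed, or on every n events sent" idea above, the sketch below batches per-event messages into one consolidated line. The class name ConsolidatedEventLog, its parameters, and the message text are all hypothetical and are not taken from the Tez code base. The clock and the sink are injected only so that the behaviour is easy to check.

```java
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Hypothetical helper (not a Tez class): one instance per task attempt. */
final class ConsolidatedEventLog {
  private final String label;          // e.g. the attempt id
  private final int everyN;            // emit once this many events are pending
  private final long intervalMillis;   // or once this much time has passed
  private final LongSupplier clock;    // millisecond clock, injectable for tests
  private final Consumer<String> sink; // where the consolidated line goes, e.g. LOG::info
  private int pending;
  private long lastEmitMillis;

  ConsolidatedEventLog(String label, int everyN, long intervalMillis,
                       LongSupplier clock, Consumer<String> sink) {
    this.label = label;
    this.everyN = everyN;
    this.intervalMillis = intervalMillis;
    this.clock = clock;
    this.sink = sink;
    this.lastEmitMillis = clock.getAsLong();
  }

  /** Records sent events; logs only when a count or time threshold is crossed. */
  synchronized void eventsSent(int count) {
    pending += count;
    long now = clock.getAsLong();
    if (pending >= everyN || now - lastEmitMillis >= intervalMillis) {
      sink.accept(label + " sent " + pending + " events since last report");
      pending = 0;
      lastEmitMillis = now;
    }
  }
}
```

In real use the clock would typically be System::currentTimeMillis and the sink the component's logger. Note that this sketch only emits when an event arrives, so a quiet attempt never logs on the timer alone.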

h6. AM improvements
Log scheduler stats occasionally, including the pending allocations, available 
resources, containers being used, idle containers
Log Vertex specific TaskSpec once. Customized bits per task are in task logs - 
including full CLC log line if possible
More to be added ... 
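
A possible shape for the occasional scheduler stats line is sketched below. The class SchedulerStatsLine, the field names, and the choice of memory in MB as the "available resources" figure are assumptions made for illustration and do not reflect the actual YarnTaskSchedulerService code. The line deliberately uses only "=" and "{}" as separators, so it stays clear of the characters listed under General.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical formatter (not a Tez class) for a single periodic stats line. */
final class SchedulerStatsLine {
  private SchedulerStatsLine() {}

  /**
   * Builds one line with total pending requests, a per-priority breakdown,
   * available memory, containers in use, and idle containers.
   */
  static String format(Map<Integer, Integer> pendingByPriority,
                       long availableMemoryMb,
                       int containersInUse,
                       int idleContainers) {
    // Sort by priority so successive lines are easy to compare.
    Map<Integer, Integer> sorted = new TreeMap<>(pendingByPriority);
    int totalPending = 0;
    for (int count : sorted.values()) {
      totalPending += count;
    }
    return "SchedulerStats pending=" + totalPending
        + " pendingByPriority=" + sorted
        + " availableMemoryMB=" + availableMemoryMb
        + " containersInUse=" + containersInUse
        + " idleContainers=" + idleContainers;
  }
}
```

Such a method could be called from a timer, or every n allocations, instead of logging each allocation request individually.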

h6. Task cleanup (remove / move to debug / consolidate)
MemoryDistributor logs quite a bit. Consolidate into fewer lines.
Far too many log lines stating Initializing Task, Processor, Input, with Spec | 
Initialized | Creating | ... Consolidate
TaskSpec - consolidate to only include whatever has changed compared to the 
Vertex level spec | This consolidation could be skipped, since it's a useful log line
TaskReporter logging of #events received in heartbeats - consolidate to be 
timed / at an interval | or move to DEBUG


h6. Task improvements
Components to log significant configs, if they're not already doing this
MRPartitioner should log actual partitioner class being used
Include relevant Input / Output name in log lines (as part of thread which 
exists for a lot of cases, or directly). e.g. input.MRInput: Using New 
mapreduce API: false, split information via event: true



Localizing additional resources - only if there are new resources
Similarly for credentials


h6. General
To make parsing easier, avoid characters used in the default log4j config line 
in thread names, etc. e.g. ":" which is used in the timestamp, "|", "[", "]"
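
One way to apply this is to sanitize names before they are used as thread names. The helper below is a minimal sketch: the class name LogSafeNames and the choice of "_" as the replacement character are assumptions, not anything Tez currently does.

```java
/** Hypothetical helper (not a Tez class) for building log-friendly thread names. */
final class LogSafeNames {
  private LogSafeNames() {}

  /** Replaces ':', '|', '[' and ']' with '_' and leaves everything else as-is. */
  static String sanitize(String name) {
    StringBuilder sb = new StringBuilder(name.length());
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      if (c == ':' || c == '|' || c == '[' || c == ']') {
        sb.append('_');
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }
}
```

It could be used as Thread.currentThread().setName(LogSafeNames.sanitize(rawName)), where rawName is whatever name the component would otherwise have set.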

h6. Other possible improvements
Support for changing the log level of the AM at runtime - via an external call 
(webservice?)
Support dumping out a log for a running AM - which collects and logs 
information which may otherwise not be logged.

DEBUG level logging still needs to be looked at, but the default configuration 
when enabling debug should typically turn off hadoop IPC logging and HDFS 
logging, both of which are extremely noisy.
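
As a rough sketch of the effect described above, assuming log4j 1.x properties syntax and assuming the root logger has been set to DEBUG elsewhere, the two noisy packages could be pinned at INFO. The choice of INFO as the level is an assumption.

```properties
# Assumption: the root logger is already at DEBUG; keep these two packages quieter.
log4j.logger.org.apache.hadoop.ipc=INFO
log4j.logger.org.apache.hadoop.hdfs=INFO
```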

Comments ?


> Logging cleanup and enhancements
> --------------------------------
>
>                 Key: TEZ-2744
>                 URL: https://issues.apache.org/jira/browse/TEZ-2744
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.5.0
>            Reporter: Siddharth Seth
>
> There's far too much logging, some of which is redundant information - which 
> leads to large log files for jobs.
> Also, some logging can be enhanced to include additional context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
