[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082512#comment-17082512
 ] 

Eric Payne commented on MAPREDUCE-7272:
---------------------------------------

+1. [~ahussein], thanks for providing fixes for this problem.
[~ebadger], do you want me to commit this?

> TaskAttemptListenerImpl excessive log messages
> ----------------------------------------------
>
>                 Key: MAPREDUCE-7272
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7272
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Major
>         Attachments: MAPREDUCE-7272-branch-2.10.001.patch, 
> MAPREDUCE-7272-branch-2.10.002.patch, MAPREDUCE-7272-branch-2.10.003.patch, 
> MAPREDUCE-7272-branch-2.10.004.patch, MAPREDUCE-7272.001.patch, 
> MAPREDUCE-7272.002.patch, MAPREDUCE-7272.003.patch, MAPREDUCE-7272.004.patch
>
>
> {{TaskAttemptListenerImpl.statusUpdate()}} causes a bloating in log files. 
> One every call, the listener uses {{LOG.info()}} to printout the progress of 
> the {{TaskAttempt}}.
> {code:java}
> taskAttemptStatus.progress = taskStatus.getProgress();
>     LOG.info("Progress of TaskAttempt " + taskAttemptID + " is : "
>         + taskStatus.getProgress());
> {code}
>  
> {code:bash}
> 2020-04-07 10:20:50,708 INFO [IPC Server handler 17 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_007783_0 is : 0.40713295
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 7 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_020681_0 is : 0.55573714
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 26 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_024371_0 is : 0.54190344
> 2020-04-07 10:20:50,738 INFO [IPC Server handler 15 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_033182_0 is : 0.50264555
> 2020-04-07 10:20:50,748 INFO [IPC Server handler 3 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_022375_0 is : 0.5495565
> {code}
> After discussing this issue with [~nroberts], [~ebadger], and [~epayne], we 
> thought that while it is helpful to have a log print of task progress, it is 
> still excessive to log the progress in every update.
>  This Jira is to suppress the excessive logging from TaskAttemptListener 
> without affecting the frequency of progress updates. 
>  There are two flags:
>  * {{-Dmapreduce.task.log.progress.delta.threshold=0.10}}: means that the 
> task progress will be logged every 10% of delta progress. Default is 5%.
>  * {{-Dmapreduce.task.log.progress.wait.interval-seconds=120}}: means that if 
> the listener will log the progress every 2 minutes. This is helpful for long 
> running tasks that take long time to achieve the delta threshold. Default is 
> 1 minute.
> The listener will long whichever of {{delta.threshold}} and 
> {{wait.interval-seconds}} is reached first. 
>    Enabling {{LOG.DEBUG}} for  {{TaskAttemptListenerImpl}} will override 
> those two flags and log the task progress on every update.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org

Reply via email to