[ https://issues.apache.org/jira/browse/MAPREDUCE-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414189#comment-15414189 ]
Karthik Kambatla commented on MAPREDUCE-6242:
---------------------------------------------
Was just made aware of this, by way of MAPREDUCE-6740. Thanks for adding this
config.
Thinking more about it, I wonder if the default should be a fraction of
mapreduce.task.timeout, say a tenth. With the default timeout of 10 minutes,
long-running tasks would then report progress every minute. Users could
change just one config without thinking about the other. What do others think?
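
To make the proposal concrete, here is a minimal sketch of how such a derived
default could be resolved. It is not the committed patch. It uses a plain Map
in place of Hadoop's Configuration so that it runs standalone. The key
mapreduce.task.timeout and its 10-minute default are existing Hadoop settings;
the progress-interval key name, the ratio constant, and the class and method
names are assumptions made for illustration. The 3-second fallback is the old
hard-coded value quoted in the issue description.

```java
import java.util.HashMap;
import java.util.Map;

class ProgressIntervalSketch {
    // Existing Hadoop setting; its default is 600000 ms (10 minutes).
    static final String TASK_TIMEOUT_KEY = "mapreduce.task.timeout";
    static final long DEFAULT_TASK_TIMEOUT_MS = 600_000L;

    // Assumed name for the config added by this issue.
    static final String PROGRESS_INTERVAL_KEY =
        "mapreduce.task.progress-report.interval";

    // Hypothetical ratio: report progress ten times per timeout window.
    static final long TIMEOUT_TO_INTERVAL_RATIO = 10L;

    // The previously hard-coded interval, used when no timeout is in effect.
    static final long LEGACY_INTERVAL_MS = 3_000L;

    /** Returns the progress-report interval in milliseconds. */
    static long resolveIntervalMs(Map<String, String> conf) {
        // An explicitly configured interval always wins.
        String explicit = conf.get(PROGRESS_INTERVAL_KEY);
        if (explicit != null) {
            return Long.parseLong(explicit.trim());
        }
        String timeout = conf.get(TASK_TIMEOUT_KEY);
        long timeoutMs = (timeout == null)
            ? DEFAULT_TASK_TIMEOUT_MS
            : Long.parseLong(timeout.trim());
        // A timeout of 0 disables the task timeout, so there is nothing to
        // derive from; fall back to the legacy 3-second interval.
        if (timeoutMs <= 0) {
            return LEGACY_INTERVAL_MS;
        }
        return timeoutMs / TIMEOUT_TO_INTERVAL_RATIO;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolveIntervalMs(conf)); // default timeout

        conf.put(TASK_TIMEOUT_KEY, "1200000");       // user raises the timeout
        System.out.println(resolveIntervalMs(conf)); // interval follows it

        conf.put(PROGRESS_INTERVAL_KEY, "3000");     // explicit override
        System.out.println(resolveIntervalMs(conf));
    }
}
```

With this shape, a user who only raises the timeout gets a proportionally
longer reporting interval, while an explicitly set interval is left alone.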
> Progress report log is incredibly excessive in application master
> -----------------------------------------------------------------
>
> Key: MAPREDUCE-6242
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6242
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster
> Affects Versions: 2.4.0
> Reporter: Jian Fang
> Assignee: Varun Saxena
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6242.001.patch, MAPREDUCE-6242.002.patch,
> MAPREDUCE-6242.003.patch, MAPREDUCE-6242.branch-2.patch
>
>
> We saw excessive logging in the application master for a long-running job
> with many task attempts. The log write rate was around 1 MB/sec in some
> cases. Most of the log entries came from progress reports such as the
> following:
> 2015-02-03 17:46:14,321 INFO [IPC Server handler 56 on 37661]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1422985365246_0001_m_000000_0 is : 0.15605757
> 2015-02-03 17:46:17,581 INFO [IPC Server handler 2 on 37661]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1422985365246_0001_m_000000_0 is : 0.4108217
> 2015-02-03 17:46:20,426 INFO [IPC Server handler 0 on 37661]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1422985365246_0001_m_000002_0 is : 0.06634143
> 2015-02-03 17:46:20,807 INFO [IPC Server handler 4 on 37661]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1422985365246_0001_m_000000_0 is : 0.55556506
> 2015-02-03 17:46:21,013 INFO [IPC Server handler 6 on 37661]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1422985365246_0001_m_000001_0 is : 0.21723115
> It looks like the report interval is controlled by a hard-coded constant,
> PROGRESS_INTERVAL, set to 3 seconds in class org.apache.hadoop.mapred.Task.
> We should allow users to set an appropriate progress interval for their
> applications.