[ 
https://issues.apache.org/jira/browse/HADOOP-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496097
 ] 

Devaraj Das commented on HADOOP-1201:
-------------------------------------

Makes sense. Maybe HADOOP-1235 should be merged with this issue.

> Progress reporting can be improved for both Map/Reduce tasks
> ------------------------------------------------------------
>
>                 Key: HADOOP-1201
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1201
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>
> Both the map and reduce tasks do progress reporting in separate threads. 
> However, in the ReduceTask, after the sort phase, the progress reporting 
> happens inline with the reducer invocations. This slows down the Reduce phase 
> because every progress report involves an RPC. A better approach would be to 
> do progress reporting for all phases in separate threads and have the tasks 
> only update the progress fields.
> One proposal is to extract the reporting code that is currently in 
> MapTask/ReduceTask, put it in the Task superclass as a new class, and give 
> that new class methods that control what is reported and when. 
> Thoughts?
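
The idea in the description above (tasks only update a progress field, and a separate thread does the reporting) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Hadoop code: the names ProgressSketch, ProgressSink and BackgroundReporter are hypothetical, and ProgressSink stands in for whatever RPC the task really uses to report to its parent.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressSketch {

    /** Hypothetical stand-in for the RPC call that delivers a progress report. */
    public interface ProgressSink {
        void report(float progress);
    }

    /** Hypothetical helper that a Task superclass could own. */
    public static final class BackgroundReporter implements AutoCloseable {
        // Written by the task thread, read by the reporter thread.
        private volatile float progress;
        private volatile boolean running = true;
        private final Thread thread;

        public BackgroundReporter(ProgressSink sink, long intervalMillis) {
            thread = new Thread(() -> {
                while (running) {
                    try {
                        Thread.sleep(intervalMillis);
                    } catch (InterruptedException e) {
                        break; // close() was called
                    }
                    sink.report(progress); // the only place the "RPC" happens
                }
                sink.report(progress); // final report so the last value is not lost
            }, "progress-reporter");
            thread.setDaemon(true);
        }

        public void start() {
            thread.start();
        }

        /** Cheap call for the map/reduce loop: a field write, no RPC. */
        public void setProgress(float value) {
            progress = value;
        }

        @Override
        public void close() throws InterruptedException {
            running = false;
            thread.interrupt();
            thread.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger reports = new AtomicInteger();
        BackgroundReporter reporter =
            new BackgroundReporter(p -> reports.incrementAndGet(), 50);
        reporter.start();

        int records = 1_000_000;
        for (int i = 1; i <= records; i++) {
            // A real reducer would process a record here.
            reporter.setProgress((float) i / records);
        }
        reporter.close();

        System.out.println("records=" + records + " reports=" + reports.get());
    }
}
```

With this shape the number of reports depends on the reporting interval rather than on the number of reducer invocations, which is the point of the proposal. Whether the helper should live in Task itself or in a separate class is left open here.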

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
