[ 
https://issues.apache.org/jira/browse/HADOOP-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496070
 ] 

Owen O'Malley commented on HADOOP-1201:
---------------------------------------

I think an interesting way of doing it would be to have the "ping" thread 
send status instead. The Reporter would update the values in memory, and 
the ping thread would send them up once a second. The status would include a 
counter that is incremented whenever the application calls progress. The 
task tracker would then compare the counter across successive status updates 
to determine whether the application is making progress.
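
A minimal sketch of what this could look like, not the actual Hadoop code. 
The class names (StatusSnapshot, InMemoryReporter, StatusPinger, 
ProgressDetector) are hypothetical, and a plain Consumer stands in for the 
RPC call to the task tracker:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Hypothetical status payload: the progress value plus the call counter. */
final class StatusSnapshot {
    final float progress;
    final long progressCalls;

    StatusSnapshot(float progress, long progressCalls) {
        this.progress = progress;
        this.progressCalls = progressCalls;
    }
}

/** Task side: the application only updates memory and never does RPC. */
final class InMemoryReporter {
    private volatile float progress;
    private final AtomicLong progressCalls = new AtomicLong();

    void setProgress(float value) { progress = value; }

    /** Called by the application to signal that it is still alive. */
    void progress() { progressCalls.incrementAndGet(); }

    StatusSnapshot snapshot() {
        return new StatusSnapshot(progress, progressCalls.get());
    }
}

/** The "ping" thread: sends the current in-memory status once a second. */
final class StatusPinger {
    private final InMemoryReporter reporter;
    private final Consumer<StatusSnapshot> sink; // stand-in for the RPC (assumption)
    private ScheduledExecutorService timer;

    StatusPinger(InMemoryReporter reporter, Consumer<StatusSnapshot> sink) {
        this.reporter = reporter;
        this.sink = sink;
    }

    void pingOnce() { sink.accept(reporter.snapshot()); }

    void start() {
        timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::pingOnce, 1, 1, TimeUnit.SECONDS);
    }

    void stop() {
        if (timer != null) {
            timer.shutdownNow();
        }
    }
}

/** Task tracker side: progress means the counter changed since the last status. */
final class ProgressDetector {
    private long lastSeen = 0;

    boolean madeProgress(StatusSnapshot status) {
        boolean changed = status.progressCalls != lastSeen;
        lastSeen = status.progressCalls;
        return changed;
    }
}
```

With this shape, the cost of calling progress() inside the reduce loop is a 
single atomic increment, and the number of RPCs is bounded by the ping 
interval rather than by the number of records.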

> Progress reporting can be improved for both Map/Reduce tasks
> ------------------------------------------------------------
>
>                 Key: HADOOP-1201
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1201
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>
> Both the map and reduce tasks do progress reporting in separate threads. 
> However, in the ReduceTask, after the sort phase, the progress reporting 
> happens inline with the reducer invocations. This slows down the Reduce phase 
> since RPC is involved for every progress report. The better thing to do would 
> be to do progress reporting for all phases in separate threads and have the 
> tasks just update the progress fields.
> One proposal is to extract out the reporting stuff that is there in 
> MapTask/ReduceTask and put it in the Task superclass as a new class, and have 
> methods in the new class that control what/when progress is reported. 
> Thoughts?
