[jira] Commented: (HADOOP-248) locating map outputs via random probing is inefficient

Owen O'Malley (JIRA) Wed, 17 Jan 2007 11:20:51 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465511 ]


Owen O'Malley commented on HADOOP-248:
--------------------------------------

Sounds good, Devaraj.

The events are per taskid, not per tipid, so different attempts to run "map 0" 
would result in different events.

That said, we should probably add another event type, "lost" or something 
similar, for tasks that are lost because their output had problems or their 
task tracker was lost.

We may also want to flag the "complete" events of lost tasks as obsolete so 
that reduces don't see them and try to fetch their outputs.
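
To make the idea concrete, here is a minimal sketch of a per-taskid event log 
with a "lost" event and an obsolete flag. Everything in it is an assumption 
made for illustration: the class TaskEventLog, its method names, and the 
taskid strings are hypothetical and are not existing Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a per-attempt event log; not Hadoop's actual API. */
final class TaskEventLog {
    enum Status { COMPLETE, LOST }

    static final class TaskEvent {
        final String taskId;   // id of one attempt (taskid), not of the tip
        final Status status;
        boolean obsolete;      // set when a later LOST event invalidates this event

        TaskEvent(String taskId, Status status) {
            this.taskId = taskId;
            this.status = status;
        }
    }

    // Events are kept in arrival order.
    private final List<TaskEvent> events = new ArrayList<>();

    /** Records that one attempt finished and its output is available. */
    synchronized void recordComplete(String taskId) {
        events.add(new TaskEvent(taskId, Status.COMPLETE));
    }

    /** Records a lost attempt and flags its earlier COMPLETE events as obsolete. */
    synchronized void recordLost(String taskId) {
        for (TaskEvent e : events) {
            if (e.taskId.equals(taskId) && e.status == Status.COMPLETE) {
                e.obsolete = true;
            }
        }
        events.add(new TaskEvent(taskId, Status.LOST));
    }

    /** Returns the attempt ids whose outputs a reduce may still fetch. */
    synchronized List<String> fetchableTaskIds() {
        List<String> ids = new ArrayList<>();
        for (TaskEvent e : events) {
            if (e.status == Status.COMPLETE && !e.obsolete) {
                ids.add(e.taskId);
            }
        }
        return ids;
    }
}
```

Because events are keyed by taskid, a second attempt at "map 0" gets its own 
"complete" event, and losing the first attempt does not affect it.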

> locating map outputs via random probing is inefficient
> ------------------------------------------------------
>
>                 Key: HADOOP-248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-248
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.2.1
>            Reporter: Owen O'Malley
>         Assigned To: Devaraj Das
>
> Currently the ReduceTaskRunner polls the JobTracker for a random list of map 
> tasks asking for their output locations. It would be better if the JobTracker 
> kept an ordered log and the interface was changed to:
> class MapLocationResults {
>    public int getTimestamp();
>    public MapOutputLocation[] getLocations();
> }
> interface InterTrackerProtocol {
>   ...
>   MapLocationResults locateMapOutputs(int prevTimestamp);
> } 
> with the intention that each time a ReduceTaskRunner calls locateMapOutputs, 
> it passes back the "timestamp" that it got from the previous result. That 
> way, reduces can easily find the new MapOutputs. This should help the "ramp 
> up" when the maps first start finishing.
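
The polling scheme described in the issue could be sketched roughly as 
follows. This is only an illustration under stated assumptions: the 
"timestamp" is taken to be an index into the ordered log, locations are plain 
strings rather than MapOutputLocation objects, and the names MapOutputLog, 
Results, and addLocation are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the ordered log; not the actual JobTracker code. */
final class MapOutputLog {
    /** Result of one poll: the new locations plus the cursor to pass back next time. */
    static final class Results {
        final int timestamp;
        final List<String> locations;

        Results(int timestamp, List<String> locations) {
            this.timestamp = timestamp;
            this.locations = locations;
        }
    }

    // Append-only log of map output locations, in completion order.
    private final List<String> log = new ArrayList<>();

    /** Appends the location of a newly finished map output. */
    synchronized void addLocation(String location) {
        log.add(location);
    }

    /** Returns every location logged since prevTimestamp (assumed to be a log index). */
    synchronized Results locateMapOutputs(int prevTimestamp) {
        // Clamp the cursor so an out-of-range value cannot cause an exception.
        int from = Math.max(0, Math.min(prevTimestamp, log.size()));
        List<String> fresh = new ArrayList<>(log.subList(from, log.size()));
        return new Results(log.size(), fresh);
    }
}
```

A reduce would start with a timestamp of 0 and pass back the timestamp from 
each result on its next call, so every poll returns only the outputs it has 
not yet seen.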

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
