[jira] Commented: (HADOOP-76) Implement speculative re-execution of reduces

Sanjay Dahiya (JIRA) Wed, 11 Oct 2006 06:37:03 -0700

    [ http://issues.apache.org/jira/browse/HADOOP-76?page=comments#action_12441456 ]
Sanjay Dahiya commented on HADOOP-76:
-------------------------------------


Here is a list of code-level changes; I will test them in the meantime.

- Add an extra JobConf configuration option, runSpeculativeReduces.

- TaskInProgress maintains a list of nodes where it has already run (or is 
running). This will be used to avoid scheduling a speculative instance on a 
node where the task is already running or has failed in the past. [TIP already 
contains a list of nodes where its task failed.]

- Another option: if *any* reduce task is already assigned to this TT and is 
still running, then the TT is not assigned a speculative task. [comments?]

- TIP.hasSpeculativeTask() now checks for reduce tasks as well; currently it 
checks only map tasks. The exact conditions (timeouts) under which a reduce 
task should be executed speculatively are open for discussion. Until then I am 
using Johan's condition (finishedReduces / numReduceTasks >= 0.7) for testing.

- JobInProgress.findNewTask looks for speculative tasks 
(TIP.hasSpeculativeTask()) and checks whether the task already ran on the same 
task tracker.

- If speculative execution of reduces is enabled, ReduceTask.run() creates a 
temp file name for the reduce output. When the reduce task finishes, it checks 
whether the output file has already been written by some other reduce 
instance. If not, it renames its temp output to the final output; otherwise 
the temp output is deleted.

- TaskTracker.TIP.cleanup() also cleans up the reduce task's temp file if the 
task is killed midway.

- JobTracker.pollForTaskWithClosedJob() and TIP.shouldCloseForClosedJob() 
return true if a speculative reduce task finished first, which ultimately 
propagates down to the TT and kills/cleans up the task.
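
To make the scheduling checks above concrete, here is a rough Java sketch of 
how the eligibility test might be combined. It is not the code from the 
patch: the class name, the method names, and the use of tracker-name strings 
are all assumptions made for illustration. Only the 0.7 threshold and the idea 
of remembering the nodes a task has already been tried on come from the list 
above.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; not the actual TaskInProgress code from the patch. */
public class SpeculativeReduceSketch {
    // Threshold from the testing condition quoted above.
    static final double SPECULATIVE_THRESHOLD = 0.7;

    // Task trackers where this task is running, has run, or has failed
    // (assumed here to be identified by name).
    private final Set<String> triedTrackers = new HashSet<>();

    /** Remember that an instance of this task was placed on the tracker. */
    void recordAttempt(String trackerName) {
        triedTrackers.add(trackerName);
    }

    /** finishedReduces / numReduceTasks >= 0.7, computed in floating point. */
    static boolean enoughReducesFinished(int finishedReduces, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            return false; // nothing to speculate on
        }
        // Cast first: plain int division would truncate the ratio to 0 or 1.
        return (double) finishedReduces / numReduceTasks >= SPECULATIVE_THRESHOLD;
    }

    /** True if a speculative instance may be scheduled on the given tracker. */
    boolean canSpeculateOn(String trackerName, boolean speculativeEnabled,
                           int finishedReduces, int numReduceTasks) {
        return speculativeEnabled
            && enoughReducesFinished(finishedReduces, numReduceTasks)
            && !triedTrackers.contains(trackerName);
    }
}
```

One detail worth noting is the cast to double: if both counters are ints, the 
ratio has to be computed in floating point or the comparison with 0.7 only 
succeeds once every reduce has finished.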

comments? 

> Implement speculative re-execution of reduces
> ---------------------------------------------
>
>                 Key: HADOOP-76
>                 URL: http://issues.apache.org/jira/browse/HADOOP-76
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.1.0
>            Reporter: Doug Cutting
>         Assigned To: Sanjay Dahiya
>            Priority: Minor
>         Attachments: spec_reducev.patch
>
>
> As a first step, reduce task outputs should go to temporary files which are 
> renamed when the task completes.
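
The temp-file step in the issue description, together with the 
rename-or-delete behaviour listed in the comment, could look roughly like the 
sketch below. It uses java.nio.file on a local file system purely as a 
stand-in; the real change would go through Hadoop's own file system layer, and 
the class and method names here are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of committing one reduce attempt's output. */
public class ReduceOutputCommitSketch {

    /**
     * Promote tempOutput to finalOutput unless another instance of the same
     * reduce got there first. Returns true if this attempt's output won.
     */
    static boolean commit(Path tempOutput, Path finalOutput) throws IOException {
        if (Files.exists(finalOutput)) {
            // Another instance already produced the final output.
            Files.deleteIfExists(tempOutput);
            return false;
        }
        try {
            // Without REPLACE_EXISTING, move fails if the target exists,
            // which covers a competing instance finishing after the check.
            Files.move(tempOutput, finalOutput);
            return true;
        } catch (FileAlreadyExistsException e) {
            Files.deleteIfExists(tempOutput);
            return false;
        }
    }
}
```

The first attempt to call commit() keeps its output; any later attempt for the 
same reduce finds the final file in place and discards its temp file.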

