[ 
https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Konwinski updated HADOOP-2141:
-----------------------------------

    Affects Version/s:     (was: 0.15.0)
                       0.19.0
               Status: Patch Available  (was: Open)

This patch implements a new algorithm for speculative execution. 

When a TaskTracker asks for another task, a speculative task may be assigned 
only if no task remains that has not been tried at least once and no failed 
task needs to be re-run. There are three configurable thresholds: 
SpeculativeCap, SlowNodeThreshold, and SlowTaskThreshold. A speculative task 
is chosen with the following algorithm:

  1) Check that fewer than SpeculativeCap speculative tasks are running
  2) Ignore the request if the TaskTracker's progressRate percentile is < 
SlowNodeThreshold
  3) Rank currently running, non-speculative tasks by their estimated time left
  4) Choose the highest-ranked task whose progressRate < SlowTaskThreshold
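
The four steps above can be sketched roughly as follows. This is an 
illustrative Python sketch, not code from the attached patch: the RunningTask 
class, the pick_speculative_task function, and all parameter names are 
hypothetical. It assumes SpeculativeCap is an absolute count, that the 
TaskTracker's progressRate percentile is computed elsewhere and passed in, 
that progressRate is progress divided by elapsed time, and that "estimated 
time left" is remaining progress divided by progressRate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningTask:
    """Hypothetical stand-in for a running task attempt."""
    task_id: str
    progress: float               # fraction complete, 0.0 to 1.0
    elapsed_secs: float           # seconds since the attempt started
    is_speculative: bool = False  # True if this attempt is a speculative copy

    @property
    def progress_rate(self) -> float:
        # Assumed definition: progress made per second of run time.
        return self.progress / self.elapsed_secs if self.elapsed_secs > 0 else 0.0

    @property
    def est_time_left(self) -> float:
        # Assumed definition: remaining progress at the current rate.
        rate = self.progress_rate
        return float("inf") if rate == 0 else (1.0 - self.progress) / rate


def pick_speculative_task(
    running: List[RunningTask],
    tracker_percentile: float,    # requesting tracker's progressRate percentile
    speculative_cap: int,         # SpeculativeCap (assumed to be a count)
    slow_node_threshold: float,   # SlowNodeThreshold (a percentile)
    slow_task_threshold: float,   # SlowTaskThreshold (assumed to be a rate)
) -> Optional[RunningTask]:
    # Step 1: stay under the cap on concurrently running speculative tasks.
    if sum(1 for t in running if t.is_speculative) >= speculative_cap:
        return None
    # Step 2: do not hand speculative work to a slow node.
    if tracker_percentile < slow_node_threshold:
        return None
    # Step 3: rank non-speculative tasks, longest estimated time left first.
    ranked = sorted(
        (t for t in running if not t.is_speculative),
        key=lambda t: t.est_time_left,
        reverse=True,
    )
    # Step 4: take the highest-ranked task that is also progressing slowly.
    for task in ranked:
        if task.progress_rate < slow_task_threshold:
            return task
    return None
```

Under these assumptions, a task stuck at 95% for a long time has a low 
progressRate and can be selected even though it is not behind the average 
progress, which is the case described in the original report.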


> speculative execution start up condition based on completion time
> -----------------------------------------------------------------
>
>                 Key: HADOOP-2141
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2141
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Koji Noguchi
>            Assignee: Andy Konwinski
>         Attachments: HADOOP-2141.patch
>
>
> We had one job with speculative execution hang.
> 4 reduce tasks were stuck with 95% completion because of a bad disk. 
> Devaraj pointed out 
> bq. One of the conditions that must be met for launching a speculative 
> instance of a task is that it must be at least 20% behind the average 
> progress, and this is not true here.
> It would be nice if speculative execution also starts up when tasks stop 
> making progress.
> Devaraj suggested 
> bq. Maybe, we should introduce a condition for average completion time for 
> tasks in the speculative execution check. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
