Support pluggable speculative execution
---------------------------------------

                 Key: HADOOP-3840
                 URL: https://issues.apache.org/jira/browse/HADOOP-3840
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
            Reporter: Matei Zaharia
            Priority: Minor


HADOOP-3412 introduced a way to plug in a job scheduler for MapReduce. 
However, the job schedulers all use JobInProgress.obtainNewMapTask or 
obtainNewReduceTask to select tasks to run from each job, and these use a 
threshold-based speculative execution algorithm that has several shortcomings 
(see, for example, the JIRAs about the scheduler not speculating tasks that 
freeze after reaching 80% progress). As a first step towards supporting better 
speculative execution policies while not breaking backwards compatibility, it 
makes sense to make the speculative execution policy pluggable. Luckily this is 
easy - we just need an interface around obtainNewMapTask and 
obtainNewReduceTask. This JIRA suggests adding a TaskSelector abstract class 
which, given a TaskTracker and a JobInProgress, chooses a task to run on the 
tracker. A default implementation that uses the current methods in 
JobInProgress is provided. Both TaskSchedulers in trunk are changed to use 
TaskSelector.
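
As a rough illustration of the proposal above, the sketch below shows one 
possible shape for a TaskSelector abstract class and a default implementation 
that delegates to the existing JobInProgress methods. It is not taken from the 
patch. The Task, TaskTrackerStatus and JobInProgress classes here are 
hypothetical, heavily simplified stand-ins so that the example compiles on its 
own, and the method signatures are assumptions (the real obtainNewMapTask and 
obtainNewReduceTask take more arguments).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-ins for the Hadoop classes, reduced to what the sketch
// needs. They are NOT the real Hadoop types.
final class Task {
    final String id;
    Task(String id) { this.id = id; }
}

final class TaskTrackerStatus {
    final String name;
    TaskTrackerStatus(String name) { this.name = name; }
}

class JobInProgress {
    private final Deque<Task> pendingMaps = new ArrayDeque<>();
    private final Deque<Task> pendingReduces = new ArrayDeque<>();

    void addMap(Task t) { pendingMaps.add(t); }
    void addReduce(Task t) { pendingReduces.add(t); }

    // Simplified stand-ins for the current selection methods: each hands out
    // the next pending task, or null if none is left.
    Task obtainNewMapTask(TaskTrackerStatus tracker) { return pendingMaps.poll(); }
    Task obtainNewReduceTask(TaskTrackerStatus tracker) { return pendingReduces.poll(); }
}

// Assumed shape of the pluggable policy: given a tracker and a job, choose a
// task to run on that tracker, or return null if nothing should be launched.
abstract class TaskSelector {
    abstract Task obtainNewMapTask(TaskTrackerStatus tracker, JobInProgress job);
    abstract Task obtainNewReduceTask(TaskTrackerStatus tracker, JobInProgress job);
}

// Default policy: keep today's behaviour by delegating to JobInProgress.
class DefaultTaskSelector extends TaskSelector {
    @Override
    Task obtainNewMapTask(TaskTrackerStatus tracker, JobInProgress job) {
        return job.obtainNewMapTask(tracker);
    }

    @Override
    Task obtainNewReduceTask(TaskTrackerStatus tracker, JobInProgress job) {
        return job.obtainNewReduceTask(tracker);
    }
}
```

With this shape, a scheduler would call selector.obtainNewMapTask(tracker, job) 
instead of job.obtainNewMapTask(...), so an alternative speculation policy can 
be swapped in without touching the scheduler itself.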

In addition, there are methods to count how many speculative tasks a job needs, 
since TaskInProgress.hasSpeculative() may not work if we change the algorithm 
for selecting speculative tasks. This count is needed for some schedulers, such 
as a fair scheduler.
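
To make the counting idea concrete, here is a minimal sketch of a policy that 
reports how many speculative tasks a job currently needs. Everything in it is 
an assumption made for illustration: RunningTask, SpeculationCounter and 
GapBasedCounter are hypothetical names, and the rule used (a task qualifies 
when its progress trails the average by more than a configurable gap and it 
has no backup attempt yet) is only one example of a threshold-based policy, 
not the algorithm from the patch.

```java
import java.util.List;

// Hypothetical, simplified view of a running task; not a Hadoop class.
final class RunningTask {
    final double progress;               // fraction complete, 0.0 to 1.0
    final boolean hasSpeculativeAttempt; // true if a backup attempt already runs

    RunningTask(double progress, boolean hasSpeculativeAttempt) {
        this.progress = progress;
        this.hasSpeculativeAttempt = hasSpeculativeAttempt;
    }
}

// Assumed counting hook that a pluggable policy could expose to schedulers.
abstract class SpeculationCounter {
    abstract int neededSpeculativeTasks(List<RunningTask> tasks);
}

// Example policy: a task needs speculation when it lags the average progress
// by more than `gap` and has no speculative attempt yet.
class GapBasedCounter extends SpeculationCounter {
    private final double gap;

    GapBasedCounter(double gap) { this.gap = gap; }

    @Override
    int neededSpeculativeTasks(List<RunningTask> tasks) {
        if (tasks.isEmpty()) {
            return 0;
        }
        double sum = 0.0;
        for (RunningTask t : tasks) {
            sum += t.progress;
        }
        double average = sum / tasks.size();

        int needed = 0;
        for (RunningTask t : tasks) {
            if (!t.hasSpeculativeAttempt && t.progress < average - gap) {
                needed++;
            }
        }
        return needed;
    }
}
```

A scheduler such as a fair scheduler could add this count to a job's demand 
for slots, without needing to know how the policy decides which tasks are 
slow.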

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

