Global scheduling in the Fair Scheduler
---------------------------------------

                 Key: HADOOP-4667
                 URL: https://issues.apache.org/jira/browse/HADOOP-4667
             Project: Hadoop Core
          Issue Type: New Feature
          Components: contrib/fair-share
            Reporter: Matei Zaharia


The current schedulers in Hadoop all examine a single job on each heartbeat 
when deciding which tasks to assign, selecting that job by FIFO order or fair 
sharing. This approach has inherent limitations. For example, if the 
job at the front of the queue is small (e.g. 10 maps, in a cluster of 100 
nodes), then on average it will launch only one local map during the first 10 
heartbeats while it is at the head of the queue. This leads to very poor 
locality for small jobs. Instead, we need a more "global" view of scheduling 
that can look at multiple jobs. To resolve the locality problem, we will use 
the following algorithm:
- If the job at the head of the queue has no local task to launch, skip it and 
look through other jobs.
- If a job has been skipped for at least T seconds while waiting for a local 
task, stop skipping it and allow it to launch non-local tasks.
- If no job can launch a task at all, return to the head of the queue and 
launch a non-local task from the first job.
This algorithm improves locality while bounding the delay that any job 
experiences in launching a task.
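
The three rules above can be sketched as follows. This is only an illustrative 
sketch, not the Fair Scheduler code: the Job class, the assign_task function, 
and the max_wait parameter (standing in for T) are hypothetical names, and it 
assumes one task is assigned per heartbeat and that a job stays eligible for 
non-local tasks once its wait has expired.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical job record; not a Hadoop class."""
    name: str
    # Maps each pending task id to the set of nodes holding its input data.
    pending: dict
    # Time at which the job was first skipped, or None if it is not waiting.
    skipped_since: Optional[float] = None

    def pop_local(self, node):
        """Remove and return a pending task whose data is on `node`, if any."""
        for task, nodes in self.pending.items():
            if node in nodes:
                del self.pending[task]
                return task
        return None

    def pop_any(self):
        """Remove and return any pending task, ignoring locality."""
        if not self.pending:
            return None
        task = next(iter(self.pending))
        del self.pending[task]
        return task


def assign_task(jobs, node, now, max_wait):
    """Choose one task for a heartbeat from `node`; `jobs` is in queue order."""
    for job in jobs:
        if not job.pending:
            continue
        task = job.pop_local(node)
        if task is not None:
            job.skipped_since = None  # a local launch ends the wait
            return job.name, task
        # Rule 1: no local task, so start (or continue) skipping this job.
        if job.skipped_since is None:
            job.skipped_since = now
        # Rule 2: skipped for at least max_wait, so allow a non-local task.
        if now - job.skipped_since >= max_wait:
            return job.name, job.pop_any()
    # Rule 3: nobody could launch, so fall back to the head of the queue.
    for job in jobs:
        if job.pending:
            return job.name, job.pop_any()
    return None
```

For example, with a small job "a" ahead of a larger job "b", a heartbeat from 
a node that only holds data for "b" is given to "b", and "a" starts waiting. 
Once "a" has waited max_wait seconds, it receives the next heartbeat even if 
the node is not local to it.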

We will actually provide two values of T: one wait for data-local tasks and a 
longer wait for rack-local tasks. It also turns out that whether waiting is 
useful depends on how many tasks are left in the job, because this determines 
the probability of getting a heartbeat from a node with a local task. Thus 
there may be logic for removing the wait on the last few tasks in the job.
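
To make the dependence on remaining tasks concrete, here is a rough sketch. 
It assumes that each remaining task has its input on `replicas` nodes chosen 
uniformly and independently, and that heartbeats arrive from uniformly random 
nodes; neither assumption comes from this issue. The function names and the 
"node" / "rack" / "any" levels are hypothetical, and allowed_locality reflects 
one possible reading of the two wait values.

```python
def local_heartbeat_probability(tasks_left, nodes, replicas=1):
    """Chance that a heartbeat comes from a node local to some remaining task."""
    per_task = min(1.0, replicas / nodes)
    return 1.0 - (1.0 - per_task) ** tasks_left


def expected_heartbeats_to_local(tasks_left, nodes, replicas=1):
    """Expected number of heartbeats until one arrives from a local node."""
    p = local_heartbeat_probability(tasks_left, nodes, replicas)
    return float("inf") if p == 0 else 1.0 / p


def allowed_locality(waited, node_wait, rack_wait):
    """Loosest locality level a job may use after waiting `waited` seconds.

    Assumes node_wait <= rack_wait: the job insists on data-local tasks
    first, then accepts rack-local ones, then accepts any node.
    """
    if waited < node_wait:
        return "node"
    if waited < rack_wait:
        return "rack"
    return "any"
```

Under these assumptions, a job with 10 tasks left on 100 nodes sees a local 
heartbeat a bit under 10% of the time, while a job with a single task left 
sees one only about 1% of the time, so waiting costs far more for the last 
few tasks.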

As a related issue, once we allow global scheduling, we can launch multiple 
tasks per heartbeat, as in HADOOP-3136. The initial implementation of 
HADOOP-3136 adversely affected performance because it only launched multiple 
tasks from the same job, but with the wait rule above, we will only do this for 
jobs that are allowed to launch non-local tasks.
