Allow a load difference in fairshare scheduler
----------------------------------------------
Key: MAPREDUCE-936
URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/fair-share
Reporter: Zheng Shao
The problem we are facing: it takes a long time for all tasks of a job to get
scheduled on the cluster, even when the cluster is almost empty.
There are two reasons that together lead to this situation:
1. The load factor makes sure each TT (TaskTracker) runs the same number of
tasks. (This is the part that this patch tries to change.)
2. The scheduler tries to schedule map tasks locally (first node-local, then
rack-local). Each locality level has a wait time
(mapred.fairscheduler.localitywait.node and
mapred.fairscheduler.localitywait.rack, both around 10 sec in our conf), and
each job has an accumulated wait time (JobInfo.localityWait). The accumulated
wait time is reset to 0 whenever a non-local map task is scheduled, so it
takes N * wait_time to schedule N non-local map tasks.
Because of 1, many TTs cannot take more tasks even though they have free
slots. As a result, many of the map tasks cannot be scheduled locally.
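To illustrate reason 1, here is a minimal sketch of a load-factor cap. It is
not the scheduler's actual code: the class, the method names, and the maxDiff
parameter are all hypothetical, and maxDiff is only one possible reading of
the "load difference" in the title. The assumption is that a TT's cap is its
slot count scaled by the cluster-wide load and rounded up.

```java
public class LoadCapSketch {
    /**
     * Hypothetical per-TT cap: the TT's slots scaled by cluster-wide load.
     * maxDiff is an assumed slack added to the load before capping at 1.0;
     * maxDiff = 0.0 models the strict "same load everywhere" behaviour.
     */
    static int cap(int totalRunnableTasks, int localMaxTasks,
                   int totalSlots, double maxDiff) {
        double load = (double) totalRunnableTasks / totalSlots;
        double allowed = Math.min(1.0, load + maxDiff);
        return (int) Math.ceil(localMaxTasks * allowed);
    }

    /** A TT may take another task only while it is below its cap. */
    static boolean canLaunch(int runningOnTracker, int totalRunnableTasks,
                             int localMaxTasks, int totalSlots,
                             double maxDiff) {
        return runningOnTracker
                < cap(totalRunnableTasks, localMaxTasks, totalSlots, maxDiff);
    }

    public static void main(String[] args) {
        // Assumed cluster: 100 TTs x 4 map slots = 400 slots,
        // with 100 runnable map tasks (load = 0.25).
        System.out.println("strict cap per TT:  " + cap(100, 4, 400, 0.0));
        System.out.println("TT with 1 task can take more: "
                + canLaunch(1, 100, 4, 400, 0.0));
        System.out.println("cap with 0.5 slack: " + cap(100, 4, 400, 0.5));
    }
}
```

With the strict cap, a TT that already runs one task is refused a second one
even though three of its four slots are free, so a task whose data lives on
that TT has to go elsewhere or wait.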
Because of 2, each non-local task has to wait out a full wait_time before it
can be scheduled.
As a result, we sometimes see it take more than 2 minutes to schedule all the
mappers of a job.
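The N * wait_time arithmetic from reason 2 can be sketched as a small
simulation. This is a simplified model, not the scheduler's code: the class
and method names are hypothetical, it assumes one scheduling opportunity per
second, a single wait time of at least 1 second instead of separate node and
rack waits, and that no local slot ever becomes available for the job.

```java
public class LocalityWaitSketch {
    /**
     * Seconds needed to schedule the given number of non-local map tasks
     * when the accumulated wait is reset to 0 after every non-local launch.
     */
    static long secondsToSchedule(int nonLocalTasks, long waitSeconds) {
        long clock = 0;            // simulated wall-clock time in seconds
        long accumulatedWait = 0;  // stands in for JobInfo.localityWait
        int scheduled = 0;
        while (scheduled < nonLocalTasks) {
            clock++;
            accumulatedWait++;
            if (accumulatedWait >= waitSeconds) {
                scheduled++;          // launch one non-local task
                accumulatedWait = 0;  // the reset described in reason 2
            }
        }
        return clock;
    }

    public static void main(String[] args) {
        // 12 non-local tasks with a 10 sec wait
        System.out.println(secondsToSchedule(12, 10) + " sec");
    }
}
```

Under these assumptions, 12 non-local tasks with a 10 sec wait already take
120 sec, which is consistent with the delays of more than 2 minutes reported
above.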