[ https://issues.apache.org/jira/browse/MAPREDUCE-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178627#comment-13178627 ]
Patrick Wendell commented on MAPREDUCE-3210:
--------------------------------------------
I'm going to address this as part of MAPREDUCE-3601 and can probably just add
it to the Capacity scheduler as well.

Delay scheduling is going to be less efficient in MR2 because of the resource
request model. Right now, when a map task needs to run, the MR AM sends three
separate resource requests to the scheduler: one for a node-local container,
one for a rack-local container, and one for an *any* container. However, the
scheduler has no way to associate these three requests with one another.
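
To make the request model concrete, here is a minimal sketch of what the
scheduler receives for one map task. The class, field, and method names are
hypothetical and are not the real YARN types; using "*" as the location for an
*any* request is also an assumption made for illustration. The point is that
no field links the three requests back to the same task.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model. These are NOT the real YARN classes.
final class LocalityRequestSketch {
    enum Level { NODE_LOCAL, RACK_LOCAL, ANY }

    /** One request as the scheduler sees it: a location and a container count. */
    static final class Request {
        final Level level;
        final String location; // host name, rack name, or "*" (assumed) for any
        final int containers;

        Request(Level level, String location, int containers) {
            this.level = level;
            this.location = location;
            this.containers = containers;
        }
        // Deliberately no task id: nothing ties the three requests together.
    }

    /** Builds the three separate requests sent for a single map task. */
    static List<Request> requestsForMapTask(String host, String rack) {
        List<Request> out = new ArrayList<>();
        out.add(new Request(Level.NODE_LOCAL, host, 1));
        out.add(new Request(Level.RACK_LOCAL, rack, 1));
        out.add(new Request(Level.ANY, "*", 1));
        return out;
    }
}
```

Calling requestsForMapTask("host1", "/rack1") yields three independent entries,
so a scheduler working from this list alone cannot know that satisfying one of
them makes the other two unnecessary.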

In the MR1 Fair scheduler, we basically triage a given request and accept worse
levels of locality as time goes on. That won't be possible in MR2, since the
scheduler can't tell which requests belong to the same task. I don't see a
better way than introducing some type of global delay for "any" requests and
rack-local requests (the former exists already). It seems like this could lead
to undesirable behaviour depending on the order in which resource requests
arrive.
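
For reference, the MR1-style relaxation described above can be sketched roughly
as follows. This is a simplified illustration of the general delay scheduling
idea, not the Fair scheduler's actual code; the class name, method names, and
the two delay parameters are hypothetical.

```java
// Hypothetical sketch of per-request delay scheduling. Not the MR1 source.
final class DelaySchedulingSketch {
    // Ordered from best to worst locality; ordinal() is used for comparison.
    enum Level { NODE_LOCAL, RACK_LOCAL, ANY }

    /**
     * Returns the worst locality level a request may accept after waiting
     * waitedMs. nodeDelayMs and rackDelayMs are assumed tuning knobs.
     */
    static Level allowedLevel(long waitedMs, long nodeDelayMs, long rackDelayMs) {
        if (waitedMs < nodeDelayMs) {
            return Level.NODE_LOCAL;           // still holding out for the node
        }
        if (waitedMs < nodeDelayMs + rackDelayMs) {
            return Level.RACK_LOCAL;           // relaxed to the rack
        }
        return Level.ANY;                      // waited long enough, take anything
    }

    /** True if a container offered at the given level may be accepted now. */
    static boolean canLaunch(Level offered, long waitedMs,
                             long nodeDelayMs, long rackDelayMs) {
        Level allowed = allowedLevel(waitedMs, nodeDelayMs, rackDelayMs);
        return offered.ordinal() <= allowed.ordinal();
    }
}
```

This only works when the wait time is tracked per task, so that a single
request can be relaxed step by step. With three unrelated requests per task,
the scheduler has no single wait clock to apply it to, which is why a global
delay per locality level looks like the only option in MR2.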
> Support delay scheduling for node locality in MR2's capacity scheduler
> ----------------------------------------------------------------------
>
> Key: MAPREDUCE-3210
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3210
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
>
> The capacity scheduler in MR2 doesn't support delay scheduling for achieving
> node-level locality. So, jobs exhibit poor data locality even if they have
> good rack locality. Especially on clusters where disk throughput is much
> better than network capacity, this hurts overall job performance. We should
> optionally support node-level delay scheduling heuristics similar to what the
> fair scheduler implements in MR1.