[ https://issues.apache.org/jira/browse/MAPREDUCE-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178634#comment-13178634 ]
Patrick Wendell commented on MAPREDUCE-3210:
--------------------------------------------
Just to be clear what I mean:
The current approach is to schedule "any" requests only once the scheduler has
failed to allocate a node-local or rack-local container anywhere for several
NM (NodeManager) check-ins. The corresponding approach for rack-locality is to
schedule rack-local requests only once we've had a given number of global
failures scheduling node-local requests.
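To make the two thresholds concrete, here is a rough sketch of the relaxation
logic as I read it. This is a toy model, not the capacity scheduler's code:
the class DelayState, the knobs rack_delay and any_delay, and the choice to
reset the counter on every grant are all assumptions made for illustration.

```python
from dataclasses import dataclass

NODE_LOCAL, RACK_LOCAL, ANY = "node-local", "rack-local", "any"


@dataclass
class DelayState:
    """Hypothetical per-application count of missed scheduling opportunities."""
    rack_delay: int  # assumed knob: misses before rack-local is allowed
    any_delay: int   # assumed knob: misses before "any" is allowed
    missed: int = 0

    def allowed_level(self) -> str:
        """Loosest locality level the counter currently permits."""
        if self.missed >= self.any_delay:
            return ANY
        if self.missed >= self.rack_delay:
            return RACK_LOCAL
        return NODE_LOCAL

    def on_node_checkin(self, node_local_ok: bool, rack_local_ok: bool):
        """Return the level granted at this NM check-in, or None if skipped."""
        level = self.allowed_level()
        if node_local_ok:
            granted = NODE_LOCAL
        elif rack_local_ok and level in (RACK_LOCAL, ANY):
            granted = RACK_LOCAL
        elif level == ANY:
            granted = ANY
        else:
            granted = None
        # Assumption: any grant resets the counter, any skip increments it.
        self.missed = 0 if granted is not None else self.missed + 1
        return granted
```

With rack_delay=2 and any_delay=4, a check-in that could only offer a
rack-local container is skipped twice before the rack-local request is served,
and a node with no locality at all is skipped four times before "any" kicks in.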
My concerns are:
1) If the scheduler falls back onto rack-locality, it might fulfil a request
for a rack-local container which has already been taken care of via a
node-local request. That container will be returned to the AM, which will have
no use for it and will release it. It might take a number of rounds of offers
to the AM for things to shake out correctly.
2) If a single rack is busy, it might take a long time to trigger the global
failover to "any" requests.
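Concern 1 can be reproduced in a toy model where the scheduler sees only
independent per-location counts and has no idea which task each count belongs
to. Everything here is hypothetical (am_requests, grant, am_assign, the "*"
key for "any", and the node and rack names); it only illustrates the missing
association, not how the real scheduler tracks requests.

```python
from collections import Counter


def am_requests(task_locations):
    """Flatten each task's preferred (node, rack) pairs into plain counts."""
    asks = Counter()
    for locations in task_locations.values():
        for node in {n for n, _ in locations}:
            asks[node] += 1
        for rack in {r for _, r in locations}:
            asks[rack] += 1
        asks["*"] += 1  # one "any" ask per task
    return asks


def grant(asks, node, rack, level):
    """Grant one container on `node` if an ask at `level` is outstanding."""
    key = node if level == "node" else rack
    if asks[key] <= 0 or asks["*"] <= 0:
        return False
    asks[key] -= 1
    if level == "node":
        asks[rack] -= 1  # the enclosing rack ask is consumed as well
    asks["*"] -= 1
    return True


def am_assign(task_locations, containers):
    """AM side: match containers to pending tasks, release the rest."""
    pending = dict(task_locations)
    assigned, released = {}, []
    for node, rack in containers:
        match = None
        for task, locations in pending.items():
            if any(node == n or rack == r for n, r in locations):
                match = task
                break
        if match is None:
            released.append((node, rack))
        else:
            assigned[match] = (node, rack)
            del pending[match]
    return assigned, released


# t1 has replicas on A (rack r1) and B (rack r2); t2 has one on C (rack r3).
tasks = {"t1": [("A", "r1"), ("B", "r2")], "t2": [("C", "r3")]}
asks = am_requests(tasks)
containers = []
if grant(asks, "A", "r1", "node"):  # node-local grant that serves t1
    containers.append(("A", "r1"))
if grant(asks, "D", "r2", "rack"):  # rack fallback on r2, also meant for t1
    containers.append(("D", "r2"))
assigned, released = am_assign(tasks, containers)
```

In this run the r2 ask is still outstanding after t1 was served on node A, so
the rack fallback hands out a container on D that the AM can only release,
while t2 is still waiting on r3.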
Anyways, maybe these won't be a big deal. The first step is to just go ahead
and do this and see how good of an approximation it is for a model where we
have associations between resource requests.
> Support delay scheduling for node locality in MR2's capacity scheduler
> ----------------------------------------------------------------------
>
> Key: MAPREDUCE-3210
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3210
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
>
> The capacity scheduler in MR2 doesn't support delay scheduling for achieving
> node-level locality. So, jobs exhibit poor data locality even if they have
> good rack locality. Especially on clusters where disk throughput is much
> better than network capacity, this hurts overall job performance. We should
> optionally support node-level delay scheduling heuristics similar to what the
> fair scheduler implements in MR1.