[jira] [Commented] (MAPREDUCE-3210) Support delay scheduling for node locality in MR2's capacity scheduler

Patrick Wendell (Commented) (JIRA) Mon, 02 Jan 2012 21:17:11 -0800

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178634#comment-13178634 ]


Patrick Wendell commented on MAPREDUCE-3210:
--------------------------------------------

Just to be clear what I mean:

The current approach is to only schedule "any" requests once the scheduler has 
failed to allocate a node-local or rack-local container anywhere for several 
NM check-ins. The corresponding approach for rack-locality is to only schedule 
rack-local requests once we've had a given number of global failures 
scheduling node-local requests.
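
A minimal sketch of that counter-based relaxation, as a toy model rather than 
the capacity scheduler's actual code. The class name, the two thresholds 
(rack_delay, any_delay), and the choice to reset the counter after every 
successful allocation are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Locality levels, ordered from tightest to loosest.
NODE_LOCAL, RACK_LOCAL, ANY = 0, 1, 2


@dataclass
class DelayScheduler:
    """Toy model: a single global counter of missed scheduling opportunities."""
    rack_delay: int  # hypothetical: misses before rack-local requests are served
    any_delay: int   # hypothetical: misses before "any" requests are served
    misses: int = 0

    def allowed_level(self) -> int:
        """Loosest locality level the scheduler is currently willing to grant."""
        if self.misses >= self.any_delay:
            return ANY
        if self.misses >= self.rack_delay:
            return RACK_LOCAL
        return NODE_LOCAL

    def on_node_checkin(self, best_available: int) -> bool:
        """Handle one NM check-in.

        best_available is the tightest locality this node can offer the
        application. Returns True if a container is allocated.
        """
        if best_available <= self.allowed_level():
            self.misses = 0  # assumption: any allocation resets the counter
            return True
        self.misses += 1
        return False
```

With rack_delay=2, two check-ins that can only offer rack-locality are skipped 
and the third is accepted. Because the counter is global rather than 
per-request, it cannot tell which outstanding request it is relaxing.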

My concerns are:

1) If the scheduler falls back onto rack-locality, it might fulfil a request 
for a rack-local container which has already been taken care of via a 
node-local request. That container will be returned to the AM, which will 
have no use for it and will release it. It might take a number of rounds of 
offers to the AM for things to shake out correctly.

2) If a single rack is busy, it might take a long time to trigger the global 
failover to "any" requests.
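
To make concern 1 concrete, here is a toy sketch of the AM side. It is not the 
MR AM's real matching logic; the function name and the (host, rack) tuples are 
hypothetical. It shows a container granted against a stale rack-level request 
being released because the task was already placed node-locally.

```python
def assign_containers(pending_tasks, containers):
    """Match granted containers to pending tasks.

    pending_tasks: dict of task_id -> (preferred_host, preferred_rack)
    containers:    list of (host, rack) in the order the scheduler granted them
    Returns (assigned, released): a dict task_id -> container, and the list of
    containers the AM has no use for.
    """
    pending = dict(pending_tasks)
    assigned, released = {}, []
    for host, rack in containers:
        # Prefer a task that wants this exact host, then one on the same rack.
        match = next((t for t, (h, _) in pending.items() if h == host), None)
        if match is None:
            match = next((t for t, (_, r) in pending.items() if r == rack), None)
        if match is None:
            released.append((host, rack))  # nothing left to run here
        else:
            assigned[match] = (host, rack)
            del pending[match]
    return assigned, released


# One task that prefers h1 on rack r1. The scheduler grants a node-local
# container and, later, a rack-local one for the same underlying task.
assigned, released = assign_containers(
    {"t1": ("h1", "r1")},
    [("h1", "r1"), ("h2", "r1")],
)
```

Here t1 takes the container on h1, and the container on h2 ends up in 
released, costing a scheduling round before the capacity is usable again.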

Anyways, maybe these won't be a big deal. The first step is to just go ahead 
and do this and see how good of an approximation it is for a model where we 
have associations between resource requests.
                
> Support delay scheduling for node locality in MR2's capacity scheduler
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3210
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3210
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>
> The capacity scheduler in MR2 doesn't support delay scheduling for achieving 
> node-level locality. So, jobs exhibit poor data locality even if they have 
> good rack locality. Especially on clusters where disk throughput is much 
> better than network capacity, this hurts overall job performance. We should 
> optionally support node-level delay scheduling heuristics similar to what the 
> fair scheduler implements in MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
