GitHub user squito opened a pull request:

    https://github.com/apache/spark/pull/16354

    [SPARK-18886][Scheduler][WIP] Adjust Delay scheduling to prevent 
under-utilization of cluster

    ## What changes were proposed in this pull request?
    
    This is a significant change to delay scheduling, intended to avoid 
under-utilization of cluster resources when there are locality preferences for 
only a subset of resources.  The main change is that the delay is no longer 
reset whenever a task is scheduled at a tighter locality constraint.  Instead, 
each task set starts its locality timer the first time it fails to utilize a 
resource offer due to locality constraints.  A task set *never* tightens its 
locality constraints, even if subsequent offers are made that could be used at 
a tighter locality level.
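
As a rough illustration of the timer behavior described above, here is a toy model in Python. It is only a sketch under stated assumptions and not the code in this patch: `LocalityTimer`, `LEVELS`, `wait_ms`, and the method names are hypothetical, and the rule of relaxing one level per elapsed wait (with a single wait shared by all levels) is assumed for the example.

```python
# Toy model of the behavior described above; not Spark's TaskSetManager.
# LEVELS, LocalityTimer and wait_ms are hypothetical names for illustration.
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]


class LocalityTimer:
    def __init__(self, wait_ms):
        self.wait_ms = wait_ms    # assumption: one wait shared by all levels
        self.level = 0            # index into LEVELS; only ever increases
        self.timer_start = None   # unset until the first locality rejection

    def offer_rejected(self, now_ms):
        """An offer went unused purely because of locality constraints."""
        if self.timer_start is None:
            self.timer_start = now_ms

    def task_launched(self, launched_level, now_ms):
        """A launch at a tighter level neither resets the timer nor tightens."""
        pass

    def allowed_level(self, now_ms):
        """Relax one level per elapsed wait since the timer started (assumed rule)."""
        while (self.timer_start is not None
               and self.level < len(LEVELS) - 1
               and now_ms - self.timer_start >= self.wait_ms):
            self.level += 1
            self.timer_start += self.wait_ms
        return LEVELS[self.level]
```

In this model, a task launched at `PROCESS_LOCAL` after the timer has started has no effect on when the task set is allowed to use `NODE_LOCAL` offers, which is the contrast with the previous reset-on-launch behavior.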
    
    A more complete description of the issues with the previous scheduling 
method can be found in the JIRA (SPARK-18886).
    
    ## How was this patch tested?
    
    Added a unit test for the original issue.  Ran all unit tests in 
o.a.s.scheduler.* manually.  Full test run via Jenkins.
    
    TODO
    * [ ] add more unit tests, especially for recompute locality levels.
    * [ ] code cleanup (especially all the logging added)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/squito/spark delay_sched-SPARK-18886

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16354.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16354
    
----
commit 8629823dcb61f67207ee5b6a6a1789a4c38e898f
Author: Imran Rashid <[email protected]>
Date:   2016-12-15T17:28:48Z

    failing test case

commit 348a9f44a6f34e6ac15f4ece70b0178d134d0cc3
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T03:35:55Z

    "working" version -- but this is actually a significant departure from old 
delay scheduling

commit 8b7fd1adf510ef15a7c30aebdd4f029e71e2e50f
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T03:57:44Z

    test update

commit 22086999a9644086d6787fc4db5b2367e6ba70fe
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T04:18:40Z

    fix condition

commit af88dd8f8942b12edda2a466b292dd3bccdfbc4e
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T04:19:05Z

    update tests to reflect change in delay scheduling behavior

commit 27983a9a6d3f7d675bbfa83eb116e8329869aed7
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T04:19:19Z

    logging

commit 647bf400a0963ff8f5381e47f895e3cc606aa854
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T17:13:59Z

    fix other test cases, more fixes to recomputeLocality()

commit 2e5307f971767c0d5a228e3058db31244351da2d
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T17:47:02Z

    Merge branch 'master' into delay_sched-SPARK-18886

commit 449ba20c9c642884f5dcc5feccfa64cb1da833f2
Author: Imran Rashid <[email protected]>
Date:   2016-12-20T17:47:11Z

    remove TODO

----

