GitHub user zhzhan opened a pull request:
https://github.com/apache/spark/pull/15218
[Spark-17637][Scheduler]Packed scheduling for Spark tasks across executors
## What changes were proposed in this pull request?
Restructure the code and implement two new task assigners.
PackedAssigner: tries to allocate tasks to the executors with the fewest
available cores, so that Spark can release reserved executors when dynamic
allocation is enabled.
BalancedAssigner: tries to allocate tasks to the executors with the most
available cores in order to balance the workload across all executors.
By default, the original round-robin assigner is used.
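
To illustrate the difference between the two policies, here is a minimal, self-contained sketch. It is not the code in this patch: `AssignerSketch`, `Offer`, `order`, and `assign` are hypothetical names invented for the example, and the model ignores locality, hosts, and everything else the real scheduler considers. It only assumes that each assigner decides the order in which executors are tried for the next task.

```scala
object AssignerSketch {
  // Hypothetical, simplified resource offer: one executor and its free cores.
  final case class Offer(executorId: String, freeCores: Int)

  // In this sketch an assigner only decides the order in which executors are tried.
  trait Assigner {
    def order(offers: Seq[Offer]): Seq[Offer]
  }

  // Packed: try the executor with the fewest free cores first.
  object PackedAssigner extends Assigner {
    def order(offers: Seq[Offer]): Seq[Offer] = offers.sortBy(_.freeCores)
  }

  // Balanced: try the executor with the most free cores first.
  object BalancedAssigner extends Assigner {
    def order(offers: Seq[Offer]): Seq[Offer] = offers.sortBy(o => -o.freeCores)
  }

  // Places up to numTasks tasks one at a time and returns executorId -> tasks placed.
  def assign(
      assigner: Assigner,
      offers: Seq[Offer],
      numTasks: Int,
      cpusPerTask: Int = 1): Map[String, Int] = {
    var current = offers
    var counts = Map.empty[String, Int]
    var remaining = numTasks
    var stuck = false
    while (remaining > 0 && !stuck) {
      // Re-rank the executors, then take the first one that still has room.
      assigner.order(current).find(_.freeCores >= cpusPerTask) match {
        case Some(chosen) =>
          current = current.map { o =>
            if (o.executorId == chosen.executorId) o.copy(freeCores = o.freeCores - cpusPerTask)
            else o
          }
          counts = counts.updated(chosen.executorId, counts.getOrElse(chosen.executorId, 0) + 1)
          remaining -= 1
        case None =>
          stuck = true // no executor can fit another task
      }
    }
    counts
  }

  def main(args: Array[String]): Unit = {
    val offers = Seq(Offer("A", 4), Offer("B", 4), Offer("C", 4))
    println(assign(PackedAssigner, offers, numTasks = 4))
    println(assign(BalancedAssigner, offers, numTasks = 4))
  }
}
```

With three executors of four free cores each and four single-core tasks, the packed policy in this sketch places all four tasks on executor A, leaving B and C idle and therefore eligible for release. The balanced policy places two tasks on A and one each on B and C.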
We tested a pipeline, and the new PackedAssigner reduced the reserved CPU and
memory by around 45% with dynamic allocation enabled.
## How was this patch tested?
Both unit tests in TaskSchedulerImplSuite and manual tests in a production
pipeline.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zhzhan/spark packed-scheduler
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15218.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15218
----
commit 97ee760f8acdacf73e7b8c9a1c65578821efb05c
Author: Zhan Zhang <[email protected]>
Date: 2016-09-18T23:16:22Z
enable multiple task-worker allocation scheduling
commit 3f094cf25a6bb7cb50365d47cd00fb84340d8c6c
Author: Zhan Zhang <[email protected]>
Date: 2016-09-18T23:21:09Z
fix the configuration.md
commit c3ebf9ca84f23d7c150cd1abc69955a7a62678ba
Author: Zhan Zhang <[email protected]>
Date: 2016-09-23T03:23:38Z
formatting and change test cases
----