GitHub user JonathanTaws opened a pull request:
https://github.com/apache/spark/pull/15405
[SPARK-15917][CORE] Added support for number of executors in Standalone
[WIP]
## What changes were proposed in this pull request?
Currently, in standalone mode it is not possible to set the number of
executors using the `--num-executors` flag or the `spark.executor.instances`
property. Instead, as many executors as possible are spawned based on the
available resources and the properties set.
This patch adds support for the number-of-executors property.
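For context, here is how a user would request a fixed number of executors once this patch lands; a hedged sketch where the master URL, jar name, and resource values are placeholders:

```shell
# Request 4 executors of 2 cores each against a standalone master
# (spark://master-host:7077 and myapp.jar are placeholders).
spark-submit \
  --master spark://master-host:7077 \
  --num-executors 4 \
  --executor-cores 2 \
  --conf spark.cores.max=8 \
  myapp.jar

# Equivalent via the property instead of the flag:
#   --conf spark.executor.instances=4
```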
Here's the new behavior:
- If the `executor.cores` property isn't set, we try to spawn one executor
per worker, each taking all of that worker's available cores (as in the default
behavior), until the requested number of executors is reached. If the specified
number of executors can't be launched, a warning is logged.
- If the `executor.cores` property is set (the same logic applies to
`executor.memory`):
  - and `executor.instances` * `executor.cores` <= `cores.max`, then
`executor.instances` executors are spawned;
  - and `executor.instances` * `executor.cores` > `cores.max`, then as
many executors as possible are spawned - essentially the previous behavior
when only `executor.cores` was set - and a warning is logged saying we
couldn't spawn the requested number of executors.
In the case where `executor.memory` is set, all constraints are taken into
account based on the number of cores and the amount of memory assigned per
worker (same logic as with the cores).
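The cores-only branch of the rules above can be sketched as follows. This is a minimal illustration of the decision logic, not Spark's actual Master code; the function name and warning format are invented:

```python
def plan_executors(requested: int, cores_max: int, executor_cores: int) -> int:
    """Sketch of the rule when `executor.cores` is set: spawn `requested`
    executors if requested * executor_cores <= cores_max, otherwise spawn
    as many as fit and log a warning."""
    possible = cores_max // executor_cores  # executors that fit in cores.max
    spawned = min(requested, possible)
    if spawned < requested:
        print(f"WARN: could only launch {spawned} of {requested} requested executors")
    return spawned

# 4 executors * 2 cores <= cores.max of 8 -> all 4 are launched
print(plan_executors(4, cores_max=8, executor_cores=2))  # -> 4
# 6 executors * 2 cores > 8 -> only 4 fit, and a warning is logged
print(plan_executors(6, cores_max=8, executor_cores=2))  # -> 4
```

When `executor.memory` is also set, the same kind of cap is computed from the memory assigned per worker, and the minimum of the two constraints determines how many executors are spawned.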
## How was this patch tested?
I tested this patch by running a simple Spark app in standalone mode,
specifying the `--num-executors` flag or the `spark.executor.instances`
property, and checking that the number of executors was coherent with the
available resources and the requested number of executors.
I plan on testing this patch further by adding tests in `MasterSuite` and
running the usual `./dev/run-tests`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JonathanTaws/spark SPARK-15917
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15405.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15405
----
commit f45a6732e30c7ae374089d5a63146a03b7a40671
Author: Jonathan Taws <[email protected]>
Date: 2016-06-24T10:23:33Z
[SPARK-15917] Added support for number of executors for Standalone mode
commit d0b1a71cc1106413fdedcc1c658aa4830b1122f0
Author: JonathanTaws <[email protected]>
Date: 2016-10-04T13:32:03Z
[SPARK-15917] Added warning message if requested number of executors can't
be satisfied
commit 0af7b10c42c73d8ee9a0e49e9b652946274d1bae
Author: JonathanTaws <[email protected]>
Date: 2016-10-09T10:43:29Z
Added check on number of workers to avoid displaying the same message
multiple times
commit eed3ecd91e3c84c0e17c513e4b48b92f6b1532f0
Author: JonathanTaws <[email protected]>
Date: 2016-10-09T12:30:57Z
Improved check on num executors warning message
----