[
https://issues.apache.org/jira/browse/SPARK-15917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348108#comment-15348108
]
Jonathan Taws commented on SPARK-15917:
---------------------------------------
I changed the initial executor limit in *StandaloneSchedulerBackend* so that it
uses the executor instances property, and this seems to be the only change
needed for the property to take effect.
This is the current behavior with this change :
* If the {{executor.cores}} property isn't set, the {{executor.instances}}
property has no effect, because a single executor takes all of the available
cores
* If the {{executor.cores}} property is set :
** and {{executor.instances}} * {{executor.cores}} *<=* {{cores.max}}, then
{{executor.instances}} will be spawned
** and {{executor.instances}} * {{executor.cores}} *>* {{cores.max}}, then as
many executors as possible will be spawned - basically the previous behavior
when only {{executor.cores}} was set
** if {{executor.memory}} is also set, both the core and the memory limits of
each worker are taken into account (e.g. if we request 3 executors with 2 cores
and 8 GB of memory each on a worker with 8 cores and 16 GB of memory, we get
only 2 executors, because memory only allows 16 / 8 = 2)
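To make the rules above easier to check, here is a rough sketch of the
resulting executor count. It is only a simplified single-worker model, not the
actual Spark scheduling code: the function name {{expected_executors}} and its
parameters are hypothetical, memory is given in whole GB, and the 1 GB default
for executor memory is taken from the issue description.

```python
from typing import Optional


def expected_executors(
    worker_cores: int,
    worker_memory_gb: int,
    executor_cores: Optional[int] = None,
    executor_memory_gb: int = 1,
    executor_instances: Optional[int] = None,
    cores_max: Optional[int] = None,
) -> int:
    """Estimate how many executors one worker ends up with (simplified model)."""
    if executor_memory_gb <= 0 or (executor_cores is not None and executor_cores <= 0):
        raise ValueError("executor cores and memory must be positive")

    # cores.max caps the cores the application may use on this worker.
    usable_cores = worker_cores if cores_max is None else min(cores_max, worker_cores)
    if usable_cores <= 0 or executor_memory_gb > worker_memory_gb:
        return 0

    # Without executor.cores, one executor grabs every usable core,
    # so executor.instances has no effect.
    if executor_cores is None:
        return 1

    # Otherwise the count is bounded by cores, by memory, and by the request.
    by_cores = usable_cores // executor_cores
    by_memory = worker_memory_gb // executor_memory_gb
    limit = min(by_cores, by_memory)
    if executor_instances is not None:
        limit = min(limit, executor_instances)
    return limit


# Example from this comment: 3 requested, 2 cores / 8 GB each, 8 cores / 16 GB worker.
print(expected_executors(8, 16, executor_cores=2, executor_memory_gb=8,
                         executor_instances=3))  # 2
# Example from the issue description: 2 cores / 10 GB each, 8 cores / 30 GB worker.
print(expected_executors(8, 30, executor_cores=2, executor_memory_gb=10))  # 3
```

With {{executor_instances=2}} added to the second call, the sketch returns 2,
which is the outcome the issue description asks for.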
It looks pretty consistent to me. What do you think?
I will move on to the exception throwing next and then make a few updates to
the documentation.
I have an issue though :
- If we set the number of executors to 0, no executors are allocated and thus
we can't run any job. Should we add a warning or even throw an error stating
that no executors were spawned on the worker ?
> Define the number of executors in standalone mode with an easy-to-use property
> ------------------------------------------------------------------------------
>
> Key: SPARK-15917
> URL: https://issues.apache.org/jira/browse/SPARK-15917
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, Spark Shell, Spark Submit
> Affects Versions: 1.6.1
> Reporter: Jonathan Taws
> Priority: Minor
>
> After stumbling across a few StackOverflow posts around the issue of using a
> fixed number of executors in standalone mode (non-YARN), I was wondering if
> we could not add an easier way to set this parameter than having to resort to
> some calculations based on the number of cores and the memory you have
> available on your worker.
> For example, let's say I have 8 cores and 30GB of memory available :
> - If no option is passed, one executor will be spawned with 8 cores and 1GB
> of memory allocated.
> - However, if I want to have only *2* executors, and to use 2 cores and 10GB
> of memory per executor, I will end up with *3* executors (as the available
> memory will limit the number of executors) instead of the 2 I was hoping for.
> Sure, I can set {{spark.cores.max}} as a workaround to get exactly what I
> want, but would it not be easier to add a {{--num-executors}}-like option to
> standalone mode to be able to really fine-tune the configuration ? This
> option is already available in YARN mode.
> From my understanding, I don't see any other option lying around that can
> help achieve this.
> This seems to be slightly disturbing for newcomers, and standalone mode is
> probably the first thing anyone will use to just try out Spark or test some
> configuration.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)