[ 
https://issues.apache.org/jira/browse/SPARK-15917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348108#comment-15348108
 ] 

Jonathan Taws commented on SPARK-15917:
---------------------------------------

I changed the initial executor limit in *StandaloneSchedulerBackend* so that 
it takes the executor instances property into account. This seems to be the 
only change needed for the property to take effect. 

With this change, the behavior is as follows:
* If the {{executor.cores}} property isn't set, the {{executor.instances}} 
property has no effect, because a single executor takes all of the 
available cores.
* If the {{executor.cores}} property is set:
** and {{executor.instances}} * {{executor.cores}} *<=* {{cores.max}}, then 
{{executor.instances}} executors are spawned;
** and {{executor.instances}} * {{executor.cores}} *>* {{cores.max}}, then as 
many executors as possible are spawned, which is the previous behavior 
when only {{executor.cores}} was set;
** if {{executor.memory}} is also set, both the core and memory constraints 
of each worker apply (e.g. if we request 3 executors with 2 cores and 8GB 
of memory each on a worker with 8 cores and 16GB, we get only 2 executors).
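
To make the rules above easier to check, here is a minimal sketch of the 
resulting executor count on a single worker. It is a simplified model for 
illustration only, not the actual scheduling code: the function name and 
parameters are hypothetical, it assumes that an unset {{cores.max}} means all 
worker cores are usable, and it uses the 1GB default executor memory 
mentioned in the issue description.

```python
from typing import Optional


def expected_executors(
    worker_cores: int,
    worker_memory_gb: int,
    cores_max: Optional[int] = None,
    executor_cores: Optional[int] = None,
    executor_instances: Optional[int] = None,
    executor_memory_gb: int = 1,
) -> int:
    """Simplified single-worker model of the behavior described above.

    All names are hypothetical; this is not Spark's scheduler.
    """
    # Assumption: with cores.max unset, the application may use every core.
    usable_cores = worker_cores if cores_max is None else min(worker_cores, cores_max)

    if executor_cores is None:
        # One executor takes all usable cores, so executor.instances is ignored.
        fits = usable_cores > 0 and worker_memory_gb >= executor_memory_gb
        return 1 if fits else 0

    # Each executor needs executor_cores cores and executor_memory_gb of memory.
    by_cores = usable_cores // executor_cores
    by_memory = worker_memory_gb // executor_memory_gb
    limit = min(by_cores, by_memory)

    # executor.instances can only lower the count, never raise it.
    if executor_instances is not None:
        limit = min(limit, executor_instances)
    return limit


# Example from the comment: 3 executors requested, 2 cores and 8GB each,
# on a worker with 8 cores and 16GB.
print(expected_executors(8, 16, executor_cores=2,
                         executor_instances=3, executor_memory_gb=8))  # prints 2
```

Under the same assumptions, the scenario from the issue description (8 cores, 
30GB, 2 cores and 10GB per executor) yields 3 executors without the new 
property and 2 when {{executor.instances}} is set to 2.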

This looks consistent to me. What do you think?

I will move on to the exception throwing next, and update the documentation 
afterwards. 

I have one open question, though:
- If we set the number of executors to 0, no executors are allocated and thus 
we can't run any job. Should we log a warning, or even throw an error, stating 
that no executors were spawned on the worker?
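
The two options in this question could look roughly like the sketch below. It 
only illustrates the choice between warning and failing: the function name, 
the {{strict}} flag and the message text are hypothetical and do not 
correspond to any existing Spark code.

```python
import warnings


def check_executor_instances(executor_instances: int, strict: bool = False) -> int:
    """Hypothetical check for a requested executor count of 0."""
    if executor_instances == 0:
        message = "executor.instances is 0: no executors will be spawned, so no job can run"
        if strict:
            # Option 1: fail fast so the misconfiguration cannot go unnoticed.
            raise ValueError(message)
        # Option 2: let the application start, but tell the user why it is idle.
        warnings.warn(message)
    return executor_instances
```

Failing fast is the stricter choice; the warning keeps the application 
running in case a request for 0 executors was intentional.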

> Define the number of executors in standalone mode with an easy-to-use property
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-15917
>                 URL: https://issues.apache.org/jira/browse/SPARK-15917
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, Spark Shell, Spark Submit
>    Affects Versions: 1.6.1
>            Reporter: Jonathan Taws
>            Priority: Minor
>
> After stumbling across a few StackOverflow posts around the issue of using a 
> fixed number of executors in standalone mode (non-YARN), I was wondering if 
> we could not add an easier way to set this parameter than having to resort to 
> some calculations based on the number of cores and the memory you have 
> available on your worker. 
> For example, let's say I have 8 cores and 30GB of memory available :
>  - If no option is passed, one executor will be spawned with 8 cores and 1GB 
> of memory allocated.
>  - However, if I want to have only *2* executors, and to use 2 cores and 10GB 
> of memory per executor, I will end up with *3* executors (as the available 
> memory will limit the number of executors) instead of the 2 I was hoping for.
> Sure, I can set {{spark.cores.max}} as a workaround to get exactly what I 
> want, but would it not be easier to add a {{--num-executors}}-like option to 
> standalone mode to be able to really fine-tune the configuration ? This 
> option is already available in YARN mode.
> From my understanding, I don't see any other option lying around that can 
> help achieve this.  
> This seems to be slightly disturbing for newcomers, and standalone mode is 
> probably the first thing anyone will use to just try out Spark or test some 
> configuration.  


