When you submit a job, Spark breaks it down into stages according to the DAG.
The stages run transformations or actions on RDDs. Each RDD consists of N
partitions, and Spark creates one task per partition to execute a stage. Each
task runs on one of the cores used by the executors in your cluster.
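
For example, you can check how many partitions (and therefore how many tasks
per stage) an RDD has, and change it. A rough sketch from spark-shell; the
input path and the partition counts are just placeholders:

    // Partitioning of a file-based RDD follows the input splits.
    val rdd = sc.textFile("hdfs:///path/to/input")   // path is a placeholder
    println(rdd.partitions.length)                   // = tasks per stage over this RDD

    // More partitions than cores keeps every core busy; repartition() shuffles,
    // while coalesce() can shrink the partition count without a full shuffle.
    val widened = rdd.repartition(48)                // 48 is an arbitrary example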

--conf spark.cores.max=24 defines the maximum number of cores you want to
use. Spark itself distributes those cores among the workers.
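
You can pass it on spark-submit as above, or set it programmatically when
building the context. A minimal sketch; the numbers are placeholders you would
tune for your cluster:

    import org.apache.spark.{SparkConf, SparkContext}

    // Placeholder values; tune for your cluster.
    val conf = new SparkConf()
      .setAppName("my-app")
      .set("spark.cores.max", "24")         // total cores across the cluster
      .set("spark.executor.memory", "4g")   // memory per executor

    val sc = new SparkContext(conf)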

The more partitions you have and the more cores that are available, the higher
the level of parallelism, and the better the performance.
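
As a rough illustration (assuming the 24 cores from above, and the commonly
suggested 2-3 tasks per core as a starting point, not a hard rule):

    // Continuing with the rdd from the earlier sketch.
    val totalCores = 24                           // e.g. spark.cores.max
    val tuned = rdd.repartition(totalCores * 3)   // assumption: ~3 tasks per core
    tuned.count()                                 // this stage runs totalCores * 3 tasks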

On Tue, Jun 16, 2015 at 9:27 AM, shreesh <shreesh.la...@gmail.com> wrote:

> How do I decide how many partitions to break my data into, and how many
> executors I should have? I guess memory and cores will be allocated based
> on the number of executors I have.
> Thanks
