When you submit a job, Spark breaks it down into stages according to the DAG. The stages run transformations or actions on the RDDs. Each RDD consists of N partitions, and the number of tasks Spark creates to execute a stage equals the number of partitions. Every task is executed on one of the cores provided by the executors in your cluster.
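As a rough sketch (runnable in spark-shell, where the SparkContext sc is predefined; the HDFS path is just a placeholder), you can see the partition count, and hence the number of tasks per stage, directly on the RDD:

    // minPartitions = 8, so the stage scanning this RDD runs as 8 tasks
    val lines = sc.textFile("hdfs:///path/to/input.txt", 8)
    println(lines.partitions.length)   // 8

    // repartition() changes the partition count, and with it the number
    // of tasks in the stages after the shuffle
    val wider = lines.repartition(24)
    println(wider.partitions.length)   // 24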
--conf spark.cores.max=24 defines the maximum number of cores you want to use; Spark itself distributes those cores among the workers. The more partitions you have and the more cores available, the higher the level of parallelism, and the better the performance. A minimal sketch of setting this programmatically follows after the quoted message below.

On Tue, Jun 16, 2015 at 9:27 AM, shreesh <shreesh.la...@gmail.com> wrote:
> How do I decide how many partitions to break my data into, and how many
> executors I should have? I guess memory and cores will be allocated based
> on the number of executors I have.
> Thanks
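For completeness, here is a minimal sketch of setting spark.cores.max programmatically rather than on the command line (the app name and master URL are placeholders; standalone mode assumed):

    import org.apache.spark.{SparkConf, SparkContext}

    // spark.cores.max caps the total number of cores this application
    // may take across the whole cluster
    val conf = new SparkConf()
      .setAppName("example-app")
      .setMaster("spark://master-host:7077")
      .set("spark.cores.max", "24")
      .set("spark.executor.memory", "4g")
    val sc = new SparkContext(conf)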