I'm running a PySpark program on a Mesos cluster and seeing the following behavior:
* The first three stages run with multiple (10-15) tasks.
* The fourth stage runs with only one task (see the sketch after this list).
* It is using 10 CPUs, which corresponds to 5 machines in this configuration.
* It is very slow.
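
Since Spark launches one task per partition of the RDD a stage is processing, I
assume the one-task stage is operating on a single-partition RDD. A minimal
sketch of how I can check the partition counts (the names and paths below are
placeholders, not my actual job):

    from pyspark import SparkContext

    sc = SparkContext(appName="partition-check")          # placeholder app name
    rdd = sc.textFile("hdfs:///placeholder/input/path")   # placeholder input
    print(rdd.getNumPartitions())       # tasks in the stage that reads this RDD
    counts = rdd.map(lambda x: (x, 1)).reduceByKey(lambda a, b: a + b)
    print(counts.getNumPartitions())    # tasks in the post-shuffle stage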

I would like it to use more resources; more are available on the cluster.

I've tried setting spark.driver.cores and spark.executor.memory to no avail.
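
For reference, this is roughly how I'm setting them (the values and master URL
below are placeholders, not my real configuration):

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setMaster("mesos://zk://placeholder:2181/mesos")  # placeholder master URL
            .setAppName("my-job")                              # placeholder app name
            .set("spark.driver.cores", "4")                    # placeholder value
            .set("spark.executor.memory", "4g"))               # placeholder value
    sc = SparkContext(conf=conf)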

Can someone suggest (a) how to go about debugging why this is happening in
the first place, and (b) how to configure it to use more resources?

If this is answered elsewhere, I was unable to find it and would appreciate
a link.


