I'm running a PySpark program on a Mesos cluster and seeing the following behavior:

* The first three stages run with multiple (10-15) tasks.
* The fourth stage runs with only one task.
* The job is using 10 CPUs, which is 5 machines in this configuration.
* It is very slow.
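For what it's worth, here is a minimal sketch of the kind of check I've been running to see where the parallelism drops; the input path and names are placeholders, not my real job. My understanding is that each task corresponds to one partition, so a one-task stage suggests the RDD feeding it has collapsed to a single partition:

    from pyspark import SparkContext

    sc = SparkContext(appName="partition-check")       # placeholder app name

    rdd = sc.textFile("hdfs:///path/to/input")         # placeholder input path
    print(rdd.getNumPartitions())                      # parallelism of the early stages

    counts = rdd.map(lambda line: (line, 1)).reduceByKey(lambda a, b: a + b)
    print(counts.getNumPartitions())                   # parallelism after the shuffle

    wider = counts.repartition(50)                     # forces 50 partitions/tasks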
I would like it to use more resources, and more are available on the cluster. I've tried setting spark.driver.cores and spark.executor.memory (roughly as in the sketch below), to no avail. Can someone suggest (a) how I go about debugging why this is happening in the first place, and (b) how to configure the job to use more resources? If this is answered elsewhere, I was unable to find it and would appreciate a link.
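For reference, here is roughly how I am setting those properties; the master URL, app name, and values are placeholders:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setMaster("mesos://master-host:5050")     # placeholder Mesos master URL
            .setAppName("my-app")                      # placeholder application name
            .set("spark.driver.cores", "4")            # one of the settings I tried
            .set("spark.executor.memory", "4g"))       # the other setting I tried
    sc = SparkContext(conf=conf)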