Hi, I was running the SparkPi example and studying its performance with different numbers of cores per worker. I changed the number of cores with start-slave.sh -c CORES on each worker machine for the distributed runs, and used spark-submit --master local[CORES] for the same effect on a single local machine. The following table shows preliminary timings (in seconds) of SparkPi running locally and on multiple nodes (one worker per node).
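For anyone unfamiliar with the example: SparkPi is a Monte Carlo estimate of pi, distributed as a parallelize(...).map(...).reduce(add) pipeline over random points. A minimal single-machine sketch of the same computation (plain Python, no Spark; function name and sample count are mine, just for illustration):

```python
import random

def estimate_pi(n, seed=42):
    # Count random points in the unit square that fall inside the
    # unit circle; 4 * (inside / total) converges to pi.
    # This is the per-point work that SparkPi's map() does, followed
    # by the summation its reduce(_ + _) performs.
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:
            inside += 1
    return 4.0 * inside / n

if __name__ == "__main__":
    print("Pi is roughly %.4f" % estimate_pi(1_000_000))
```

The per-point work here is tiny, which is relevant to the timings: with very cheap tasks, scheduling and result-collection overhead can dominate the actual computation.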
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n25526/Screenshot_2015-11-30_22.png>

The result is interesting: performance improves as we add more workers, but gets worse as we use more cores per worker. This confuses me, and the only explanation I can think of right now is a lack of multi-threading in the reduce() function used in the code. Any input is appreciated.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkPi-running-slower-with-more-cores-on-each-worker-tp25526.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.