Hi, I was running the SparkPi example and studying the performance difference
with different numbers of cores per worker. I changed the number of cores with
start-slave.sh -c CORES on each worker machine for the distributed runs, and
used spark-submit --master local[CORES] for the same effect on a local
machine. The following table shows the preliminary timings (in seconds) of
SparkPi running locally and on multiple nodes (one worker per node).
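
Concretely, the invocations were along these lines (the master host, jar path,
and slices argument below are placeholders; the examples jar name and the
start-slave.sh argument order can vary by Spark version):

    # on each worker machine: start a worker that offers CORES cores
    ./sbin/start-slave.sh -c CORES spark://MASTER_HOST:7077

    # local mode: run SparkPi with CORES threads in a single JVM
    ./bin/spark-submit --master local[CORES] \
        --class org.apache.spark.examples.SparkPi \
        lib/spark-examples-*.jar SLICES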

<http://apache-spark-user-list.1001560.n3.nabble.com/file/n25526/Screenshot_2015-11-30_22.png>
 

The result is really interesting: it shows that performance improves when we
add more workers, but gets worse when we use more cores per worker. This
confuses me, and the only explanation I can think of right now is a lack of
multi-threading support in the reduce() function used in the code.
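
For context, the relevant part of SparkPi is essentially the following
(paraphrased from the Spark examples source, with sc being the SparkContext;
the exact code may differ slightly between versions):

    import scala.math.random

    val slices = if (args.length > 0) args(0).toInt else 2
    val n = 100000 * slices
    val count = sc.parallelize(1 until n, slices).map { i =>
      // throw a random dart at the unit square, count hits inside the circle
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _) // sums the per-partition counts; this is the reduce() I mean
    println("Pi is roughly " + 4.0 * count / n)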
Any input is appreciated.


