Hi, I read the past posts about partition number, but am still a little confused about partitioning strategy.
I have a cluster with 8 workers and 2 cores per worker. Is it true that the optimal partition number should be 2-4 * total_coreNumber, or should it be approximately equal to total_coreNumber? Or is it the number of tasks, rather than the number of partitions, that really determines the speed? Thanks a lot!
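To make the numbers in the question concrete, here is a minimal sketch of the arithmetic for the cluster described above. It only computes the candidate partition counts being compared; it does not say which one is optimal. The helper name `candidate_partition_counts` and the multiplier set (1, 2, 4) are assumptions chosen for illustration and are not part of any Spark API.

```python
# Cluster from the question: 8 workers with 2 cores each.
NUM_WORKERS = 8
CORES_PER_WORKER = 2


def candidate_partition_counts(num_workers, cores_per_worker, multipliers=(1, 2, 4)):
    """Hypothetical helper: map each multiplier to multiplier * total core count."""
    total_cores = num_workers * cores_per_worker
    return {m: m * total_cores for m in multipliers}


candidates = candidate_partition_counts(NUM_WORKERS, CORES_PER_WORKER)
for multiplier, partitions in candidates.items():
    print(f"{multiplier} x total cores -> {partitions} partitions")
```

With 16 total cores, the candidates are 16, 32, and 64 partitions. Any of these values could then be tried as the argument to something like `rdd.repartition(n)` and the job timings compared.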