Is it possible to reduce the number of edge partitions while still fully exploiting parallelism? For example, could there be one partition per node, with all the threads on that node sharing the same partition?
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/configuration-needed-to-run-twitter-25GB-dataset-tp11044p11126.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.