Hi,

If I run sortByKey(true, 2).saveAsTextFile("filename") on a cluster,
will the data be sorted only within each partition, or in total across
all partitions? (And is this guaranteed?)

Example:
Input: 4,2,3,6,5,7

Sorted per partition:
part-00000: 2,3,7
part-00001: 4,5,6

Sorted in total:
part-00000: 2,3,4
part-00001: 5,6,7
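
To make the difference between the two cases concrete, here is a small
sketch in plain Python (no Spark involved). The function name
classify_partitions is made up for illustration, and each inner list
stands for the keys read back from one part file, in file order:

```python
def classify_partitions(partitions):
    """Return 'total', 'per-partition', or 'unsorted' for a list of key lists."""
    # In both cases every partition must be sorted internally.
    if any(part != sorted(part) for part in partitions):
        return "unsorted"
    # For a total order, concatenating the partitions in file order
    # must also yield a sorted sequence.
    flat = [key for part in partitions for key in part]
    return "total" if flat == sorted(flat) else "per-partition"


# The two layouts from the example above, plus an uneven split.
print(classify_partitions([[2, 3, 7], [4, 5, 6]]))  # sorted per partition
print(classify_partitions([[2, 3, 4], [5, 6, 7]]))  # sorted in total
print(classify_partitions([[2, 3, 4, 5], [6, 7]]))  # uneven, still total
```

Running a check like this on the keys from the real part files would
show which of the two layouts the job actually produced.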

Thanks,

Tom

P.S. I know the data might not end up uniformly distributed (for
example, 4 elements in part-00000 and 2 in part-00001).
