What I mean is, let's say I run this:

sc.parallelize(Seq(0->3, 0->2, 0->1), 3).partitionBy(new org.apache.spark.HashPartitioner(3)).collect


Will the result always be Array((0,3), (0,2), (0,1))? Or could I
possibly get a different order?


I'm pretty sure the shuffle files are read in the order of the source
partitions, but after much searching, and after reading the discussion at
http://stackoverflow.com/questions/24206660/does-groupbykey-in-spark-preserve-the-original-order
I still can't find the code that does this.


Thanks!
