It should keep them in order, but what kind of collection do you have? Maybe `toArray` changes the order before the data ever reaches the RDD.
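
One way to check the `toArray` possibility without involving Spark at all might be a rough sketch like the one below. It only compares the output of `toArray` with the insertion order for a few standard Scala collections. The object name `ToArrayOrderCheck`, the helper `keepsInsertionOrder`, and the sample values are made up for illustration and are not taken from your code.

```scala
import scala.collection.mutable

// Hypothetical helper object, not part of Spark or of the original code.
object ToArrayOrderCheck {

  // True when collection.toArray yields the elements in the order they were inserted.
  def keepsInsertionOrder(inserted: Seq[Int], collection: Iterable[Int]): Boolean =
    collection.toArray.sameElements(inserted)

  def main(args: Array[String]): Unit = {
    // Made-up sample data standing in for the real collection.
    val inserted = Seq(50, 3, 41, 7, 29, 18, 64, 12)

    val asList      = inserted.toList                       // sequences keep their order
    val asLinkedSet = mutable.LinkedHashSet(inserted: _*)   // remembers insertion order
    val asHashSet   = mutable.HashSet(inserted: _*)         // iteration order is unspecified

    println(s"List keeps order:          ${keepsInsertionOrder(inserted, asList)}")
    println(s"LinkedHashSet keeps order: ${keepsInsertionOrder(inserted, asLinkedSet)}")
    // For a hash-based set the result depends on the hashing, so no order is guaranteed.
    println(s"HashSet keeps order:       ${keepsInsertionOrder(inserted, asHashSet)}")
  }
}
```

If the check comes back false for your collection type, the reordering is happening before the data reaches `makeRDD`, and switching to a `Seq` or a `LinkedHashSet` would be one way to preserve the order.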

Matei

On Apr 23, 2014, at 8:21 AM, Adrian Mocanu <amoc...@verticalscope.com> wrote:

> How does the default Spark partitioner partition RDD data? Does it keep the
> data in order?
>
> I’m asking because I’m generating an RDD by hand via
> `ssc.sparkContext.makeRDD(collection.toArray)` and I collect and iterate over
> what I collect, but the data is in a different order than in the initial
> collection from which the RDD comes.
>
> -Adrian