Here's my understanding of the row-order guarantees of RDDs in the context of limit() and collect(). Can someone confirm this?

* sparkContext.parallelize(myList) returns an RDD that may have a different row order than myList.
* Every RDD loaded from the same file in HDFS (e.g. sparkContext.textFile("hdfs://path_to_file")) will collect rows in the same order.
* Row order of an RDD is preserved through non-shuffling operations (e.g. map, filter).

Mingyu
