Hi all,

Is the sort order guaranteed to be preserved if you apply operations like
map(), filter(), or distinct() after a sort in a distributed setting (i.e.,
running on a cluster of machines backed by HDFS)? In other words, does
rdd.sortByKey().map() have the same sort order as rdd.sortByKey()? If so, is
it documented anywhere which operations preserve sort order and which don't?
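
To make the question concrete, here is a toy model in plain Python (not
Spark; every helper name below is made up for illustration) of the behaviour
I would hope for: map() and filter() applied partition by partition, with no
data moving between partitions. Whether Spark actually guarantees this, and
what happens with distinct(), is exactly what I am asking.

```python
# Toy model only: a "sorted RDD" is a list of partitions, each a list of
# (key, value) pairs, with keys ascending within and across partitions.
# None of this is Spark API; the helper names are hypothetical.

def sort_by_key(pairs, num_partitions):
    """Sort pairs by key and split them into contiguous range partitions."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    size = max(1, -(-len(ordered) // num_partitions))  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def map_partitions(partitions, fn):
    """Apply fn to every element in place, without moving data."""
    return [[fn(kv) for kv in part] for part in partitions]

def filter_partitions(partitions, pred):
    """Keep elements matching pred, without moving data."""
    return [[kv for kv in part if pred(kv)] for part in partitions]

def collect(partitions):
    """Concatenate partitions in partition order."""
    return [kv for part in partitions for kv in part]

data = [(3, "c"), (1, "a"), (4, "d"), (2, "b"), (5, "e")]
sorted_parts = sort_by_key(data, num_partitions=2)

# Element-wise map that leaves the keys untouched.
mapped = map_partitions(sorted_parts, lambda kv: (kv[0], kv[1].upper()))
# Filter that keeps only odd keys.
filtered = filter_partitions(mapped, lambda kv: kv[0] % 2 == 1)

keys_after_map = [k for k, _ in collect(mapped)]
keys_after_filter = [k for k, _ in collect(filtered)]
```

In this model the key order survives both steps because nothing is ever
moved between partitions. My question is whether the real operations behave
the same way.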

Thanks,
Mingyu

