Github user mridulm commented on the issue:

    https://github.com/apache/spark/pull/22112

@tgravescs I was specifically in agreement with

> Personally I don't want to talk about implementation until we decide what we want our semantics to be around the unordered operations because that affects any implementation.

and

> I would propose we fix the things that are using the round robin type partitioning (repartition) but then unordered things like zip/MapPartitions (via user code) we document or perhaps give the user the option to sort.

IMO a fix in Spark core for `repartition` should work for most (if not all) order-dependent closures; we might choose not to implement it for the others due to time constraints, but the basic idea should be fairly similar. Given this, I am fine with documenting the potential issue for the others and fixing the core subset, with the assumption that we will expand the solution to cover all cases later.
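To make the round-robin concern concrete, here is a minimal, hypothetical sketch (plain Python, not Spark's actual implementation): round-robin partitioning assigns records by arrival position, so if a retried upstream task yields the same records in a different order, the output partitions differ; sorting the input first (the "option to sort" idea quoted above) restores determinism.

```python
def round_robin(records, num_partitions):
    # Assign each record to a partition by its position in the
    # stream, analogous in spirit to repartition(n)'s round-robin.
    parts = [[] for _ in range(num_partitions)]
    for i, rec in enumerate(records):
        parts[i % num_partitions].append(rec)
    return parts

original = [3, 1, 4, 1, 5, 9]
recomputed = [1, 3, 4, 1, 9, 5]  # same records, different order after a retry

# Without an ordering guarantee, the partition contents differ:
assert round_robin(original, 2) != round_robin(recomputed, 2)

# Sorting first makes the assignment independent of arrival order:
assert round_robin(sorted(original), 2) == round_robin(sorted(recomputed), 2)
```

This is only an illustration of why order-dependent partitioning breaks under task recomputation; the real fix in Spark core would operate on shuffle/task semantics, not user-level lists.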