Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21698#discussion_r200049016
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -461,9 +464,12 @@ abstract class RDD[T: ClassTag](
} : Iterator[(Int, T)]
// include a shuffle step so that our upstream tasks are still distributed
+ val recomputeOnFailure =
+   conf.getBoolean("spark.shuffle.recomputeAllPartitionsOnRepartitionFailure", true)
--- End diff --
Without sorting, it doesn't make sense to have this config: disabling it
means users will get wrong results.
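
To illustrate the concern, here is a minimal sketch in plain Scala with no Spark API involved. The object `RoundRobinSketch`, the helper `roundRobin`, and the sample records are all hypothetical and only model the idea: round-robin assignment depends on the order in which records arrive, so if a retried upstream task produces the same records in a different order and only some downstream partitions are recomputed, the combined output can duplicate or drop records. Sorting the input first is one way to make the assignment deterministic, which is how I read the "sorting" in the comment.

```scala
object RoundRobinSketch {
  // Hypothetical helper: assign records to partitions purely by arrival order.
  def roundRobin[T](records: Seq[T], numPartitions: Int): Map[Int, Seq[T]] =
    records.zipWithIndex
      .groupBy { case (_, i) => i % numPartitions }
      .map { case (p, rs) => p -> rs.map(_._1) }

  def main(args: Array[String]): Unit = {
    // Same records, but the retried attempt sees them in a different order.
    val firstAttempt = Seq("a", "b", "c", "d")
    val retryAttempt = Seq("b", "a", "c", "d")

    // Partition 0 was already consumed from the first attempt;
    // only partition 1 is recomputed from the retry.
    val mixed = roundRobin(firstAttempt, 2)(0) ++ roundRobin(retryAttempt, 2)(1)
    println(s"without sorting: $mixed")

    // Sorting before distribution makes both attempts agree.
    val stable =
      roundRobin(firstAttempt.sorted, 2)(0) ++ roundRobin(retryAttempt.sorted, 2)(1)
    println(s"with sorting:    $stable")
  }
}
```

In the unsorted case the mixed result contains "a" twice and loses "b"; in the sorted case every record appears exactly once. Under this model, turning off full recomputation is only safe once the assignment is deterministic.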