[GitHub] spark pull request #21698: [SPARK-23243][Core] Fix RDD.repartition() data co...

cloud-fan Wed, 04 Jul 2018 01:36:01 -0700

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21698#discussion_r200049016
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -461,9 +464,12 @@ abstract class RDD[T: ClassTag](
           } : Iterator[(Int, T)]
     
           // include a shuffle step so that our upstream tasks are still distributed
    +      val recomputeOnFailure =
    +        conf.getBoolean("spark.shuffle.recomputeAllPartitionsOnRepartitionFailure", true)
    --- End diff --
    
    Without sorting, it doesn't make sense to have this config: disabling it
    means users will get wrong results.
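
    A minimal sketch of why the result can be wrong, written without any Spark
    dependency. The object name, the helper `roundRobin`, and the fixed start
    offset of 0 are assumptions made for illustration, not the code in this PR.
    It assigns records to output partitions by position, then mixes one output
    partition from a first attempt with one recomputed from a retry that saw
    the same records in a different order.

```scala
object RoundRobinRetrySketch {
  // Hypothetical helper: assign each record to an output partition by its
  // position in the input, a simplified stand-in for round-robin repartitioning.
  def roundRobin[T](records: Seq[T], numPartitions: Int): Map[Int, Seq[T]] =
    records.zipWithIndex
      .groupBy { case (_, i) => i % numPartitions }
      .map { case (p, rs) => p -> rs.map(_._1) }

  def main(args: Array[String]): Unit = {
    val firstAttempt = Seq("a", "b", "c", "d")
    // A retried upstream task may produce the same records in another order.
    val retryAttempt = Seq("b", "a", "d", "c")

    val first = roundRobin(firstAttempt, 2)
    val retry = roundRobin(retryAttempt, 2)

    // Assume output partition 0 was kept from the first attempt and only
    // output partition 1 was recomputed from the retry.
    val combined = first(0) ++ retry(1)
    println(s"without sorting: $combined") // "a" and "c" twice; "b" and "d" lost

    // Sorting the input first makes the assignment independent of arrival order.
    val sortedRetry = roundRobin(retryAttempt.sorted, 2)
    println(s"with sorting:    ${first(0) ++ sortedRetry(1)}")
  }
}
```

    With sorted input, the retried attempt assigns records exactly as the first
    attempt did, so recomputing only the failed partition stays consistent.
    Without it, the only safe choice is to recompute every partition, which is
    why a switch that turns that off has no correct setting.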


