[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

squito Mon, 27 Aug 2018 08:36:03 -0700

Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22112#discussion_r213009399
  
    --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
    @@ -33,6 +33,9 @@ import org.apache.spark.util.random.SamplingUtils
     /**
      * An object that defines how the elements in a key-value pair RDD are partitioned by key.
      * Maps each key to a partition ID, from 0 to `numPartitions - 1`.
    + *
    + * Note that, partitioner must be idempotent, i.e. it must return the same partition id given the
    --- End diff --
    
    I think you mean deterministic, not idempotent. Idempotent would mean that 
    `partition(key) == partition(partition(key))`.
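
To make the distinction concrete, here is a minimal sketch in plain Scala with no Spark dependency. The names `PartitionerProperties`, `scrambledPartition`, `clamp`, and the partition count of 4 are all hypothetical and chosen only for illustration; this is not the code from the PR. It shows a function that is deterministic (the same key always gives the same id) without being idempotent, next to one that happens to be idempotent.

```scala
object PartitionerProperties {
  // Hypothetical partition count, chosen only for this illustration.
  val numPartitions: Int = 4

  // Maps any Int into the range [0, mod), even when x is negative.
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Deterministic: the result depends only on the key.
  // Not idempotent: feeding a partition id back in can give a different id.
  def scrambledPartition(key: Any): Int =
    nonNegativeMod(key.hashCode * 31 + 7, numPartitions)

  // Idempotent on Int inputs: clamp(clamp(x)) == clamp(x).
  def clamp(x: Int): Int = nonNegativeMod(x, numPartitions)

  def main(args: Array[String]): Unit = {
    val key = 5
    // Determinism: repeated calls with the same key agree.
    assert(scrambledPartition(key) == scrambledPartition(key))
    // Idempotence fails for this deterministic function.
    assert(scrambledPartition(scrambledPartition(key)) != scrambledPartition(key))
    // Idempotence holds for clamp.
    assert(clamp(clamp(9)) == clamp(9))
    println(s"partition($key) = ${scrambledPartition(key)}")
  }
}
```

The property the doc comment needs is the first one: calling the function twice with the same key must agree. Whether applying the function to its own output is stable is a separate question and is not what a partitioner contract usually asks for.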


