Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/22112#discussion_r213009399
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -33,6 +33,9 @@ import org.apache.spark.util.random.SamplingUtils
/**
* An object that defines how the elements in a key-value pair RDD are partitioned by key.
* Maps each key to a partition ID, from 0 to `numPartitions - 1`.
+ *
+ * Note that, partitioner must be idempotent, i.e. it must return the same partition id given the
--- End diff --
I think you mean deterministic, not idempotent (which would mean that
`partition(key) == partition(partition(key))`)
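
To make the distinction concrete, here is a minimal sketch. `SimplePartitioner`, `ShiftedPartitioner` and `RandomPartitioner` are hypothetical stand-ins written for this example so that it runs without Spark; they only mirror the `numPartitions` / `getPartition(key: Any)` shape of `org.apache.spark.Partitioner` and are not code from this PR.

```scala
import scala.util.Random

// Hypothetical stand-in for org.apache.spark.Partitioner (assumption: only
// the two members used below matter for this illustration).
abstract class SimplePartitioner {
  def numPartitions: Int
  def getPartition(key: Any): Int
}

// Deterministic: the result depends only on the key and numPartitions,
// so the same key always maps to the same partition ID.
// It is NOT idempotent: applying it to its own output changes the result.
class ShiftedPartitioner(val numPartitions: Int) extends SimplePartitioner {
  def getPartition(key: Any): Int =
    Math.floorMod(key.hashCode + 1, numPartitions) // always in [0, numPartitions)
}

// Not deterministic: the same key may land in a different partition on
// every call, which is what the doc comment is trying to rule out.
class RandomPartitioner(val numPartitions: Int) extends SimplePartitioner {
  private val rng = new Random()
  def getPartition(key: Any): Int = rng.nextInt(numPartitions)
}

object PartitionerDemo {
  def main(args: Array[String]): Unit = {
    val p = new ShiftedPartitioner(4)

    // Deterministic: repeated calls with the same key agree.
    assert(p.getPartition("spark") == p.getPartition("spark"))

    // Idempotence would require p(p(key)) == p(key), which does not hold here.
    val once = p.getPartition(6)     // (6 + 1) mod 4 = 3
    val twice = p.getPartition(once) // (3 + 1) mod 4 = 0
    assert(once == 3 && twice == 0)

    println(s"once=$once twice=$twice")
  }
}
```

`ShiftedPartitioner` satisfies the property the doc comment actually needs (same key, same partition ID) while failing the idempotence equation above, which is why "deterministic" looks like the right word.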