Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/22112#discussion_r213010846
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1918,3 +1980,19 @@ object RDD {
new DoubleRDDFunctions(rdd.map(x => num.toDouble(x)))
}
}
+
+/**
+ * The random level of RDD's output (i.e. what `RDD#compute` returns), which indicates how the
+ * output will diff when Spark reruns the tasks for the RDD. There are 3 random levels, ordered
+ * by the randomness from low to high:
+ * 1. IDEMPOTENT: The RDD output is always same (including order) when rerun.
--- End diff --
Here too, "idempotent" is the wrong word for this ... deterministic?
partition-ordered? (I guess "ordered" could make it seem like the entire data
is ordered ...)
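
For illustration only, here is a minimal sketch of the distinction the naming has to capture: output that is identical on rerun including order, versus output that has the same elements in a possibly different order. It uses plain Scala collections as stand-ins for one partition's output across two runs. `OutputLevels` and its helpers are hypothetical names, not part of the PR or of Spark.

```scala
// Hypothetical illustration, not part of the PR: plain Seqs stand in for
// what one partition's compute might return on two runs of the same task.
object OutputLevels {
  // Same elements in the same order.
  def sameIncludingOrder[T](a: Seq[T], b: Seq[T]): Boolean = a == b

  // Same elements with the same multiplicities, order ignored.
  def sameIgnoringOrder[T](a: Seq[T], b: Seq[T]): Boolean = {
    def counts(s: Seq[T]): Map[T, Int] =
      s.groupBy(identity).map { case (k, v) => k -> v.size }
    counts(a) == counts(b)
  }

  def main(args: Array[String]): Unit = {
    val firstRun = Seq(1, 2, 3)
    val rerunSameOrder = Seq(1, 2, 3)
    val rerunShuffled = Seq(3, 1, 2)

    println(sameIncludingOrder(firstRun, rerunSameOrder)) // true
    println(sameIncludingOrder(firstRun, rerunShuffled))  // false
    println(sameIgnoringOrder(firstRun, rerunShuffled))   // true
  }
}
```

Only the first check corresponds to the level described in the doc comment ("always same (including order)"). The shuffled rerun passes the order-insensitive check but not the order-sensitive one.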