Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/22112#discussion_r212386645
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1865,6 +1876,39 @@ abstract class RDD[T: ClassTag](
    // RDD chain.
  @transient protected lazy val isBarrier_ : Boolean =
    dependencies.filter(!_.isInstanceOf[ShuffleDependency[_, _, _]]).exists(_.rdd.isBarrier())
+
+  /**
+   * Returns the random level of this RDD's computing function. Please refer to
+   * [[RDD.RandomLevel]] for the definition of random level.
+   *
+   * By default, an RDD without parents (a root RDD) is IDEMPOTENT. For RDDs with parents, the
+   * random level of the current RDD is the most random level among its parents.
+   */
+  // TODO: make it public so users can set the random level for their custom RDDs.
+  // TODO: this can be per-partition, e.g. UnionRDD can have a different random level for
+  // different partitions.
+  private[spark] def computingRandomLevel: RDD.RandomLevel.Value = {
+    val parentRandomLevels = dependencies.map {
+      case dep: ShuffleDependency[_, _, _] =>
+        if (dep.rdd.computingRandomLevel == RDD.RandomLevel.INDETERMINATE) {
+          RDD.RandomLevel.INDETERMINATE
--- End diff --
Crap, brain fart ... you are right, it is UNORDERED and not INDETERMINATE
... I am still getting my head around the terms :-(
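
As an aside, here is a minimal, self-contained Scala sketch of the propagation rule being discussed. It is not the PR's actual code; `RandomLevelSketch`, `Node`, `Dep`, and `randomLevel` are made-up stand-ins for RDDs, dependencies, and `computingRandomLevel`. It only illustrates why a shuffle over non-INDETERMINATE input ends up UNORDERED rather than INDETERMINATE, and why the "most random" parent level wins.

    // Hypothetical stand-ins; none of these are Spark classes.
    object RandomLevelSketch {
      object RandomLevel extends Enumeration {
        // IDEMPOTENT: re-computation yields the same data in the same order.
        // UNORDERED: re-computation yields the same data, possibly in a different order.
        // INDETERMINATE: re-computation may yield different data.
        val IDEMPOTENT, UNORDERED, INDETERMINATE = Value
      }

      case class Dep(parent: Node, isShuffle: Boolean) // stand-in for a Dependency
      case class Node(deps: Seq[Dep])                  // stand-in for an RDD

      def randomLevel(node: Node): RandomLevel.Value = {
        if (node.deps.isEmpty) {
          RandomLevel.IDEMPOTENT // a root node is treated as idempotent by default
        } else {
          node.deps.map { dep =>
            val parentLevel = randomLevel(dep.parent)
            if (dep.isShuffle) {
              // A shuffle over indeterminate input stays indeterminate; otherwise
              // the shuffle only loses ordering guarantees, hence UNORDERED.
              if (parentLevel == RandomLevel.INDETERMINATE) RandomLevel.INDETERMINATE
              else RandomLevel.UNORDERED
            } else {
              parentLevel
            }
          }.max // Enumeration values are ordered, so max picks the most random level
        }
      }

      def main(args: Array[String]): Unit = {
        // Example: root -> shuffle -> map ends up UNORDERED, not INDETERMINATE.
        val root = Node(Nil)
        val shuffled = Node(Seq(Dep(root, isShuffle = true)))
        val mapped = Node(Seq(Dep(shuffled, isShuffle = false)))
        assert(randomLevel(mapped) == RandomLevel.UNORDERED)
        println(randomLevel(mapped)) // prints UNORDERED
      }
    }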