jiangxb1987 commented on a change in pull request #25751: [SPARK-29042][Core] Sampling-based RDD with unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25751#discussion_r323965139
##########
File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala
##########
@@ -870,6 +870,29 @@ abstract class RDD[T: ClassTag](
preservesPartitioning)
}
+ /**
+ * Return a new RDD by applying a function to each partition of this RDD, while tracking the index
+ * of the original partition.
+ *
+ * `preservesPartitioning` indicates whether the input function preserves the partitioner, which
+ * should be `false` unless this is a pair RDD and the input function doesn't modify the keys.
+ *
+ * `isOrderSensitive` indicates whether the function is order-sensitive. If it is order
+ * sensitive, it may return totally different result when the input order
+ * is changed. Mostly stateful functions are order-sensitive.
+ */
+ private[spark] def mapPartitionsWithIndex[U: ClassTag](
Review comment:
Shall we expose this to users?
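
For context on what "order-sensitive" means in the Scaladoc above, here is a minimal sketch in plain Scala. It does not call Spark or the `isOrderSensitive` overload from this PR. The object name and both helper functions are hypothetical, chosen only to match the `(Int, Iterator[T]) => Iterator[U]` shape that `mapPartitionsWithIndex` accepts. A position-based filter stands in for a sampler here, as an assumption for illustration.

```scala
object OrderSensitivitySketch {
  // Hypothetical order-sensitive function: it keeps the elements at even
  // positions of the iterator, so the result depends on the input order.
  def keepEveryOther(index: Int, iter: Iterator[Int]): Iterator[Int] =
    iter.zipWithIndex.collect { case (x, pos) if pos % 2 == 0 => x }

  // Hypothetical order-insensitive function: each output depends only on
  // its own input element, so reordering the input only reorders the output.
  def double(index: Int, iter: Iterator[Int]): Iterator[Int] =
    iter.map(_ * 2)

  def main(args: Array[String]): Unit = {
    val original  = Seq(1, 2, 3, 4)
    val reordered = Seq(4, 3, 2, 1) // same elements, different order

    println(keepEveryOther(0, original.iterator).toList)  // List(1, 3)
    println(keepEveryOther(0, reordered.iterator).toList) // List(4, 2)

    // The stateless function yields the same set of values either way.
    println(double(0, original.iterator).toSet == double(0, reordered.iterator).toSet) // true
  }
}
```

The position-based filter returns a different set of elements once the partition is reordered, which is the situation the `isOrderSensitive` flag is meant to describe. The stateless map does not have that problem.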
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services