GitHub user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/11105#discussion_r55444458
--- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala ---
@@ -184,4 +184,13 @@ abstract class TaskContext extends Serializable {
*/
private[spark] def registerAccumulator(a: Accumulable[_, _]): Unit
+ /**
+ * Set the current RDD and partition being processed.
+ */
+ private[spark] def setRDDPartitionInfo(rddId: Int, index: Int): Unit
+
+ /**
+ * Returns the current RDD and Partition ids being processed. May be null.
+ */
+ private[spark] def getRDDPartitionInfo(): (Int, Int)
--- End diff --
Why would this be null?
I'd also add a comment that this can be used to track whether an RDD has
been computed previously -- if it gets recomputed for any reason, the ids will
be the same. (There are so many different types of ids running around that I
think it's easy to get confused.)
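
To make both points concrete, here is a rough, standalone sketch. It is not Spark code: `PartitionInfoHolder` and `ComputeTracker` are hypothetical names invented for illustration, and only the two method names are borrowed from the diff. It assumes the getter could return an `Option` rather than a possibly-null tuple, and it shows how a stable (rddId, partition index) pair could be used to notice a recomputation.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the per-task state in the diff; NOT Spark's TaskContext.
final class PartitionInfoHolder {
  private var info: Option[(Int, Int)] = None

  // Record the RDD id and partition index currently being processed.
  def setRDDPartitionInfo(rddId: Int, index: Int): Unit = {
    info = Some((rddId, index))
  }

  // Returns None until setRDDPartitionInfo has been called, instead of null.
  def getRDDPartitionInfo(): Option[(Int, Int)] = info
}

// Hypothetical tracker keyed by (rddId, partitionIndex).
final class ComputeTracker {
  private val seen = mutable.Set.empty[(Int, Int)]

  // mutable.Set#add returns false when the element was already present,
  // so this returns true exactly when the pair has been recorded before.
  def recordAndCheckRecomputed(key: (Int, Int)): Boolean = !seen.add(key)
}
```

Because the ids stay the same across recomputation, a second call to `recordAndCheckRecomputed` with the same pair reports a recompute, while a different partition index of the same RDD does not.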