mridulm commented on code in PR #38711:
URL: https://github.com/apache/spark/pull/38711#discussion_r1038740740


##########
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala:
##########
@@ -383,8 +383,8 @@ private[spark] class DAGScheduler(
   /**
    * Called by the TaskSetManager when it decides a speculative task is needed.
    */
-  def speculativeTaskSubmitted(task: Task[_]): Unit = {
-    eventProcessLoop.post(SpeculativeTaskSubmitted(task))
+  def speculativeTaskSubmitted(task: Task[_], taskIndex: Int = -1): Unit = {
+    eventProcessLoop.post(SpeculativeTaskSubmitted(task, taskIndex))

Review Comment:
   We are trying to identify whether two tasks in a stage attempt are computing 
the same partition.
   Whether this uses partitionId or taskIndex, the two are effectively 
equivalent for that purpose. Yes, the values won't match when only a subset of 
partitions is computed in a stage attempt, but that is expected.
   
   Having said that, my intention was to minimize changes to the external API, 
not necessarily to the internal implementation.
   It looks like we are not able to do that... any thoughts, @Ngone51?
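
   To make the partitionId vs. taskIndex point concrete, here is a minimal 
sketch. It assumes taskIndex is simply a task's position within the task set 
of one stage attempt. `SketchTask` and `indexed` are hypothetical names used 
for illustration only and are not Spark APIs.

```scala
object TaskIndexSketch {
  // Hypothetical, simplified stand-in for a task; not Spark's Task class.
  final case class SketchTask(partitionId: Int)

  // Builds one task per partition to compute and returns
  // (taskIndex, partitionId) pairs, where taskIndex is assumed to be
  // the task's position within this attempt's task set.
  def indexed(partitionsToCompute: Seq[Int]): Seq[(Int, Int)] =
    partitionsToCompute
      .map(SketchTask(_))
      .zipWithIndex
      .map { case (task, taskIndex) => (taskIndex, task.partitionId) }

  def main(args: Array[String]): Unit = {
    // First attempt: every partition is computed, so the two values coincide.
    println(indexed(0 until 4))
    // Retry attempt: only partitions 1 and 3 are recomputed, so the values
    // differ, but each taskIndex still maps to exactly one partitionId.
    println(indexed(Seq(1, 3)))
  }
}
```

   Under that assumption the mapping is one-to-one within an attempt, so 
comparing either value answers the "same partition?" question equally well.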



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


