mridulm commented on code in PR #38711: URL: https://github.com/apache/spark/pull/38711#discussion_r1038740740
##########
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala:
##########

@@ -383,8 +383,8 @@ private[spark] class DAGScheduler(

   /**
    * Called by the TaskSetManager when it decides a speculative task is needed.
    */
-  def speculativeTaskSubmitted(task: Task[_]): Unit = {
-    eventProcessLoop.post(SpeculativeTaskSubmitted(task))
+  def speculativeTaskSubmitted(task: Task[_], taskIndex: Int = -1): Unit = {
+    eventProcessLoop.post(SpeculativeTaskSubmitted(task, taskIndex))

Review Comment:
   We are trying to identify whether two tasks for a stage attempt are computing the same partition or not. Whether this uses partitionId or taskIndex, they are effectively the same - yes, the values won't match when only a subset of partitions is computed in a stage attempt, but that is expected.

   Having said that, my intention was to minimize changes to the external API, not necessarily the internal implementation. It looks like we are not able to do that ... any thoughts @Ngone51 ?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
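[Editor's note] The taskIndex-vs-partitionId distinction raised in the review comment can be sketched as follows. This is a hypothetical Scala illustration only - `RetryTask`, the partition values, and the indexing scheme are invented for the example and are not Spark's actual TaskSetManager implementation:

```scala
// Hypothetical sketch: why taskIndex and partitionId diverge when a
// stage attempt computes only a subset of partitions.
case class RetryTask(taskIndex: Int, partitionId: Int)

object TaskIndexVsPartitionId extends App {
  // Suppose the original stage covered partitions 0..4, but only
  // partitions 2 and 4 failed and must be recomputed in this attempt.
  val missingPartitions = Seq(2, 4)

  // Tasks are indexed 0..n-1 within the task set, regardless of which
  // partitions they actually compute.
  val tasks = missingPartitions.zipWithIndex.map {
    case (partition, index) => RetryTask(index, partition)
  }

  // taskIndex 0 -> partitionId 2, taskIndex 1 -> partitionId 4: the raw
  // values differ, but within a single stage attempt either one uniquely
  // identifies "the same partition", which is the point of the comment.
  tasks.foreach { t =>
    println(s"taskIndex=${t.taskIndex} -> partitionId=${t.partitionId}")
  }
}
```

Within one attempt the mapping is a bijection, so two speculative tasks compare equal under taskIndex exactly when they compare equal under partitionId; the values only disagree in absolute terms, as the comment notes for partial stage attempts.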