vanzin commented on a change in pull request #26218: [SPARK-29562][sql] Speed 
up and slim down metric aggregation in SQL listener.
URL: https://github.com/apache/spark/pull/26218#discussion_r338837803
 
 

 ##########
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala
 ##########
 @@ -208,43 +246,13 @@ class SQLAppStatusListener(
       stageId: Int,
       attemptId: Int,
       taskId: Long,
+      partIdx: Int,
 
 Review comment:
   So this is what happens.
   
   When you don't have speculative tasks, that comment in `TaskInfo` is 
true: the value is the task's index in the task set.
   
   When you add speculative tasks, that is no longer true; the speculative 
task will have the same index as some other task in the same task set. 
Speculation might have been added after `TaskInfo`, and the comment was never 
updated.
   
   So the gist is that this `index` is basically a proxy for the partition 
index: two tasks with the same index are computing the same partition. So I 
think it's more important to track the intent (that this field refers, even if 
indirectly, to the partition being calculated) than to try to be exact about 
its meaning.
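
   To make the point concrete, here is a minimal sketch. It assumes a 
simplified, hypothetical `TaskAttempt` class standing in for `TaskInfo` (it is 
not Spark's real class, and the `speculative` flag and the sample values are 
made up for illustration). It only shows that task IDs stay unique while 
`index` repeats for a speculative attempt.

```scala
// Hypothetical, simplified stand-in for Spark's TaskInfo; not the real class.
final case class TaskAttempt(taskId: Long, index: Int, speculative: Boolean)

object SpeculationSketch {
  // Three partitions; the task for index 1 gets a speculative second attempt.
  val attempts: Seq[TaskAttempt] = Seq(
    TaskAttempt(taskId = 0L, index = 0, speculative = false),
    TaskAttempt(taskId = 1L, index = 1, speculative = false),
    TaskAttempt(taskId = 2L, index = 2, speculative = false),
    TaskAttempt(taskId = 3L, index = 1, speculative = true)
  )

  // Group attempts by index: the task ID is unique per attempt, but the
  // speculative attempt shares its index with the original task.
  val byPartition: Map[Int, Seq[Long]] =
    attempts.groupBy(_.index).map { case (idx, as) => idx -> as.map(_.taskId) }

  def main(args: Array[String]): Unit =
    byPartition.toSeq.sortBy(_._1).foreach { case (idx, ids) =>
      println(s"partition index $idx -> task ids ${ids.mkString(", ")}")
    }
}
```

   In this toy data, task IDs 1 and 3 both land under index 1, which is why 
naming the parameter after the partition (`partIdx`) captures the intent 
better than naming it after the position in the task set.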

