tgravescs commented on issue #24497: [SPARK-27630][CORE]Stage retry causes 
totalRunningTasks calculation to be negative
URL: https://github.com/apache/spark/pull/24497#issuecomment-491441977
 
 
   So the reason it's negative is that the task end event comes after the new 
stage attempt has started, which causes it to decrement the stage's entry in 
the stageIdToNumRunningTask map.  Yeah, I'm surprised we didn't see that more, 
but it would be timing dependent.
   What @squito says makes sense to me. 
   I was just looking a bit and I wonder if we have an issue with 
stageIdToSpeculativeTaskIndices as well.  If the stageId gets put back in 
there, you could have an issue, though it seems unlikely.  I wonder if it makes 
sense to look at using the stage attempt id for a few of these.  I think you 
could have a similar problem with stageIdToTaskIndices if the new stage attempt 
had started other tasks before you got the task end.  I would have to take a 
more thorough look to verify.
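
   To make the ordering concrete, here is a minimal sketch of the race. It is 
a simplified Python model, not the actual Spark listener code: the class 
RunningTaskTracker, its methods, and the replay helper are hypothetical names 
made up for illustration. It assumes the stage's entry is dropped when an 
attempt completes and reset to 0 when a new attempt is submitted, and it 
compares keying the map by stage id alone against keying it by 
(stage id, stage attempt id).

```python
class RunningTaskTracker:
    """Hypothetical, simplified model of per-stage running-task bookkeeping."""

    def __init__(self, key_by_attempt):
        self.key_by_attempt = key_by_attempt
        self.running = {}  # key -> number of running tasks

    def _key(self, stage_id, attempt_id):
        # Either track per stage, or per (stage, attempt) pair.
        return (stage_id, attempt_id) if self.key_by_attempt else stage_id

    def on_stage_submitted(self, stage_id, attempt_id):
        self.running[self._key(stage_id, attempt_id)] = 0

    def on_stage_completed(self, stage_id, attempt_id):
        self.running.pop(self._key(stage_id, attempt_id), None)

    def on_task_start(self, stage_id, attempt_id):
        key = self._key(stage_id, attempt_id)
        if key in self.running:
            self.running[key] += 1

    def on_task_end(self, stage_id, attempt_id):
        key = self._key(stage_id, attempt_id)
        # Events for an unknown key are ignored.
        if key in self.running:
            self.running[key] -= 1

    def total_running_tasks(self):
        return sum(self.running.values())


def replay(tracker):
    """Replay the event order described above for stage 0."""
    tracker.on_stage_submitted(0, 0)   # attempt 0 starts
    tracker.on_task_start(0, 0)        # one task running in attempt 0
    tracker.on_stage_completed(0, 0)   # attempt 0 ends; its entry is removed
    tracker.on_stage_submitted(0, 1)   # retry: attempt 1 starts at 0
    tracker.on_task_end(0, 0)          # late task end from attempt 0
    return tracker.total_running_tasks()


print(replay(RunningTaskTracker(key_by_attempt=False)))  # -1
print(replay(RunningTaskTracker(key_by_attempt=True)))   # 0
```

   With the stage-id key, the late task end from attempt 0 lands on the fresh 
entry created for attempt 1 and drives it to -1. With the attempt in the key, 
the late event finds no matching entry and is dropped. Whether the same 
approach is right for the other maps would still need the closer look 
mentioned above.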

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
