lyy-pineapple opened a new pull request, #41723:
URL: https://github.com/apache/spark/pull/41723

   
   ### What changes were proposed in this pull request?
   The dynamic scheduler listener keeps track of the number of running 
speculative tasks. With this change, when a non-speculative task fails while 
its corresponding speculative task is still running, the task index is no 
longer removed from stageAttemptToTaskIndices, so the task is not counted as 
pending again.
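
   The bookkeeping can be sketched as a minimal, hypothetical Python model. This is not Spark source; the names only loosely mirror the listener's per-stage-attempt index maps, and `ListenerModel` and its methods are invented for illustration:

```python
import collections

# Hypothetical, simplified model of the listener bookkeeping described above
# (illustration only, not the ExecutorAllocationManager implementation).
class ListenerModel:
    def __init__(self):
        # task indices currently tracked per stage attempt
        self.stage_attempt_to_task_indices = collections.defaultdict(set)
        # indices of running speculative copies per stage attempt
        self.stage_attempt_to_speculative_indices = collections.defaultdict(set)

    def on_task_start(self, stage_attempt, index, speculative):
        if speculative:
            self.stage_attempt_to_speculative_indices[stage_attempt].add(index)
        else:
            self.stage_attempt_to_task_indices[stage_attempt].add(index)

    def on_task_end(self, stage_attempt, index, speculative, failed):
        if speculative:
            self.stage_attempt_to_speculative_indices[stage_attempt].discard(index)
        elif failed and index in self.stage_attempt_to_speculative_indices[stage_attempt]:
            # The proposed fix: a failed non-speculative task whose speculative
            # copy is still running keeps its index, so the task is not counted
            # as pending again when speculation resubmits it.
            pass
        elif failed:
            self.stage_attempt_to_task_indices[stage_attempt].discard(index)
```

   In this model, after Task 1.0 fails while speculative Task 1.1 is still running, index 1 stays tracked, so a later speculative resubmission does not double-count the task.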
   
   
   
   ### Why are the changes needed?
   Assuming a stage has Task 1, with Task 1.0 and a speculative copy Task 1.1 
running concurrently, the dynamic scheduler calculates the number of executors 
as 2 (pendingTask: 0, pendingSpeculative: 0, running: 2).
   
   At this point, Task 1.0 fails, and the dynamic scheduler recalculates the 
number of executors as 2 (pendingTask: 1, pendingSpeculative: 0, running: 1).
   
   Due to the failure of Task 1.0, copyRunning(1) becomes 1. As a result, Task 
1 is speculated again and a SparkListenerSpeculativeTaskSubmitted event is 
triggered. However, the dynamic scheduler now calculates the number of 
executors as 3 (pendingTask: 1, pendingSpeculative: 1, running: 1): the failed 
task is counted both as pending and as pending speculative, which is obviously 
not as expected.
   
   Then, Task 1.2 starts and is marked as a speculative task. However, the 
dynamic scheduler still calculates the number of executors as 3 (pendingTask: 
1, pendingSpeculative: 1, running: 1), which again is not as expected.
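
   The scenario above reduces to simple arithmetic. The sketch below assumes one task per executor slot; `executors_needed` is a hypothetical helper for illustration, not the Spark API:

```python
import math

# Hypothetical helper: executor target as the sum of pending, pending
# speculative, and running tasks, divided by slots per executor.
def executors_needed(pending, pending_speculative, running, tasks_per_executor=1):
    return math.ceil((pending + pending_speculative + running) / tasks_per_executor)

# Before the failure: Task 1.0 and speculative Task 1.1 both running.
print(executors_needed(0, 0, 2))  # 2
# After Task 1.0 fails and Task 1 is speculated again, the same task is
# counted once as pending and once as pending speculative.
print(executors_needed(1, 1, 1))  # 3, though 2 executors would suffice
```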
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Added a unit test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

