lyy-pineapple opened a new pull request, #41723: URL: https://github.com/apache/spark/pull/41723
### What changes were proposed in this pull request?

The dynamic allocation listener keeps track of the number of running speculative tasks. With this change, when a non-speculative task fails while its corresponding speculative task is still running, `stageAttemptToTaskIndices` no longer removes the task index, so the task is not re-counted as pending on top of its live speculative attempt.

### Why are the changes needed?

Assume a stage has a single task, Task 1, with attempt Task 1.0 and a speculative attempt Task 1.1 running concurrently. The dynamic allocation manager computes the executor demand as 2 (pendingTasks: 0, pendingSpeculative: 0, running: 2).

At this point Task 1.0 fails, and the manager recomputes the demand as 2 (pendingTasks: 1, pendingSpeculative: 0, running: 1). Because of the failure, `copiesRunning(1)` drops to 1, so Task 1 is speculated again and a `SparkListenerSpeculativeTaskSubmitted` event fires. The manager's demand now becomes 3 (pendingTasks: 1, pendingSpeculative: 1, running: 1), which is not the expected 2: the same task is counted once as pending and once as a pending speculative task. Then Task 1.2 starts and is marked as a speculative task, yet the manager still computes the demand as 3 (pendingTasks: 1, pendingSpeculative: 1, running: 1), which again is not as expected.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added a unit test.
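The double-counting described above can be sketched with a toy Python model. All names here are hypothetical simplifications (the real logic lives in Spark's Scala `ExecutorAllocationManager` listener, and `stageAttemptToTaskIndices` is only approximated by a set); a `fixed` flag toggles the proposed behavior of keeping a failed task's index while a speculative copy of it is still running.

```python
class StageModel:
    """Toy model of the dynamic allocation listener's per-stage counters.

    Hypothetical simplification, not Spark's actual API: executor demand is
    taken as pendingTasks + pendingSpeculative + running attempts.
    """

    def __init__(self, num_tasks, fixed):
        self.num_tasks = num_tasks
        self.fixed = fixed                 # apply the proposed fix?
        self.task_indices = set()          # stageAttemptToTaskIndices analogue
        self.running = 0                   # running attempts, any kind
        self.pending_speculative = 0       # speculative submitted, not started
        self.running_speculative = {}      # task index -> running spec copies

    def task_start(self, index, speculative=False):
        self.running += 1
        if speculative:
            self.pending_speculative -= 1
            self.running_speculative[index] = (
                self.running_speculative.get(index, 0) + 1)
        else:
            self.task_indices.add(index)

    def speculative_submitted(self, index):
        self.pending_speculative += 1

    def task_end(self, index, speculative=False, failed=False):
        self.running -= 1
        if speculative:
            self.running_speculative[index] -= 1
        elif failed:
            # Proposed fix: keep the index while a speculative copy of the
            # same task is still running, so the task is not also counted
            # as pending. The buggy path always discards it.
            if not (self.fixed and self.running_speculative.get(index, 0) > 0):
                self.task_indices.discard(index)

    def demand(self):
        pending = self.num_tasks - len(self.task_indices)
        return pending + self.pending_speculative + self.running


def run_scenario(fixed):
    """Replay the PR description's timeline and return the final demand."""
    s = StageModel(num_tasks=1, fixed=fixed)
    s.task_start(1)                         # Task 1.0 starts
    s.speculative_submitted(1)
    s.task_start(1, speculative=True)       # Task 1.1 starts; demand is 2
    s.task_end(1, failed=True)              # Task 1.0 fails
    s.speculative_submitted(1)              # Task 1 speculated again
    s.task_start(1, speculative=True)       # Task 1.2 starts
    return s.demand()
```

Without the fix the final demand is 3 (the failed task is re-counted as pending while its speculative copy still runs); with it, the demand settles at the expected 2.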
