cloud-fan opened a new pull request #24375: [SPARK-25250][CORE] try best to not 
submit tasks when the partitions are already completed
URL: https://github.com/apache/spark/pull/24375
 
 
   ## What changes were proposed in this pull request?
   
   #21131 first made it possible for a task that completed successfully in a 
zombie `TaskSetManager` to also be marked as successful in the active 
`TaskSetManager`. Later, #23871 improved the implementation to cover a corner 
case in which the active `TaskSetManager` has not yet been created when the 
earlier task succeeds.
   
   However, #23871 has a bug and was reverted in #24359.
   
   Looking back at the original problem, there are 2 findings:
   1. The issue cannot be 100% eliminated. Let's say task set 1.0 (zombie) has 
a running task for a partition, and task set 1.1 (active) has already submitted 
a task for the same partition and that task has completed. Then there is 
nothing we can do.
   2. The thing we care about is the task completion events from a zombie task 
set. If a task from the active task set completes, we don't need to mark the 
corresponding tasks from zombie task sets as completed.
   
   This PR proposes a new fix:
   1. When `DAGScheduler` gets a task success event from an earlier attempt, 
it notifies the `TaskSchedulerImpl` about it.
   2. When `TaskSchedulerImpl` knows a partition is already completed, ask the 
active `TaskSetManager` to mark the corresponding task as finished, if the task 
is not finished yet.
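
   The two steps above can be sketched roughly as follows. This is a minimal, 
self-contained model, not the code in this PR: `SimpleDagScheduler`, 
`SimpleTaskScheduler`, `SimpleTaskSetManager` and all of their method names are 
hypothetical stand-ins for `DAGScheduler`, `TaskSchedulerImpl` and 
`TaskSetManager`, and the real classes carry far more state.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for TaskSetManager (one per stage attempt).
class SimpleTaskSetManager(val stageId: Int, val attempt: Int, numPartitions: Int) {
  var isZombie: Boolean = false
  private val finished = Array.fill(numPartitions)(false)

  // Mark the task for `partitionId` as finished, if it is not finished yet.
  // Returns true only when this call changed the state.
  def markPartitionCompleted(partitionId: Int): Boolean = {
    if (finished(partitionId)) {
      false
    } else {
      finished(partitionId) = true
      true
    }
  }

  // Partitions for which this task set would still submit a task.
  def pendingPartitions: Seq[Int] = finished.indices.filterNot(i => finished(i))
}

// Hypothetical, simplified stand-in for TaskSchedulerImpl.
class SimpleTaskScheduler {
  private val managersByStage =
    mutable.Map.empty[Int, mutable.Buffer[SimpleTaskSetManager]]

  // Registering a new attempt turns all earlier attempts of the stage into zombies.
  def submit(tsm: SimpleTaskSetManager): Unit = {
    val attempts = managersByStage.getOrElseUpdate(
      tsm.stageId, mutable.Buffer.empty[SimpleTaskSetManager])
    attempts.foreach { t => t.isZombie = true }
    attempts += tsm
  }

  // Step 2: the partition is known to be completed, so ask the active
  // task set manager to mark the corresponding task as finished.
  def notifyPartitionCompletion(stageId: Int, partitionId: Int): Unit = {
    managersByStage.get(stageId).foreach { attempts =>
      attempts.filterNot(_.isZombie).foreach(_.markPartitionCompleted(partitionId))
    }
  }
}

// Hypothetical, simplified stand-in for DAGScheduler.
class SimpleDagScheduler(scheduler: SimpleTaskScheduler) {
  // Step 1: only a success event from an earlier attempt is forwarded.
  def handleTaskSuccess(
      stageId: Int, taskAttempt: Int, latestAttempt: Int, partitionId: Int): Unit = {
    if (taskAttempt < latestAttempt) {
      scheduler.notifyPartitionCompletion(stageId, partitionId)
    }
  }
}

object Demo {
  // Attempt 0 becomes a zombie, then one of its tasks succeeds for partition 2.
  def run(): Seq[Int] = {
    val scheduler = new SimpleTaskScheduler
    val dag = new SimpleDagScheduler(scheduler)
    scheduler.submit(new SimpleTaskSetManager(stageId = 1, attempt = 0, numPartitions = 3))
    val active = new SimpleTaskSetManager(stageId = 1, attempt = 1, numPartitions = 3)
    scheduler.submit(active)
    dag.handleTaskSuccess(stageId = 1, taskAttempt = 0, latestAttempt = 1, partitionId = 2)
    active.pendingPartitions
  }
}
```

   In this sketch, `Demo.run()` leaves only partitions 0 and 1 pending in the 
active task set, so no new task would be submitted for partition 2. A success 
event from the active attempt itself is not forwarded, which mirrors finding 2.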
   
   ## How was this patch tested?
   
   A new test case.
