seayoun commented on issue #26975: [SPARK-26975][CORE] Stage retry and executor 
crash cause app hung up forever
URL: https://github.com/apache/spark/pull/26975#issuecomment-568658757
 
 
   > I'm not sure killing tasks can work. There is no guarantee that a task can 
always be killed successfully. And even if we can, we may send out the kill 
request, and immediately get the executor lost event before the task is killed.
   > 
   > I think we should accept the fact that a running task may be useless as 
its corresponding partition is completed, and deal with it well. e.g. when 
seeing executor lost, don't reschedule tasks whose corresponding partitions are 
already completed.
   
   I think it doesn't matter. If the driver receives the executor lost event 
before the task is killed, the TSM will call `handleFailedTask` and will not 
reschedule it.
   Btw, if the task finishes before it is killed, the app processes its success 
or failure status in `handleSuccessfulTask` or `handleFailedTask`. In 
`handleSuccessfulTask`, we mark it as `Killed(another stage succeeded)`; in 
`handleFailedTask`, we will not reschedule the task.
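
   To make the intended behavior concrete, here is a minimal toy sketch of the 
two paths described above. It is not Spark's actual `TaskSetManager` code: 
`ToyTaskSetManager`, `Outcome`, `markPartitionCompleted`, and 
`pendingPartitions` are hypothetical names used only for illustration, and 
only the method names `handleSuccessfulTask` and `handleFailedTask` mirror the 
comment.

```scala
import scala.collection.mutable

// Hypothetical toy model for illustration; NOT Spark's real TaskSetManager.
sealed trait Outcome
case object MarkedSuccessful extends Outcome
case object MarkedKilled extends Outcome    // partition was already completed elsewhere
case object Rescheduled extends Outcome
case object NotRescheduled extends Outcome  // partition was already completed elsewhere

final class ToyTaskSetManager(numPartitions: Int) {
  private val completed = mutable.Set.empty[Int]
  private val pending = mutable.Queue.empty[Int]

  // Assumption: another stage attempt has already finished this partition.
  def markPartitionCompleted(partition: Int): Unit = {
    require(partition >= 0 && partition < numPartitions, "partition out of range")
    completed += partition
  }

  // The task finished before the kill arrived: if its partition is already
  // complete, treat the task as killed instead of counting a second success.
  def handleSuccessfulTask(partition: Int): Outcome =
    if (completed.contains(partition)) MarkedKilled
    else { completed += partition; MarkedSuccessful }

  // The task failed (for example, its executor was lost): re-queue it only
  // if its partition still needs to be computed.
  def handleFailedTask(partition: Int): Outcome =
    if (completed.contains(partition)) NotRescheduled
    else { pending.enqueue(partition); Rescheduled }

  // An executor lost event fails every task that was running on that executor.
  def executorLost(runningPartitions: Seq[Int]): Seq[Outcome] =
    runningPartitions.map(handleFailedTask)

  def pendingPartitions: List[Int] = pending.toList
}
```

   In this sketch, if partition 0 has been marked completed and an executor 
running partitions 0 and 2 is lost, only partition 2 ends up in the pending 
queue, so the order of the kill request and the executor lost event does not 
change the result.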
