[GitHub] [spark] Ngone51 commented on pull request #29579: [SPARK-32736][CORE] Avoid caching the removed decommissioned executors in TaskSchedulerImpl

GitBox Sun, 30 Aug 2020 20:52:39 -0700


Ngone51 commented on pull request #29579:
URL: https://github.com/apache/spark/pull/29579#issuecomment-683534969



   Thank you for the quick response, @agrawaldevesh.
   
   > However, with this PR, it seems you are removing the "clear shuffle on 
fetch failure" part. It seems that you will wait for the heartbeat failure to 
occur and the host be lost, even if the downstream has signaled fetch failure. 
   
   I think this PR doesn't change the semantics. We still clear the shuffle 
status on fetch failure. The only change to the fetch failure handling in 
DAGScheduler is:
   
   ```diff
   -  .exists(_.isHostDecommissioned)
   +  .exists(_.hostOpt.isDefined)
   ```
   
   If the fetch failure comes before the executor lost event, DAGScheduler will 
still ask TaskSchedulerImpl for the decommission state and unregister the 
shuffle status at that point. If the executor lost event comes first, the fetch 
failure becomes a no-op for unregistering the shuffle status.
   
   I think the only difference is that, before this PR, if the executor lost 
event came first, it could only unregister the shuffle map status on that 
executor, even if we knew the host was also decommissioned. Now we can 
unregister the shuffle status for the whole host because we pass in the host 
info directly.
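   
   To illustrate the two orderings, here is a toy model. It is only a sketch 
and not the actual DAGScheduler or TaskSchedulerImpl code: every name in it 
(`ToyOutputTracker`, `ToyDecommissionState`, `onFetchFailure`, 
`onExecutorLost`) is hypothetical, and it assumes the host info is carried as 
an `Option[String]`.
   
   ```scala
   import scala.collection.mutable

   // Toy model only: every name here is hypothetical, not Spark's real API.
   object DecommissionOrderingSketch {
     final case class MapOutput(shuffleId: Int, execId: String, host: String)

     // Stand-in for the decommission state; hostOpt is set when the whole host goes away.
     final case class ToyDecommissionState(hostOpt: Option[String])

     final class ToyOutputTracker(initial: Seq[MapOutput]) {
       private val outputs = mutable.Set(initial: _*)

       // Drop only the outputs written by one executor.
       def removeOutputsOnExecutor(execId: String): Unit =
         outputs --= outputs.filter(_.execId == execId)

       // Drop every output stored on the given host.
       def removeOutputsOnHost(host: String): Unit =
         outputs --= outputs.filter(_.host == host)

       def remaining: Set[MapOutput] = outputs.toSet
     }

     // Fetch failure: consult the decommission state, then unregister by host or by executor.
     def onFetchFailure(t: ToyOutputTracker, execId: String,
                        state: Option[ToyDecommissionState]): Unit =
       state.flatMap(_.hostOpt) match {
         case Some(host) => t.removeOutputsOnHost(host)
         case None       => t.removeOutputsOnExecutor(execId)
       }

     // Executor lost: the host is passed in directly, so host-wide cleanup is possible here too.
     def onExecutorLost(t: ToyOutputTracker, execId: String,
                        hostOpt: Option[String]): Unit =
       hostOpt match {
         case Some(host) => t.removeOutputsOnHost(host)
         case None       => t.removeOutputsOnExecutor(execId)
       }

     // Two executors share host-A; a third lives on host-B.
     def freshTracker(): ToyOutputTracker = new ToyOutputTracker(Seq(
       MapOutput(0, "exec-1", "host-A"),
       MapOutput(0, "exec-2", "host-A"),
       MapOutput(0, "exec-3", "host-B")))
   }
   ```
   
   In this model, calling the two handlers for `exec-1` in either order with 
`Some("host-A")` leaves only the `host-B` output, and whichever handler runs 
second finds nothing left to remove. Calling `onExecutorLost` with `None` 
mimics the old behavior and leaves the `exec-2` output registered.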
   
   
   


