agrawaldevesh commented on a change in pull request #29014:
URL: https://github.com/apache/spark/pull/29014#discussion_r456736612
##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -1767,8 +1767,13 @@ private[spark] class DAGScheduler(
// TODO: mark the executor as failed only if there were lots of fetch failures on it
if (bmAddress != null) {
- val hostToUnregisterOutputs = if (env.blockManager.externalShuffleServiceEnabled &&
- unRegisterOutputOnHostOnFetchFailure) {
+ val externalShuffleServiceEnabled = env.blockManager.externalShuffleServiceEnabled
+ val isHostDecommissioned = taskScheduler
+ .getExecutorDecommissionInfo(bmAddress.executorId)
+ .exists(_.isHostDecommissioned)
Review comment:
`isShuffleLost` does not really apply here. That method was too narrow in
scope, so I inlined it. It strictly means that shuffle is lost for this
executor.
In this context, we already know that shuffle is lost for the executor; we
are simply trying to determine whether it is also lost for the entire host. I
updated the code to reflect that logic and intent better.
I will create a follow-up Jira under the master ticket to track changing
this logic once "Local Fetch" is merged.
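For reference, here is a minimal sketch of the intent described above. It assumes that shuffle output counts as lost for the whole host when either the external shuffle service is enabled or the host is decommissioned, and that `unRegisterOutputOnHostOnFetchFailure` still gates the host-wide unregister. The hunk does not show how the flags are finally combined, so that rule is an assumption. `DecommissionInfo` and `HostUnregisterSketch` are hypothetical stand-ins, not Spark classes.

```scala
// Hypothetical stand-in for the value returned by getExecutorDecommissionInfo;
// it models only the one field that the diff reads.
final case class DecommissionInfo(isHostDecommissioned: Boolean)

object HostUnregisterSketch {
  /** Returns Some(host) when outputs should be unregistered for the whole host,
    * or None when only the failing executor's outputs should be unregistered.
    */
  def hostToUnregisterOutputs(
      host: String,
      externalShuffleServiceEnabled: Boolean,
      decommissionInfo: Option[DecommissionInfo],
      unRegisterOutputOnHostOnFetchFailure: Boolean): Option[String] = {
    // A missing decommission record is treated as "host not decommissioned".
    val isHostDecommissioned = decommissionInfo.exists(_.isHostDecommissioned)

    // Assumption: either condition means shuffle is lost for the entire host.
    val shuffleLostForEntireHost = externalShuffleServiceEnabled || isHostDecommissioned

    if (shuffleLostForEntireHost && unRegisterOutputOnHostOnFetchFailure) Some(host)
    else None
  }
}
```

With this shape, a decommissioned host triggers a host-wide unregister even when the external shuffle service is off, which is the case the old condition could not express.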