attilapiros commented on a change in pull request #26343: [SPARK-29683][YARN]
Job will fail due to executor failures all available nodes are blacklisted
URL: https://github.com/apache/spark/pull/26343#discussion_r369131483
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTracker.scala
##########
@@ -103,7 +117,32 @@ private[spark] class YarnAllocatorBlacklistTracker(
refreshBlacklistedNodes()
}
- def isAllNodeBlacklisted: Boolean = currentBlacklistedYarnNodes.size >= numClusterNodes
+ def isAllNodeBlacklisted: Boolean = {
+ refreshYarnNodes()
+ val allBlacklisted = allRunningNodes.diff(currentBlacklistedYarnNodes).isEmpty
Review comment:
`allRunningNodes` does not need to be a separate member `var`: you can
filter the running nodes out of `allNodes` right here.
That way it is no longer computed in `excludeNodes()`, where
`allRunningNodes` is not used.
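
A minimal sketch of the suggestion, not the PR's actual code. The `Node` case class, its `host` and `running` fields, and the `BlacklistTrackerSketch` class are hypothetical stand-ins; only `isAllNodeBlacklisted`, `allNodes` and `currentBlacklistedYarnNodes` are names taken from the diff and the comment. The `refreshYarnNodes()` call from the diff is left out because its body is not shown here.

```scala
// Hypothetical stand-in for whatever element type `allNodes` holds in the PR.
final case class Node(host: String, running: Boolean)

// Hypothetical, simplified tracker used only to illustrate the suggestion.
class BlacklistTrackerSketch(
    var allNodes: Seq[Node],
    var currentBlacklistedYarnNodes: Set[String]) {

  def isAllNodeBlacklisted: Boolean = {
    // Derive the running hosts locally instead of keeping a member `var`,
    // so no other method has to compute or maintain them.
    val runningHosts = allNodes.filter(_.running).map(_.host).toSet
    // True when every running host is blacklisted (also true when no host is running).
    runningHosts.diff(currentBlacklistedYarnNodes).isEmpty
  }
}
```

With this shape, the running-node set exists only for the duration of the check, so a method such as `excludeNodes()` would not pay for it.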