Ngone51 commented on a change in pull request #23223:
[SPARK-26269][YARN]Yarnallocator should have same blacklist behaviour with yarn
to maxmize use of cluster resource
URL: https://github.com/apache/spark/pull/23223#discussion_r243329624
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
##########
@@ -612,13 +612,23 @@ private[yarn] class YarnAllocator(
val message = "Container killed by YARN for exceeding physical memory limits. " +
s"$diag Consider boosting ${EXECUTOR_MEMORY_OVERHEAD.key}."
(true, message)
- case _ =>
- // all the failures which not covered above, like:
- // disk failure, kill by app master or resource manager, ...
- allocatorBlacklistTracker.handleResourceAllocationFailure(hostOpt)
- (true, "Container marked as failed: " + containerId + onHostStr +
- ". Exit status: " + completedContainer.getExitStatus +
- ". Diagnostics: " + completedContainer.getDiagnostics)
+ case other_exit_status =>
+ // SPARK-26269: follow YARN's blacklisting behaviour(see https://github
+ // .com/apache/hadoop/blob/228156cfd1b474988bc4fedfbf7edddc87db41e3/had
+ // oop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/ap
+ // ache/hadoop/yarn/util/Apps.java#L273 for details)
+ if (NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS.contains(other_exit_status)) {
+ (true, s"Container marked as failed: $containerId$onHostStr" +
Review comment:
Makes sense.