tgravescs commented on a change in pull request #27583: [SPARK-29149][YARN]
Update YARN cluster manager For Stage Level Scheduling
URL: https://github.com/apache/spark/pull/27583#discussion_r383923052
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceRequestHelper.scala
##########
@@ -227,6 +227,17 @@ private object ResourceRequestHelper extends Logging {
resourceInformation
}
+ def isYarnCustomResourcesNonEmpty(resource: Resource): Boolean = {
+ try {
+ // Use reflection as this uses APIs only available in Hadoop 3
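As a side note, the reflection guard used in the hunk above can be illustrated with a minimal, self-contained sketch (this is not the Spark code; the class and helper names are illustrative): probe for a method that only exists in newer library versions, such as Hadoop 3's `Resource.getResources()`, and fall back gracefully when it is absent.

```java
// Hypothetical sketch of the reflection-guard pattern: detect whether a
// method exists on the target's class without linking against it at
// compile time, so the same jar runs on both old and new library versions.
public class ReflectionGuard {
    public static boolean hasMethod(Object target, String name) {
        try {
            target.getClass().getMethod(name); // throws if the method is absent
            return true;
        } catch (NoSuchMethodException e) {
            return false; // older version: feature not available
        }
    }
}
```

In the real code the located `Method` would then be invoked reflectively; the boolean probe above just shows the version check.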
Review comment:
You are correct about the behavior. Many companies asked for this to work
with their existing Hadoop installs (2.x where it's < 2.10, or 3.1.1) and to
keep using the methods they already use with Hadoop 2. I'm not trying to
create a solution for everyone, just to allow their existing solutions to work.
In most cases I've heard of, they have something like a GPU queue or node
labels, so they know they run on nodes with GPUs. Beyond that, different
companies handle the multi-tenancy in different ways. I've heard of some using
file locking, for instance. Or you could put the GPUs in process-exclusive mode
and then iterate over them to acquire a free one. The idea here is that they
can use whatever solution they already have: they can write a custom discovery
script, and I also added the ability to plug in a class if it's easier to write
Java code to do this. https://issues.apache.org/jira/browse/SPARK-30689?filter=-2
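The file-locking idea mentioned above could be sketched roughly as follows. This is a hypothetical illustration, not Spark's or any company's actual discovery logic: the lock directory, file naming, and helper are all assumptions. Each GPU index gets a lock file, and a task claims the first index whose lock it can take exclusively.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

// Hypothetical discovery helper: iterate over GPU indices and claim the
// first one whose lock file can be locked exclusively. The lock is held
// (never released here) for the lifetime of the claiming process.
public class GpuLockDiscovery {
    public static int acquireFreeGpu(int numGpus, File lockDir) throws IOException {
        for (int i = 0; i < numGpus; i++) {
            RandomAccessFile raf =
                new RandomAccessFile(new File(lockDir, "gpu" + i + ".lock"), "rw");
            try {
                FileLock lock = raf.getChannel().tryLock();
                if (lock != null) {
                    return i; // claimed GPU i; keep the lock (and channel) open
                }
            } catch (OverlappingFileLockException e) {
                // lock already held within this JVM; treat as busy
            }
            raf.close();
            // lock held by another process; try the next GPU
        }
        return -1; // no free GPU available
    }
}
```

A real discovery script would print the claimed index in the JSON format Spark expects; the point is only that any existing claim mechanism (locks, exclusive mode, etc.) can sit behind the script or plugin class.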
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]