[GitHub] [spark] mridulm commented on a change in pull request #27583: [SPARK-29149][YARN] Update YARN cluster manager For Stage Level Scheduling

GitBox Mon, 24 Feb 2020 21:16:59 -0800

mridulm commented on a change in pull request #27583: [SPARK-29149][YARN]  
Update YARN cluster manager For Stage Level Scheduling
URL: https://github.com/apache/spark/pull/27583#discussion_r383662626


 ##########
 File path: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceRequestHelper.scala
 ##########
 @@ -227,6 +227,17 @@ private object ResourceRequestHelper extends Logging {
     resourceInformation
   }
 
+  def isYarnCustomResourcesNonEmpty(resource: Resource): Boolean = {
+    try {
+      // Use reflection as this uses APIs only available in Hadoop 3
 
 Review comment:
   
   Thanks for clarifying the behavior when YARN does support GPUs, etc. as a 
resource.
   
   I am probably missing something here; it would be great to understand the 
behavior better when YARN does not.
   Suppose I have a Spark application that depends on some library which 
requires GPUs (for example) and sets the corresponding resource profile 
expectations on the RDDs it creates (I am trying to make a case where the app 
developer did not explicitly configure the resource profiles, but is 
implicitly leveraging them via some library).
   
   Now, if this application is run on Hadoop 2.7 (or anything before 2.10, as 
you mentioned), what will the behavior be?
   If I understood it right:
   1) We will make requests to YARN without GPUs in the allocation request, 
since YARN does not support them.
   2) On the nodes received, we will try to use the discovery script on the 
assumption that GPUs are available; YARN is just oblivious to them. We will 
probably be using a node-label constraint to ensure GPU availability?
   3) If GPUs are detected, we use them; otherwise the executor fails?
   
   Is this right?
   If yes, how do we handle multi-tenancy on the executor host, or choose 
which GPU(s) to use?
   Is the assumption that in workloads like this, the entire node is reserved 
to prevent contention? I am not sure if you have documented/detailed this 
somewhere and I missed it!
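
   The three steps being asked about can be sketched roughly as below. This is 
only an illustrative model of the flow as described in the questions above, not 
the code in this pull request: `GpuFallbackSketch`, `ResourceAsk`, 
`toYarnRequest` and `assignGpus` are hypothetical names, not Spark or YARN 
APIs, and whether Spark actually behaves this way is exactly what is being 
asked.

```scala
// Hypothetical model of the three steps asked about above.
// None of these names are Spark or YARN APIs.
object GpuFallbackSketch {
  final case class ResourceAsk(cores: Int, memoryMb: Int, gpus: Int)

  // Step 1: drop the GPU amount from the container request when the
  // YARN version cannot express custom resource types.
  def toYarnRequest(ask: ResourceAsk, yarnSupportsCustomResources: Boolean): ResourceAsk =
    if (yarnSupportsCustomResources) ask else ask.copy(gpus = 0)

  // Steps 2 and 3: on the allocated node, compare the addresses reported by
  // a discovery script with what the resource profile asked for. Returns the
  // chosen addresses, or an error message if too few were found.
  def assignGpus(ask: ResourceAsk, discovered: Seq[String]): Either[String, Seq[String]] =
    if (discovered.size >= ask.gpus) Right(discovered.take(ask.gpus))
    else Left(s"executor needs ${ask.gpus} GPU(s) but discovery found ${discovered.size}")
}
```

   Note that simply taking the first N discovered addresses, as this sketch 
does, would not stop two executors on the same host from picking the same 
devices, which is the multi-tenancy concern raised above.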

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

