[GitHub] [spark] tgravescs commented on a change in pull request #26284: [SPARK-29415][Core]Stage Level Sched: Add base ResourceProfile and Request classes

GitBox Tue, 12 Nov 2019 09:45:25 -0800

tgravescs commented on a change in pull request #26284: 
[SPARK-29415][Core]Stage Level Sched: Add base ResourceProfile and Request 
classes
URL: https://github.com/apache/spark/pull/26284#discussion_r345350576


 ##########
 File path: core/src/main/scala/org/apache/spark/resource/ExecutorResourceRequest.scala
 ##########
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.resource
+
+/**
+ * An Executor resource request. This is used in conjunction with the ResourceProfile to
+ * programmatically specify the resources needed for an RDD that will be applied at the
+ * stage level.
+ *
+ * This is used to specify what the resource requirements are for an Executor and how
+ * Spark can find out specific details about those resources. Not all the parameters are
+ * required for every resource type. The resources names supported
+ * correspond to the regular Spark configs with the prefix removed. For instance overhead
+ * memory in this api is memoryOverhead, which is spark.executor.memoryOverhead with
+ * spark.executor removed. Resources like GPUs are resource.gpu
+ * (spark configs spark.executor.resource.gpu.*). The amount, discoveryScript, and vendor
+ * parameters for resources are all the same parameters a user would specify through the
+ * configs: spark.executor.resource.{resourceName}.{amount, discoveryScript, vendor}.
+ *
+ * For instance, a user wants to allocate an Executor with GPU resources on YARN. The user has
+ * to specify the resource name (resource.gpu), the amount or number of GPUs per Executor,
+ * units would not be used as its not a memory config, the discovery script would be specified
+ * so that when the Executor starts up it can discovery what GPU addresses are available for it to
+ * use because YARN doesn't tell Spark that, then vendor would not be used because
+ * its specific for Kubernetes.
+ *
+ * See the configuration and cluster specific docs for more details.
+ *
+ * There are alternative constructors for working with Java.
+ *
+ * @param resourceName Name of the resource
+ * @param amount Amount requesting
+ * @param units Optional units of the amount. For things like Memory, default is no units, only byte
+ *              types (b, mb, gb, etc) are currently supported.
 
 Review comment:
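   Enhanced version of my example, now requesting a GPU resource (the two commented lines are the additions):
   
   ```
   val ereqs = new ExecutorResourceRequests()
   ereqs.cores(2).memory(4096, "m")
   ereqs.memoryOverhead(2048, "m").pysparkMemory(1024, "m")
   // Added: GPU request for the executor (amount 2, plus the discovery script)
   ereqs.resource("resource.gpu", 2, "discoveryScript")
   val treqs = new TaskResourceRequests()
   treqs.cpus(1)
   // Added: GPU request for each task (amount 1.0)
   treqs.resource("resource.gpu", 1.0)
   
   // rprof: the ResourceProfile instance, not defined in this snippet
   rprof.require(treqs)
   rprof.require(ereqs)
   ```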
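
The naming rule in the quoted Scaladoc (a request name is the regular Spark config key with the `spark.executor` prefix removed) can be illustrated with a small standalone sketch. This is not code from the pull request: `ResourceNameSketch`, `toConfigKey`, and `customResourceConfigKeys` are hypothetical names used only for illustration, and the sketch assumes that prepending `spark.executor.` is the whole mapping.

```scala
// Illustrative sketch only; none of these names come from the PR.
object ResourceNameSketch {
  // Prefix that the Scaladoc says is dropped from the config key.
  val ExecutorPrefix: String = "spark.executor."

  // Hypothetical helper: map a request name back to its config key,
  // e.g. "memoryOverhead" -> "spark.executor.memoryOverhead".
  def toConfigKey(requestName: String): String = ExecutorPrefix + requestName

  // Per-resource settings listed in the Scaladoc.
  val CustomResourceSuffixes: Seq[String] = Seq("amount", "discoveryScript", "vendor")

  // Hypothetical helper: expand a custom resource name such as "resource.gpu"
  // into the spark.executor.resource.gpu.* keys the Scaladoc mentions.
  def customResourceConfigKeys(requestName: String): Seq[String] = {
    require(requestName.startsWith("resource."), s"not a custom resource name: $requestName")
    CustomResourceSuffixes.map(suffix => s"${toConfigKey(requestName)}.$suffix")
  }
}
```

Under these assumptions, `ResourceNameSketch.toConfigKey("memoryOverhead")` returns `spark.executor.memoryOverhead`, and `ResourceNameSketch.customResourceConfigKeys("resource.gpu")` returns the `amount`, `discoveryScript`, and `vendor` keys under `spark.executor.resource.gpu`.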
