tgravescs commented on a change in pull request #26284: [SPARK-29415][Core]Stage Level Sched: Add base ResourceProfile and Request classes
URL: https://github.com/apache/spark/pull/26284#discussion_r345350576
##########
File path: core/src/main/scala/org/apache/spark/resource/ExecutorResourceRequest.scala
##########
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.resource
+
+/**
+ * An Executor resource request. This is used in conjunction with the ResourceProfile to
+ * programmatically specify the resources needed for an RDD that will be applied at the
+ * stage level.
+ *
+ * This is used to specify what the resource requirements are for an Executor and how
+ * Spark can find out specific details about those resources. Not all the parameters are
+ * required for every resource type. The resources names supported
+ * correspond to the regular Spark configs with the prefix removed. For instance overhead
+ * memory in this api is memoryOverhead, which is spark.executor.memoryOverhead with
+ * spark.executor removed. Resources like GPUs are resource.gpu
+ * (spark configs spark.executor.resource.gpu.*). The amount, discoveryScript, and vendor
+ * parameters for resources are all the same parameters a user would specify through the
+ * configs: spark.executor.resource.{resourceName}.{amount, discoveryScript, vendor}.
+ *
+ * For instance, a user wants to allocate an Executor with GPU resources on YARN. The user has
+ * to specify the resource name (resource.gpu), the amount or number of GPUs per Executor,
+ * units would not be used as its not a memory config, the discovery script would be specified
+ * so that when the Executor starts up it can discovery what GPU addresses are available for it to
+ * use because YARN doesn't tell Spark that, then vendor would not be used because
+ * its specific for Kubernetes.
+ *
+ * See the configuration and cluster specific docs for more details.
+ *
+ * There are alternative constructors for working with Java.
+ *
+ * @param resourceName Name of the resource
+ * @param amount Amount requesting
+ * @param units Optional units of the amount. For things like Memory, default is no units, only byte
+ * types (b, mb, gb, etc) are currently supported.

Review comment:
An enhanced version of my example, now using a GPU resource:

```
val ereqs = new ExecutorResourceRequests()
ereqs.cores(2).memory(4096, "m")
ereqs.memoryOverhead(2048, "m").pysparkMemory(1024, "m")
// New: request 2 GPUs per executor; "discoveryScript" is a placeholder for the script path
ereqs.resource("resource.gpu", 2, "discoveryScript")

val treqs = new TaskResourceRequests()
treqs.cpus(1)
// New: request 1 GPU per task
treqs.resource("resource.gpu", 1.0)

// rprof: the ResourceProfile instance, assumed to be created as in the earlier example
rprof.require(treqs)
rprof.require(ereqs)
```
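The naming rule described in the doc comment (resource names are the regular executor configs with the `spark.executor` prefix removed) can be illustrated with a small standalone sketch. `ResourceNameSketch` and `configKeys` are hypothetical names made up for this illustration; they are not part of this PR or of Spark, and the mapping only covers the cases the doc comment mentions.

```scala
// Hypothetical helper, NOT part of Spark or this PR: shows which executor
// config key(s) a resource name in this API corresponds to, following the
// rule stated in the ExecutorResourceRequest doc comment.
object ResourceNameSketch {
  private val ExecutorPrefix = "spark.executor."

  def configKeys(resourceName: String): Seq[String] = {
    if (resourceName.startsWith("resource.")) {
      // Custom resources such as GPUs map to the amount, discoveryScript,
      // and vendor configs under spark.executor.resource.{resourceName}.
      Seq("amount", "discoveryScript", "vendor")
        .map(suffix => s"${ExecutorPrefix}${resourceName}.${suffix}")
    } else {
      // Built-in settings such as memoryOverhead map to a single config key.
      Seq(ExecutorPrefix + resourceName)
    }
  }

  def main(args: Array[String]): Unit = {
    println(configKeys("memoryOverhead").mkString(", "))
    println(configKeys("resource.gpu").mkString(", "))
  }
}
```

Under these assumptions, `memoryOverhead` resolves to `spark.executor.memoryOverhead`, and `resource.gpu` resolves to the three `spark.executor.resource.gpu.*` keys.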
