tgravescs commented on a change in pull request #26284: [SPARK-29415][Core]Stage Level Sched: Add base ResourceProfile and Request classes URL: https://github.com/apache/spark/pull/26284#discussion_r343830288
########## File path: core/src/main/scala/org/apache/spark/resource/ExecutorResourceRequest.scala ########## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.resource + +/** + * An Executor resource request. This is used in conjunction with the ResourceProfile to + * programmatically specify the resources needed for an RDD that will be applied at the + * stage level. + * + * This is used to specify what the resource requirements are for an Executor and how + * Spark can find out specific details about those resources. Not all the parameters are + * required for every resource type. The resources names supported + * correspond to the regular Spark configs with the prefix removed. For instance overhead + * memory in this api is memoryOverhead, which is spark.executor.memoryOverhead with + * spark.executor removed. Resources like GPUs are resource.gpu + * (spark configs spark.executor.resource.gpu.*). The amount, discoveryScript, and vendor + * parameters for resources are all the same parameters a user would specify through the + * configs: spark.executor.resource.{resourceName}.{amount, discoveryScript, vendor}. + * + * For instance, a user wants to allocate an Executor with GPU resources on YARN. The user has + * to specify the resource name (resource.gpu), the amount or number of GPUs per Executor, + * units would not be used as its not a memory config, the discovery script would be specified + * so that when the Executor starts up it can discovery what GPU addresses are available for it to + * use because YARN doesn't tell Spark that, then vendor would not be used because + * its specific for Kubernetes. + * + * See the configuration and cluster specific docs for more details. + * + * There are alternative constructors for working with Java. + * + * @param resourceName Name of the resource + * @param amount Amount requesting + * @param units Optional units of the amount. For things like Memory, default is no units, only byte + * types (b, mb, gb, etc) are currently supported. Review comment: yeah the code that needs to use the units isn't in this pr and I agree that could get ugly to compare units. This started out just being modeled after the YARN api. How complex that gets somewhat depends on if when comparing ResourceProfiles, the units parameter has to match. For instance if I specify memory as 1024m and then I compare to a profile with 1g, do we consider that the same? If we don't care about that we can just normalize the memory to be bytes or MB internally. Alternatively we could get rid of the units and just say memory amount must be specified in MB (similar to your suggestion ExecutorResourceRequest.memoryGB(amount: Double)) , but that doesn't match the way we do the other memory configs. I'm not sure how I feel about the api being through the constructor for everything except for memory. I can see that being confusing to the user as well. We could remove all the parameters from the constructor and make setters for all the things we support - memory, overheadMemory, core, resources (like GPUs), etc. This would remove the need for the Enum/String checks to see if we support it. Really at that point you could put directly in ResourceProfile and not need ExecutorResourceRequest. If we don't like any of those options, I could remove the units parameter for now from this PR and can add it back in with the PR that uses it. Which will be the scheduler code for merging and scheduling tasks and the cluster manager code. thoughts? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
