tgravescs commented on a change in pull request #26682: [SPARK-29306][CORE]
Stage Level Sched: Executors need to track what ResourceProfile they are
created with
URL: https://github.com/apache/spark/pull/26682#discussion_r366907535
##########
File path: core/src/main/scala/org/apache/spark/resource/ResourceUtils.scala
##########
@@ -232,6 +274,37 @@ private[spark] object ResourceUtils extends Logging {
resourceInfoMap
}
+ /**
+ * This function is similar to getOrDiscoverallResources, except for it uses the ResourceProfile
+ * information instead of the application level configs.
+ *
+ * It first looks to see if resource were explicitly specified in the resources file
+ * (this would include specified address assignments and it only specified in certain
+ * cluster managers) and then it looks at the ResourceProfile to get
Review comment:
The ResourceProfile defines what resources the executor is supposed to have,
i.e. the requirements: 4 cores, 1 GPU, 1 FPGA, etc.
The resources file is only used in standalone mode, where the Worker is
responsible for assigning resources to the executors. The discovery script is
used for any resources that aren't specified in the resources file. You can
actually mix them if you want: for instance, in standalone mode let's say your
workers know about GPUs but not FPGAs. The Worker would start your executor,
passing it a resources file with the GPU assignments, and the executor would
then have to run the discovery script to find which FPGAs are available to it.
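The mixed case described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in ResourceUtils: `ResourceAddresses`, `resolveResources`, and the `discover` parameter are hypothetical names, with `fromFile` standing in for the parsed resources file and `discover` standing in for running the discovery script.

```scala
// Hypothetical, simplified model; none of these names are Spark's real API.
object ResourceResolutionSketch {
  // A resource name (e.g. "gpu") together with the addresses given to the executor.
  final case class ResourceAddresses(name: String, addresses: Seq[String])

  /**
   * Resolve every required resource: prefer the assignment from the resources
   * file (passed in by the standalone Worker), otherwise fall back to discovery.
   *
   * @param required resource names the executor is supposed to have
   * @param fromFile assignments parsed from the resources file, keyed by name
   * @param discover stand-in for running the discovery script for one resource
   */
  def resolveResources(
      required: Seq[String],
      fromFile: Map[String, ResourceAddresses],
      discover: String => ResourceAddresses): Map[String, ResourceAddresses] = {
    required.map { name =>
      // getOrElse evaluates its default lazily, so discovery only runs for
      // resources that are missing from the file.
      name -> fromFile.getOrElse(name, discover(name))
    }.toMap
  }
}
```

With `required = Seq("gpu", "fpga")` and a file that only lists the GPU, the GPU addresses come from the file and `discover` is called only for the FPGA.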
This PR isn't really changing how that is done; that comes from the original
accelerator-aware scheduling work. This PR only changes where the requirements
come from: they now come from the ResourceProfile rather than the Spark
configs.
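To illustrate the change of source, here is a minimal sketch in which both paths produce the same shape of requirements, so everything downstream can stay the same. `ProfileSketch`, `fromConfigs`, and `fromProfile` are hypothetical names, not the ResourceProfile API; the only real identifier assumed here is the `spark.executor.resource.{resourceName}.amount` config key.

```scala
// Hypothetical sketch; ProfileSketch is NOT Spark's ResourceProfile class.
object RequirementSourceSketch {
  // Required amount per resource name, e.g. "gpu" -> 1.
  type Requirements = Map[String, Long]

  private val Prefix = "spark.executor.resource."
  private val Suffix = ".amount"

  // Application-level path: read amounts from
  // spark.executor.resource.{resourceName}.amount entries.
  def fromConfigs(conf: Map[String, String]): Requirements =
    conf.collect {
      case (key, value)
          if key.startsWith(Prefix) && key.endsWith(Suffix) &&
            key.length > Prefix.length + Suffix.length =>
        key.stripPrefix(Prefix).stripSuffix(Suffix) -> value.toLong
    }

  // Stand-in for a profile that carries its own executor resource requirements.
  final case class ProfileSketch(id: Int, executorResources: Map[String, Long])

  // Profile path: the requirements are read straight off the profile.
  def fromProfile(profile: ProfileSketch): Requirements =
    profile.executorResources
}
```

Either result could then be handed to the same resolution step that consults the resources file and the discovery script.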
So, to clarify a bit more: Spark relies on the cluster manager to assign
containers with the resources you request. In standalone mode, the Worker
starts the executor and assigns it resources via the resources file. YARN and
k8s just give you a container and don't tell you what resources are on it, so
we use the discovery script to find the addresses of the resources YARN/k8s
gave you.