mridulm commented on a change in pull request #30204:
URL: https://github.com/apache/spark/pull/30204#discussion_r523135039
##########
File path: docs/running-on-yarn.md
##########
@@ -644,6 +644,7 @@ YARN does not tell Spark the addresses of the resources allocated to each container
# Stage Level Scheduling Overview
Stage level scheduling is supported on YARN when dynamic allocation is
enabled. One YARN-specific thing to note is that each ResourceProfile
requires a different container priority on YARN. The mapping is simple: the
ResourceProfile id becomes the priority, and on YARN lower numbers are higher
priority. This means that profiles created earlier will have a higher priority
in YARN. Normally this won't matter, since Spark finishes one stage before
starting another; the only case where this might have an effect is in a job
server type scenario, so it's something to keep in mind.
+Note there is a difference in the way custom resources are handled between the
base default profile and custom ResourceProfiles. To allow the user to
request YARN containers with extra resources without Spark scheduling on them,
the user can specify resources via the
<code>spark.yarn.executor.resource.</code> config. Those configs are only used
in the base default profile though and do not get propagated into any other
custom ResourceProfiles. This is because there would be no way to remove them
if you wanted a stage to not have them. This results in your default profile
getting the custom resources defined in <code>spark.yarn.executor.resource.</code>
plus the Spark-defined resources of GPU or FPGA. Spark converts GPU and FPGA
resources into the YARN built-in types <code>yarn.io/gpu</code> and
<code>yarn.io/fpga</code>, but does not know the mapping of any other
resources. Any other Spark custom resources are not propagated to YARN for the
default profile. So if you want Spark to schedule based off a custom resource
and have it requested from YARN, you must specify it in both the YARN
(<code>spark.yarn.{driver/executor}.resource.</code>) and Spark
(<code>spark.{driver/executor}.resource.</code>) configs. Leave the Spark
config off if you only want YARN containers with the extra resources but do
not want Spark to schedule using them. Custom ResourceProfiles, on the other
hand, currently have no way to specify YARN resources without Spark scheduling
on them. This means that for custom ResourceProfiles we propagate all the
resources defined in the ResourceProfile to YARN. We still convert GPU and FPGA
to the YARN built-in types as well. This requires that the name of any custom
resources you specify match what they are defined as in YARN.
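
As an illustration of the dual YARN + Spark configuration described above,
here is a minimal sketch (not part of the patch): the resource name
<code>widget</code>, the amounts, and the discovery script path are
hypothetical assumptions, not values from the documentation.

```scala
import org.apache.spark.SparkConf

// Minimal sketch: "widget" is a hypothetical custom resource that must be
// defined under the same name in YARN's resource-types configuration.
val conf = new SparkConf()
  // Ask YARN to allocate the resource in each executor container
  // (base default profile only; not propagated to custom ResourceProfiles):
  .set("spark.yarn.executor.resource.widget.amount", "2")
  // Also let Spark schedule tasks against the resource; leave these two
  // settings off if you only want the YARN containers to carry it:
  .set("spark.executor.resource.widget.amount", "2")
  .set("spark.executor.resource.widget.discoveryScript", "/opt/getWidgets.sh")
```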
Review comment:
No immediate use case; but I was looking at the possibilities of using
resource profiles for composite jobs doing both data prep and DL in a single
app, and the ability to leverage either a queue with GPU resources or to
specify node labels for a resource profile would help.
If it is already tracked in a separate jira, that is fine! Will wait for
that to be merged :-)
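
For context, a minimal sketch of what such a composite job could look like
with the existing ResourceProfile API; the GPU amounts, paths, and discovery
script are illustrative assumptions, and per-profile queue or node-label
selection is exactly the part that does not exist yet:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.resource.{ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests}

object DataPrepPlusDLSketch {
  def main(args: Array[String]): Unit = {
    // Stage level scheduling on YARN requires dynamic allocation.
    val spark = SparkSession.builder().appName("prep-plus-dl").getOrCreate()
    val sc = spark.sparkContext

    // Profile for the DL stage: GPU executors. The amounts and the
    // discovery script path are hypothetical, for illustration only.
    val execReqs = new ExecutorResourceRequests()
      .cores(4)
      .resource("gpu", 2, "/opt/spark/bin/getGpus.sh")
    val taskReqs = new TaskResourceRequests().resource("gpu", 1)
    val dlProfile = new ResourceProfileBuilder()
      .require(execReqs)
      .require(taskReqs)
      .build()

    val raw = sc.textFile("hdfs:///data/input")   // hypothetical input path
    val prepped = raw.map(_.toLowerCase)          // data prep on the default profile
    prepped
      .withResources(dlProfile)                   // DL stage runs on GPU executors
      .mapPartitions(iter => iter.take(1))        // stand-in for the training step
      .count()

    spark.stop()
  }
}
```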