tgravescs commented on a change in pull request #28085:
[SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
URL: https://github.com/apache/spark/pull/28085#discussion_r404167508
##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -1135,6 +1137,22 @@ private[spark] class DAGScheduler(
    }
  }
+  /**
+   * PythonRunner needs to know what the pyspark memory setting is for the profile being run.
+   * Pass it in the local properties of the task if it's set for the stage profile.
+   */
+  private def addPysparkMemToProperties(stage: Stage, properties: Properties): Unit = {
+    val pysparkMem = if (stage.resourceProfileId == DEFAULT_RESOURCE_PROFILE_ID) {
+      logDebug("Using the default pyspark executor memory")
+      sc.conf.get(PYSPARK_EXECUTOR_MEMORY)
+    } else {
+      val rp = sc.resourceProfileManager.resourceProfileFromId(stage.resourceProfileId)
+      logDebug(s"Using profile ${stage.resourceProfileId} pyspark executor memory")
+      rp.getPysparkMemory
+    }
Review comment:
No, ResourceProfile doesn't have the SparkConf. I could make a utility function that takes the conf as a parameter and move the logic there. I actually thought about that, but it either requires a bunch of parameters or still leaves some of the logic here, because it uses either the profile id or the profile itself. Keeping it here just seemed a bit cleaner, but I can move it if you have a strong opinion.
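
For context, here's a rough sketch of what the utility-function alternative might look like; the object and method names (`ResourceProfileUtils`, `getPysparkMemForProfile`) are hypothetical and not part of this PR, and `getPysparkMemory` mirrors the accessor used in the diff above:

```scala
// Hypothetical sketch only -- illustrates why the helper would need several
// parameters: the conf, the profile id, and a way to look up profiles.
import org.apache.spark.SparkConf
import org.apache.spark.internal.config.Python.PYSPARK_EXECUTOR_MEMORY
import org.apache.spark.resource.ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
import org.apache.spark.resource.ResourceProfileManager

object ResourceProfileUtils {
  /** Resolve the pyspark memory for a stage's resource profile, if any. */
  def getPysparkMemForProfile(
      conf: SparkConf,
      resourceProfileId: Int,
      rpManager: ResourceProfileManager): Option[Long] = {
    if (resourceProfileId == DEFAULT_RESOURCE_PROFILE_ID) {
      // Default profile: fall back to the global spark.executor.pyspark.memory
      conf.get(PYSPARK_EXECUTOR_MEMORY)
    } else {
      // Custom profile: use the profile's own pyspark memory setting
      rpManager.resourceProfileFromId(resourceProfileId).getPysparkMemory
    }
  }
}
```

Even factored out, the caller still has to supply all three pieces, which is the "bunch of parameters" tradeoff mentioned above; inside DAGScheduler they are all reachable through `sc`.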