tgravescs commented on code in PR #36716:
URL: https://github.com/apache/spark/pull/36716#discussion_r902710949
##########
core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala:
##########
@@ -336,9 +340,23 @@ object ResourceProfile extends Logging {
private def getDefaultExecutorResources(conf: SparkConf): Map[String,
ExecutorResourceRequest] = {
val ereqs = new ExecutorResourceRequests()
- val cores = conf.get(EXECUTOR_CORES)
- ereqs.cores(cores)
- val memory = conf.get(EXECUTOR_MEMORY)
+
+ val isStandalone =
conf.getOption("spark.master").exists(_.startsWith("spark://"))
+ val isLocalCluster =
conf.getOption("spark.master").exists(_.startsWith("local-cluster"))
+ // By default, standalone executors take all available cores, do not have
a specific value.
+ val cores = if (isStandalone || isLocalCluster) {
+ conf.getOption(EXECUTOR_CORES.key).map(_.toInt)
+ } else {
+ Some(conf.get(EXECUTOR_CORES))
+ }
+ cores.foreach(ereqs.cores)
+
+ // Setting all resources here, cluster managers will take the resources
they respect.
Review Comment:
I'm not sure what this comment means "cluster managers will take the
resources they respect". could you clarify
##########
core/src/main/scala/org/apache/spark/resource/ResourceProfileManager.scala:
##########
@@ -54,6 +54,7 @@ private[spark] class ResourceProfileManager(sparkConf:
SparkConf,
private val master = sparkConf.getOption("spark.master")
private val isYarn = master.isDefined && master.get.equals("yarn")
private val isK8s = master.isDefined && master.get.startsWith("k8s://")
+ private val isStandalone = master.isDefined &&
master.get.startsWith("spark://")
Review Comment:
what about local cluster here? it would be nice to update description if
adding support for it as well. Otherwise remove code from ResourceProfile
##########
core/src/test/scala/org/apache/spark/LocalSparkContext.scala:
##########
@@ -31,6 +31,8 @@ trait LocalSparkContext extends BeforeAndAfterEach with
BeforeAndAfterAll { self
override def beforeAll(): Unit = {
super.beforeAll()
+ // Clear default profile, just in case default profile is not cleaned in
other suites.
+ ResourceProfile.clearDefaultProfile()
Review Comment:
seems like this should be unnecessary, were you getting test issues with
this?
##########
core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala:
##########
@@ -402,7 +420,7 @@ object ResourceProfile extends Logging {
}
private[spark] case class ExecutorResourcesOrDefaults(
- cores: Int,
+ cores: Option[Int], // Can only be None for standalone cluster.
Review Comment:
standalone and local-cluster?
##########
docs/spark-standalone.md:
##########
@@ -455,6 +455,14 @@ if the worker has enough cores and memory. Otherwise, each
executor grabs all th
on the worker by default, in which case only one executor per application may
be launched on each
worker during one single schedule iteration.
+# Stage Level Scheduling Overview
+
+Stage level scheduling is supported on Standalone when dynamic allocation is
enabled. Currently, when Master allocates executors for one application, will
schedule based on the order of the ResourceProfile ids for multiple
ResourceProfiles. The ResourceProfile with smaller id will be scheduled
firstly. Normally this won’t matter as Spark finishes one stage before starting
another one, the only case this might have an affect is in a job server type
scenario, so its something to keep in mind. For scheduling, we will only take
executor memory and executor cores from built-in executor resources and all
other custom resources from a ResourceProfile, other built-in executor
resources such as offHeap and memoryOverhead won't take any effect. The base
default profile will be created based on the spark configs when you submit an
application. Executor memory and executor cores from the base default profile
can be propagated to custom ResourceProfiles, but all other custom resources
can not
be propagated.
Review Comment:
"when the Master"
"it will schedule"
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]