tgravescs commented on code in PR #40730:
URL: https://github.com/apache/spark/pull/40730#discussion_r1165733519
##########
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##########
@@ -801,7 +810,11 @@ private[spark] class TaskSchedulerImpl(
* overriding in tests, so it can be deterministic.
Review Comment:
The comment is now wrong; it should be updated to describe what bin packing does.
##########
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##########
@@ -401,17 +403,24 @@ private[spark] class TaskSchedulerImpl(
val host = shuffledOffers(i).host
val taskSetRpID = taskSet.taskSet.resourceProfileId
+ var continueScheduling = true
// check whether the task can be scheduled to the executor base on resource profile.
- if (sc.resourceProfileManager
- .canBeScheduled(taskSetRpID, shuffledOffers(i).resourceProfileId)) {
+ while (sc.resourceProfileManager
Review Comment:
I'm not sure I like this coding approach of just short-circuiting here and
bypassing everything in the caller's resourceOffers function.
I think this approach is going to supersede locality, for instance. At the
very least we need to define that behavior. Was your intention to override
locality?
There are other obvious things, like the resourceOffers comment now being wrong
because it mentions round robin and this overrides that.
I'm also curious whether this has other side effects, like potentially affecting
the FAIR scheduler. It's something I haven't looked at in a long time, so I need
to spend more time on it.
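To make the round-robin-vs-bin-packing concern concrete, here is a minimal, self-contained sketch of the two placement strategies. The names (`Offer`, `roundRobin`, `binPack`) are illustrative only and are not Spark's actual scheduler API: round robin hands out one task per offer per pass, while bin packing keeps filling the same offer until it has no free cores left.

```scala
// Illustrative stand-in for a WorkerOffer: an executor with some free cores.
case class Offer(executorId: String, var freeCores: Int)

// Round robin: assign at most one task per offer on each pass over the offers.
def roundRobin(offers: Seq[Offer], tasks: Int, cpusPerTask: Int = 1): Map[String, Int] = {
  val assigned = scala.collection.mutable.Map.empty[String, Int].withDefaultValue(0)
  var remaining = tasks
  while (remaining > 0 && offers.exists(_.freeCores >= cpusPerTask)) {
    for (o <- offers if remaining > 0 && o.freeCores >= cpusPerTask) {
      o.freeCores -= cpusPerTask
      assigned(o.executorId) += 1
      remaining -= 1
    }
  }
  assigned.toMap
}

// Bin pack: keep assigning to the current offer until it is full, then move on.
def binPack(offers: Seq[Offer], tasks: Int, cpusPerTask: Int = 1): Map[String, Int] = {
  val assigned = scala.collection.mutable.Map.empty[String, Int].withDefaultValue(0)
  var remaining = tasks
  for (o <- offers if remaining > 0) {
    while (remaining > 0 && o.freeCores >= cpusPerTask) {
      o.freeCores -= cpusPerTask
      assigned(o.executorId) += 1
      remaining -= 1
    }
  }
  assigned.toMap
}
```

With two 4-core executors and 4 tasks, round robin spreads them 2/2, while bin packing puts all 4 on the first executor, leaving the second idle (which is what lets dynamic allocation release it).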
##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -2051,6 +2051,14 @@ package object config {
.timeConf(TimeUnit.MILLISECONDS)
.createOptional
+ private[spark] val BIN_PACK_ENABLED =
+ ConfigBuilder("spark.scheduler.binPack.enabled")
+     .doc(s"Whether to enable bin packing task scheduling on executors. This could help save" +
+       s" resource when ${DYN_ALLOCATION_ENABLED.key} is enabled.")
Review Comment:
It might be nice to state what Spark does by default: round robin.
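One possible wording along those lines (just a sketch, not the PR's actual code; the `.booleanConf.createWithDefault(false)` tail is assumed from the usual ConfigBuilder pattern and may differ):

```scala
  private[spark] val BIN_PACK_ENABLED =
    ConfigBuilder("spark.scheduler.binPack.enabled")
      .doc(s"Whether to enable bin packing task scheduling on executors, instead of the" +
        " default round-robin assignment across executors. This could help save" +
        s" resource when ${DYN_ALLOCATION_ENABLED.key} is enabled.")
      .booleanConf
      .createWithDefault(false)
```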
##########
core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala:
##########
@@ -193,6 +193,21 @@ class TaskSchedulerImplSuite extends SparkFunSuite with LocalSparkContext
assert(!failedTaskSet)
}
+ test("SPARK-43086: Scheduler should schedule task on fewest executors" +
Review Comment:
I would like to see more tests for how this affects other things like
locality, the FAIR scheduler, etc.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]