tgravescs commented on code in PR #40730:
URL: https://github.com/apache/spark/pull/40730#discussion_r1165733519


##########
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##########
@@ -801,7 +810,11 @@ private[spark] class TaskSchedulerImpl(
    * overriding in tests, so it can be deterministic.

Review Comment:
   The comment is now wrong and should be updated to describe what bin packing does.



##########
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##########
@@ -401,17 +403,24 @@ private[spark] class TaskSchedulerImpl(
       val host = shuffledOffers(i).host
       val taskSetRpID = taskSet.taskSet.resourceProfileId
 
+      var continueScheduling = true
      // check whether the task can be scheduled to the executor base on resource profile.
-      if (sc.resourceProfileManager
-        .canBeScheduled(taskSetRpID, shuffledOffers(i).resourceProfileId)) {
+      while (sc.resourceProfileManager

Review Comment:
   I'm not sure I like this coding approach of just short-circuiting here and bypassing everything in the caller's resourceOffers function.
   
   I think this approach is going to supersede locality, for instance. At the very least we need to define that behavior. Was your intention to override locality?
   There are other obvious things, like the resourceOffers comment now being wrong because it mentions round robin and this overrides that.
   
   I'm also curious whether this has other side effects, for example on the FAIR scheduler. It's something I haven't looked at in a long time, so I need to spend more time on it.
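
   To make the behavioral concern concrete, here is a hypothetical, self-contained sketch (not Spark's actual implementation; `Offer`, `roundRobin`, and `binPack` are made-up names) contrasting the default round-robin spread with a bin-packing strategy that keeps assigning tasks to one executor until it is full:

   ```scala
   // Illustrative sketch only: contrasts round-robin task spreading with
   // bin packing over a set of executor offers with free CPU cores.
   object SchedulingSketch {
     // Mutable free-core count so repeated assignments consume capacity.
     case class Offer(executorId: String, var freeCores: Int)

     // Round robin: cycle through the offers, placing one task per visit,
     // so tasks spread evenly across executors.
     def roundRobin(offers: Seq[Offer], tasks: Int, cpusPerTask: Int = 1): Seq[String] = {
       val assigned = scala.collection.mutable.Buffer[String]()
       var remaining = tasks
       while (remaining > 0 && offers.exists(_.freeCores >= cpusPerTask)) {
         for (o <- offers if remaining > 0 && o.freeCores >= cpusPerTask) {
           o.freeCores -= cpusPerTask
           assigned += o.executorId
           remaining -= 1
         }
       }
       assigned.toSeq
     }

     // Bin packing: stay on one executor until it has no free cores left,
     // then move to the next, so the fewest executors are used.
     def binPack(offers: Seq[Offer], tasks: Int, cpusPerTask: Int = 1): Seq[String] = {
       val assigned = scala.collection.mutable.Buffer[String]()
       var remaining = tasks
       for (o <- offers) {
         while (remaining > 0 && o.freeCores >= cpusPerTask) {
           o.freeCores -= cpusPerTask
           assigned += o.executorId
           remaining -= 1
         }
       }
       assigned.toSeq
     }
   }
   ```

   With two executors of 2 cores each and 3 tasks, round robin yields `Seq("a", "b", "a")` while bin packing yields `Seq("a", "a", "b")`, leaving executor `b` mostly idle and reclaimable by dynamic allocation. The open question in the review is how such a loop interacts with locality preferences and FAIR-scheduler ordering, which the round-robin path currently respects.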



##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -2051,6 +2051,14 @@ package object config {
       .timeConf(TimeUnit.MILLISECONDS)
       .createOptional
 
+  private[spark] val BIN_PACK_ENABLED =
+    ConfigBuilder("spark.scheduler.binPack.enabled")
+      .doc(s"Whether to enable bin packing task scheduling on executors. This could help save" +
+        s" resource when ${DYN_ALLOCATION_ENABLED.key} is enabled.")

Review Comment:
   It might be nice to state what Spark does by default: round robin.

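
   For instance, the doc string could mention the default explicitly. A possible rewording (illustrative fragment only, not a tested patch):

   ```scala
   // Hypothetical revised doc string spelling out the round-robin default:
   ConfigBuilder("spark.scheduler.binPack.enabled")
     .doc(s"Whether to pack tasks onto the fewest executors, instead of the default" +
       s" round-robin spreading across executors. This could help save resource when" +
       s" ${DYN_ALLOCATION_ENABLED.key} is enabled.")
   ```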


##########
core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala:
##########
@@ -193,6 +193,21 @@ class TaskSchedulerImplSuite extends SparkFunSuite with LocalSparkContext
     assert(!failedTaskSet)
   }
 
+  test("SPARK-43086: Scheduler should schedule task on fewest executors" +

Review Comment:
   I would like to see more tests for how this affects other things like locality, the fair scheduler, etc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

