Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/17533#discussion_r110434977
--- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -1139,6 +1138,19 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logg
.updateBlacklistForFailedTask(anyString(), anyString(), anyInt())
}
+ test("Schedule tasks based on size of input from ShuffledRDD.") {
+ sc = new SparkContext("local", "test")
+ sched = new FakeTaskScheduler(sc)
+ val taskSet = FakeTask.createTaskSet(4)
+ val clock = new ManualClock()
+ val manager = new TaskSetManager(sched, taskSet, 1, clock = clock)
+ manager.setTaskInputSizeFromShuffledRDD(taskSet.tasks.zip(Seq(1L, 100L, 10000L, 1000L)).toMap)
+ assert(manager.resourceOffer("exec", "host", ANY).get.index === 2)
+ assert(manager.resourceOffer("exec", "host", ANY).get.index === 3)
+ assert(manager.resourceOffer("exec", "host", ANY).get.index === 1)
+ assert(manager.resourceOffer("exec", "host", ANY).get.index === 0)
+ }
--- End diff --
We'd also want a test to make sure the sizes are computed correctly.
(I think that might be easier to do with the refactor I suggested?)
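
For reference, the ordering the test above asserts can be reproduced in isolation. This is only a sketch, assuming tasks with larger input are meant to be offered first; `DispatchOrderSketch` and `dispatchOrder` are hypothetical names for illustration and are not part of Spark or of this PR.

```scala
// Hypothetical helper, not from the PR: returns task indices in the order
// they would be offered if tasks with larger input sizes are scheduled first.
object DispatchOrderSketch {
  def dispatchOrder(inputSizes: Seq[Long]): Seq[Int] =
    inputSizes.zipWithIndex
      .sortBy { case (size, _) => -size } // descending by input size
      .map { case (_, index) => index }

  def main(args: Array[String]): Unit = {
    // Same sizes as in the test: task 0 -> 1, task 1 -> 100, task 2 -> 10000, task 3 -> 1000
    val order = dispatchOrder(Seq(1L, 100L, 10000L, 1000L))
    assert(order == Seq(2, 3, 1, 0))
    println(order.mkString(", "))
  }
}
```

This only covers the ordering half. A test that the sizes themselves are computed correctly would still need to exercise whatever code derives them from the shuffle output.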