Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16354#discussion_r93292503
  
    --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
    @@ -549,11 +546,15 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logg
           Seq(TaskLocation("host1", "execB")),
           Seq(TaskLocation("host2", "execC")),
           Seq())
    -    val manager = new TaskSetManager(sched, taskSet, 1, clock = new ManualClock)
    +    val clock = new ManualClock()
    +    val manager = new TaskSetManager(sched, taskSet, 1, clock = clock)
         sched.addExecutor("execA", "host1")
         manager.executorAdded()
         sched.addExecutor("execC", "host2")
         manager.executorAdded()
    +    // need one resource offer which doesn't schedule anything to start delay scheduling timer
    +    assert(manager.resourceOffer("exec1", "host1", ANY).isEmpty)
    +    clock.advance(sc.getConf.getTimeAsMs("spark.locality.wait", "3s") * 4)
         assert(manager.resourceOffer("exec1", "host1", ANY).isDefined)
    --- End diff --
    
    this used to work without the wait, via a rather convoluted path.  When the task set is created, there aren't any executors available that meet locality constraints, so the allowed locality levels are `NO_PREF, ANY`.  After adding `("execA", "host1")`, the locality levels change to `PROCESS_LOCAL, NODE_LOCAL, NO_PREF, ANY`.  However, `recomputeLocality()` leaves our current locality level at `NO_PREF`, even though a tighter level is now available.
    
    This offer would fail to schedule anything if either:
    1) we have an intervening offer on "execA", "host1" (which would match at `PROCESS_LOCAL`, and thus drop the locality level down), or
    2) we move `sched.addExecutor("execA", "host1")` to before we create the task set manager, in which case the initial locality levels would include `PROCESS_LOCAL`.
    
    I really don't think the old behavior was desirable (even if we don't make the other changes proposed here).  I'm also inclined to do more cleanup on this test -- at the very least, the naming of the execs makes this unnecessarily confusing ("execA" is local while "exec1" is not).
    
    I've added some logging to the original behavior to make this a bit clearer, if it's helpful to look at: https://github.com/squito/spark/tree/tsm_unrelated_debug

