Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3779#issuecomment-72584682
  
    _(Stream-of-consciousness ahead, mostly for my own benefit as I think 
through the implications of this PR)_
    
    I'd like to understand whether this patch has any performance implications 
for Spark jobs in general.  Is there any scenario where this scheduling change 
might introduce performance regressions?
    
    I guess that a task's locality wait is a sort of hard deadline: we 
won't consider scheduling a task at a lower locality level until at least that 
much time has elapsed.  This sounds like it should be a per-task property, 
where the deadline for one task shouldn't apply to / influence other tasks 
(i.e. each task is treated independently), but the bug reported here sounds 
like we're applying the locality waits to entire sets of tasks.
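    
    To make the "sets of tasks" behavior concrete, here's a minimal, 
self-contained sketch of a task-set-wide locality timer (illustrative only; 
the names and structure are a simplification of the idea, not the actual 
`TaskSetManager` code):
    
    ```scala
    // Sketch of a shared (task-set-wide) delay-scheduling timer.
    object TaskLocality extends Enumeration {
      val PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, ANY = Value
    }

    class LocalityWaitSketch(
        levels: Array[TaskLocality.Value],  // e.g. PROCESS_LOCAL .. ANY
        waits: Array[Long]) {               // per-level wait in ms

      private var currentIndex = 0
      private var lastLaunchTime = System.currentTimeMillis()

      // One timer for the whole task set: we only fall to a less-local
      // level once the wait at the current level has fully elapsed.
      def allowedLocality(now: Long): TaskLocality.Value = {
        while (currentIndex < levels.length - 1 &&
               now - lastLaunchTime >= waits(currentIndex)) {
          lastLaunchTime += waits(currentIndex)  // consume the elapsed wait
          currentIndex += 1                      // drop one locality level
        }
        levels(currentIndex)
      }

      // Launching any task resets the shared timer for *all* tasks.
      def recordLaunch(now: Long): Unit = { lastLaunchTime = now }
    }
    ```
    
    Because both the timer and the level index are shared across the whole 
task set, a single task with strong preferences can hold every other task at 
a high locality level, which matches the bug as reported.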
    
    It seems like we always want to attempt to schedule tasks in decreasing 
order of locality, so if there are unscheduled process-local tasks then we 
should always attempt to schedule them before any tasks at lower locality 
levels.  In addition, if there are free slots in the cluster and there are 
unscheduled tasks at the scheduler's current locality level, then we should 
schedule those tasks.  If we've exhausted all tasks at a particular locality 
level, then it makes sense to immediately move on to scheduling at the next 
lower locality level.
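    
    As a sketch of that ordering (again illustrative, reusing the 
`TaskLocality` enum from above; `pickTask` and its arguments are 
hypothetical, not scheduler API):
    
    ```scala
    // Walk levels from most to least local and take the first pending
    // task, never exceeding the level the delay-scheduling timer allows.
    val levelsInOrder = Seq(TaskLocality.PROCESS_LOCAL,
                            TaskLocality.NODE_LOCAL,
                            TaskLocality.RACK_LOCAL,
                            TaskLocality.ANY)

    def pickTask(pendingByLevel: Map[TaskLocality.Value, Seq[Int]],
                 allowed: TaskLocality.Value): Option[Int] = {
      levelsInOrder
        .takeWhile(_ <= allowed)  // Enumeration values order by declaration
        .flatMap(level => pendingByLevel.getOrElse(level, Nil).headOption)
        .headOption               // first hit = most-local runnable task
    }
    ```
    
    An exhausted level simply has an empty pending list, so the walk falls 
straight through to the next level without any extra waiting.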
    
    If we have tasks at some high locality level that cannot be scheduled at 
their preferred locality and there are tasks at a lower locality level that 
_can_ be scheduled, then I guess we might be concerned about whether scheduling 
the less-local tasks could rob tasks waiting at a higher locality level of 
their opportunity to run.  This shouldn't happen, though, since those tasks 
will already have been offered those resources and turned them down.
    
    Therefore, I think that this patch is a good fix: it doesn't make sense to 
let a single task with strong preferences delay / block the scheduling of 
other tasks with weaker preferences or preferences for other resources.  I'll 
take a closer look at the code and tests now.

