Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1313#discussion_r15508258
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -113,6 +114,10 @@ private[spark] class TaskSetManager(
       // but at host level.
       private val pendingTasksForHost = new HashMap[String, ArrayBuffer[Int]]
     
    +  // this collection mainly ensures that NODE_LOCAL tasks are always scheduled
    +  // before NOPREF ones; it contains all NODE_LOCAL tasks that have not yet been launched
    +  private[scheduler] val nodeLocalTasks = new HashMap[String, HashSet[Int]]
    --- End diff --
    
    Instead of doing this on a per-node basis, how about just keeping a flag 
called `hadNodeOnlyTasks`? I think that will be enough to cover the case we're 
worried about; we don't need to track it for each host or maintain a different 
delay for each host.
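
    To make the suggestion concrete, here is a minimal, self-contained sketch of 
what a single-flag approach could look like. The names (`TaskSetSketch`, 
`hadNodeOnlyTasks`, `pendingNoPref`) and the constructor shape are illustrative 
assumptions for this example, not the actual `TaskSetManager` code:

    ```scala
    import scala.collection.mutable.ArrayBuffer

    // Hypothetical sketch: each task's preferred hosts are given up front.
    // Instead of a per-host HashMap of NODE_LOCAL tasks, we record a single
    // boolean if any task in the set had a node-level preference.
    class TaskSetSketch(taskPreferredHosts: Seq[Seq[String]]) {
      // Set once while building the pending lists; later used to decide
      // whether NODE_LOCAL scheduling should be attempted before NO_PREF.
      private[this] var hadNodeOnlyTasks = false

      // Tasks with no locality preference at all.
      private val pendingNoPref = new ArrayBuffer[Int]

      taskPreferredHosts.zipWithIndex.foreach { case (hosts, index) =>
        if (hosts.nonEmpty) hadNodeOnlyTasks = true
        else pendingNoPref += index
      }

      // One flag for the whole task set, rather than per-host state.
      def hasNodeLocalWork: Boolean = hadNodeOnlyTasks
    }
    ```

    The trade-off the review points at: the per-host map can answer "does host H 
still have unlaunched NODE_LOCAL tasks?", while the flag only answers "did this 
task set ever have node-only tasks?" — which, per the comment, is sufficient for 
the delay-scheduling case being discussed.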

