[GitHub] spark pull request: SPARK-1937: fix issue with task locality

lirui-intel Tue, 10 Jun 2014 08:19:43 -0700

Github user lirui-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/892#discussion_r13599131
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -388,7 +386,7 @@ private[spark] class TaskSetManager(
           val curTime = clock.getTime()
     
           var allowedLocality = getAllowedLocalityLevel(curTime)
    -      if (allowedLocality > maxLocality) {
    +      if (allowedLocality > maxLocality && myLocalityLevels.contains(maxLocality)) {
             allowedLocality = maxLocality   // We're not allowed to search for farther-away tasks
           }
    --- End diff --
    
    @mridulm - Thanks for replying. In my opinion, however, relaxing the allowed locality won't change the scheduling order: NODE_LOCAL tasks (if any) still get scheduled before RACK_LOCAL ones. And if we allow RACK_LOCAL but get a NODE_LOCAL task, currentLocalityIndex will be updated so that next time we use NODE_LOCAL as the constraint.
    However, if we restrict the search to PROCESS_LOCAL when that level is in fact not valid for the TaskSetManager, the NODE_LOCAL and RACK_LOCAL tasks will be skipped and we may end up picking tasks from pendingTasksWithNoPrefs.
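
    A minimal sketch of the two cases above, to make the argument concrete. This is not the actual TaskSetManager code: the `LocalitySketch` object, `clampOld`, `clampNew`, `pickTask` and the task names are all hypothetical simplifications. It assumes only that locality levels are ordered from closest (PROCESS_LOCAL) to farthest (ANY), and that the lookup falls back to tasks with no preference when nothing within the bound is found.

```scala
object LocalitySketch {
  // Simplified stand-in for Spark's TaskLocality: a smaller id means "closer".
  object Locality extends Enumeration {
    val PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, ANY = Value
  }

  // Clamp as it was before the patch: always cap at maxLocality.
  def clampOld(allowed: Locality.Value, max: Locality.Value): Locality.Value =
    if (allowed > max) max else allowed

  // Clamp as in the diff: cap only if maxLocality is a valid level for this task set.
  def clampNew(allowed: Locality.Value,
               max: Locality.Value,
               validLevels: Seq[Locality.Value]): Locality.Value =
    if (allowed > max && validLevels.contains(max)) max else allowed

  // Hypothetical lookup: try every level up to `bound`, closest first,
  // then fall back to tasks without a locality preference.
  def pickTask(bound: Locality.Value,
               pending: Map[Locality.Value, List[String]],
               noPrefs: List[String]): Option[String] = {
    val localized = Locality.values.toList
      .filter(_ <= bound)
      .flatMap(level => pending.getOrElse(level, Nil))
    localized.headOption.orElse(noPrefs.headOption)
  }
}
```

    With valid levels NODE_LOCAL, RACK_LOCAL and ANY, one pending task at each of NODE_LOCAL and RACK_LOCAL, one task with no preference, and an offer with maxLocality = PROCESS_LOCAL, the old clamp bounds the search at PROCESS_LOCAL and the sketch falls through to the no-preference task. The new clamp leaves the bound at NODE_LOCAL and the node-local task is picked. Passing RACK_LOCAL as the bound still returns the node-local task first, which is the "order is unchanged" point.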


