Github user mridulm commented on a diff in the pull request:

    https://github.com/apache/spark/pull/892#discussion_r13601836

--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---

    @@ -388,7 +386,7 @@ private[spark] class TaskSetManager(
         val curTime = clock.getTime()
         var allowedLocality = getAllowedLocalityLevel(curTime)
    -    if (allowedLocality > maxLocality) {
    +    if (allowedLocality > maxLocality && myLocalityLevels.contains(maxLocality)) {
           allowedLocality = maxLocality // We're not allowed to search for farther-away tasks
         }

--- End diff --

Back at my desktop, so I can elaborate better. The issue is that with the relaxed constraint for an executor, a task might get scheduled to it which would have been better scheduled on some other executor.

A simple scenario extending my earlier example: suppose there is only one task t1 left and two executors become available, where t1 is RACK_LOCAL on exec1 but NODE_LOCAL on exec2. We start with PROCESS_LOCAL as maxLocality, and suppose enough time has elapsed that allowedLocality == RACK_LOCAL or ANY.

In this case, if resourceOffer is called on exec1 first, we get a RACK_LOCAL schedule; if resourceOffer is called on exec2 first, we get a NODE_LOCAL schedule. The reason for that if condition was exactly to prevent this. I am actually surprised I did not have any testcase to catch this ...
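The order-dependence described above can be illustrated with a small standalone sketch. This is a hypothetical, heavily simplified model (in Python, not Spark's actual Scala code): `first_fit_schedule` stands in for the greedy behavior of repeated `resourceOffer` calls once `allowedLocality` has relaxed to RACK_LOCAL, and the executor names and locality table come from the example in the comment.

```python
# Hypothetical sketch, NOT Spark's implementation. Locality levels are
# ordered best (0) to worst (3), mirroring Spark's TaskLocality ordering.
PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, ANY = range(4)

def first_fit_schedule(offer_order, task_locality, allowed_locality):
    """Greedy model of repeated resourceOffer calls: hand the single
    remaining task to the first offered executor whose locality for the
    task is within allowed_locality."""
    for exec_id in offer_order:
        if task_locality[exec_id] <= allowed_locality:
            return exec_id, task_locality[exec_id]
    return None

# t1's locality per executor, from the example above.
task_locality = {"exec1": RACK_LOCAL, "exec2": NODE_LOCAL}
allowed = RACK_LOCAL  # delay-scheduling wait has already elapsed

# The resulting placement depends purely on offer order: offering exec1
# first yields a RACK_LOCAL schedule even though a strictly better
# NODE_LOCAL placement on exec2 was available.
print(first_fit_schedule(["exec1", "exec2"], task_locality, allowed))  # ('exec1', 2)
print(first_fit_schedule(["exec2", "exec1"], task_locality, allowed))  # ('exec2', 1)
```

The clamp to `maxLocality` in the original `if` condition prevents exactly this: with `maxLocality == PROCESS_LOCAL` in effect, neither executor would match on the first scheduling round, so the task is not grabbed by the rack-local executor merely because it was offered first.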