GitHub user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/892#discussion_r13500107
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -54,8 +54,15 @@ private[spark] class TaskSetManager(
     clock: Clock = SystemClock)
   extends Schedulable with Logging
 {
+  // Remember when this TaskSetManager is created
+  val creationTime = clock.getTime()
   val conf = sched.sc.conf
+  // The period we wait for new executors to come up.
+  // After this period, tasks in pendingTasksWithNoPrefs will be
+  // considered as PROCESS_LOCAL.
+  private val WAIT_NEW_EXEC_TIMEOUT =
+    conf.getLong("spark.scheduler.waitNewExecutorTime", 3000L)
--- End diff --
It doesn't make sense to put this here because it will apply to every
TaskSet, no matter how late into the application it was submitted, so you'll
get a 3-second latency on every TaskSet that is missing one of its preferred
nodes. Can we not add this as part of this patch, and simply make the change to
put tasks in the node- and rack-local lists even if no nodes are available in
those right now? Then later we can update the code that calls resourceOffer to
treat tasks that have preferred locations but are missing executors for them
specially.
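
To make that concrete, here is a rough sketch of the data-structure side of the suggestion (hypothetical, simplified names; the real TaskSetManager also tracks pendingTasksForExecutor and allPendingTasks, and its addPendingTask takes only a task index):

// Sketch only, not the actual patch: file a task under its preferred hosts
// and racks unconditionally, instead of demoting it to
// pendingTasksWithNoPrefs when no executor is alive on those hosts yet.
import scala.collection.mutable

case class TaskLocation(host: String)

class PendingTaskLists(rackForHost: String => Option[String]) {
  val pendingTasksForHost = new mutable.HashMap[String, mutable.ArrayBuffer[Int]]
  val pendingTasksForRack = new mutable.HashMap[String, mutable.ArrayBuffer[Int]]
  val pendingTasksWithNoPrefs = new mutable.ArrayBuffer[Int]

  private def listFor(
      map: mutable.HashMap[String, mutable.ArrayBuffer[Int]],
      key: String): mutable.ArrayBuffer[Int] =
    map.getOrElseUpdate(key, new mutable.ArrayBuffer[Int])

  def addPendingTask(index: Int, prefs: Seq[TaskLocation]): Unit = {
    if (prefs.isEmpty) {
      // Only tasks with no preferences at all go to the no-prefs list.
      pendingTasksWithNoPrefs += index
    } else {
      // Key point: no hasExecutorsAliveOnHost(loc.host) guard here, so the
      // task keeps its node- and rack-local entries even if the matching
      // executors have not come up yet.
      for (loc <- prefs) {
        listFor(pendingTasksForHost, loc.host) += index
        rackForHost(loc.host).foreach { rack =>
          listFor(pendingTasksForRack, rack) += index
        }
      }
    }
  }
}

// Example: "host2" has no live executor yet, but the task stays tracked as
// node-local for it rather than being dropped into the no-prefs list.
object Example extends App {
  val lists = new PendingTaskLists(rackForHost = _ => Some("rack1"))
  lists.addPendingTask(0, Seq(TaskLocation("host2")))
  println(lists.pendingTasksForHost("host2")) // ArrayBuffer(0)
  println(lists.pendingTasksForRack("rack1")) // ArrayBuffer(0)
}

With tasks filed this way, resourceOffer can start serving them at NODE_LOCAL or RACK_LOCAL as soon as matching executors register, and special handling for tasks whose preferred executors are missing can be layered on later without imposing a per-TaskSet timeout.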