Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/159#discussion_r10689463
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -59,6 +59,15 @@ private[spark] class TaskSetManager(
       // CPUs to request per task
       val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
     
    +  /*
    +   * Sometimes if an executor is dead or in an otherwise invalid state, the driver
    +   * does not realize right away leading to repeated task failures. If enabled,
    +   * this temporarily prevents a task from re-launching on an executor where
    +   * it just failed.
    +   */
    +  private[this] val EXECUTOR_TASK_BLACKLIST_TIMEOUT =
    +    conf.getLong("spark.task.executorBlacklistTimeout", 0L)
    --- End diff --
    
    Yup, that's entirely reasonable.
    
    And fwiw, I gave the naive replace-all-private-with-private[this] approach
    a shot, and it is more complicated than I anticipated.  I may stick with it
    for a while and end up with that separate PR, if for no other reason than
    that it is turning out to be a somewhat interesting way to probe Spark's
    internals.
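
    For context, here is a minimal sketch of one way a blanket swap of
    private for private[this] can go wrong in Scala 2. It is only an
    illustration: the Counter class and PrivateThisDemo object are hypothetical
    and not taken from Spark, and the comment above does not say which
    complications actually came up.

    ```scala
    // Hypothetical class for illustration only; not Spark code.
    class Counter(start: Int) {
      // Class-private: any Counter instance may read another Counter's `count`.
      private var count: Int = start

      // Object-private: only the owning instance may access `id`.
      private[this] val id: Int = start * 31

      def increment(): Unit = count += 1

      // Uses `id` on `this`, which is always allowed.
      def describe: String = s"Counter#$id"

      // Compiles, because `other.count` is permitted under plain `private`.
      def sameCount(other: Counter): Boolean = count == other.count

      // Would NOT compile if uncommented: `other.id` is inaccessible
      // because `id` is declared private[this].
      // def sameId(other: Counter): Boolean = id == other.id
    }

    object PrivateThisDemo {
      def main(args: Array[String]): Unit = {
        val a = new Counter(1)
        val b = new Counter(1)
        a.increment()
        println(a.sameCount(b)) // prints "false": counts are 2 and 1
        b.increment()
        println(a.sameCount(b)) // prints "true": both counts are 2
      }
    }
    ```

    If `count` were changed to private[this], `sameCount` would stop
    compiling, so any member that is read through another instance of the same
    class has to stay plain private.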


