[ https://issues.apache.org/jira/browse/SPARK-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065143#comment-14065143 ]
Bill Havanki commented on SPARK-2083:
-------------------------------------

Pull request available: https://github.com/apache/spark/pull/1465

(Please feel free to assign this ticket to me - I don't have that permission.)

> Allow local task to retry after failure.
> ----------------------------------------
>
>                 Key: SPARK-2083
>                 URL: https://issues.apache.org/jira/browse/SPARK-2083
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy
>    Affects Versions: 1.0.0
>            Reporter: Peng Cheng
>            Priority: Trivial
>              Labels: easyfix
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> If a job is submitted to run locally using masterURL = "local[X]", Spark will
> not retry a failed task, regardless of your "spark.task.maxFailures" setting.
> This design is meant to facilitate debugging and QA of Spark applications in
> which all tasks are expected to succeed and yield a result. Unfortunately, it
> also prevents a local job from finishing if any of its tasks cannot guarantee
> a result (e.g. a task that visits an external resource/API), and retrying
> inside the task is less favoured (e.g. when the task needs to be executed on
> a different computer in production).
> Users can, however, still set masterURL = "local[X,Y]" to override this
> (where Y is the local maxFailures), but this is undocumented and hard to
> manage. A quick fix would be to add a new configuration property
> "spark.local.maxFailures" with a default value of 1, so that users know
> exactly what to change when reading the documentation.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
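
The retry-limit resolution the ticket describes can be illustrated with a small, self-contained sketch. This is not Spark's implementation: the function name `local_max_failures`, the two regular expressions, and the plain dict standing in for the Spark configuration are all hypothetical, and "spark.local.maxFailures" is the property the ticket proposes, not one confirmed to exist. The sketch only shows the intended precedence: an explicit Y in "local[X,Y]" wins, otherwise the proposed property applies, otherwise the default of 1 (no retry).

```python
import re
from typing import Mapping, Optional

# Hypothetical patterns for the two local master URL forms named in the ticket.
LOCAL_N = re.compile(r"local\[([0-9]+)\]")                          # local[X]
LOCAL_N_FAILURES = re.compile(r"local\[([0-9]+)\s*,\s*([0-9]+)\]")  # local[X,Y]

# Default proposed in the ticket: 1 allowed failure, i.e. no retry.
DEFAULT_LOCAL_MAX_FAILURES = 1


def local_max_failures(master_url: str, conf: Mapping[str, str]) -> Optional[int]:
    """Return the failure limit for a local master URL, or None if the URL
    is not one of the two local forms handled by this sketch."""
    match = LOCAL_N_FAILURES.fullmatch(master_url)
    if match:
        # An explicit Y in local[X,Y] overrides everything else.
        return int(match.group(2))
    if LOCAL_N.fullmatch(master_url):
        # Fall back to the proposed property, then to the default of 1.
        return int(conf.get("spark.local.maxFailures", DEFAULT_LOCAL_MAX_FAILURES))
    return None
```

For example, `local_max_failures("local[4,3]", {})` yields 3, while `local_max_failures("local[4]", {})` yields the default of 1 unless the dict sets "spark.local.maxFailures". Note that in this sketch "spark.task.maxFailures" is deliberately ignored for local masters, mirroring the behavior the ticket reports.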