[ 
https://issues.apache.org/jira/browse/SPARK-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902996#comment-14902996
 ] 

Saisai Shao commented on SPARK-10739:
-------------------------------------

Yes, as Sandy mentioned, SPARK-6735 focuses on executor failures, whereas
this PR focuses on AM failures, so the two are different. Also, what I did is
pass this parameter to the YARN RM and let YARN control the attempt window.

> Add attempt window for long running Spark application on Yarn
> -------------------------------------------------------------
>
>                 Key: SPARK-10739
>                 URL: https://issues.apache.org/jira/browse/SPARK-10739
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>            Reporter: Saisai Shao
>            Priority: Minor
>
> Currently Spark on YARN uses a max attempts setting to limit the number of
> application failures: once the failure count reaches max attempts, the RM
> will not recover the application. This does not work well for long-running
> applications, which can easily exceed the limit over a long period of time,
> and setting a very large max attempts value would hide real problems.
> This issue therefore introduces an attempt window to control how application
> attempts are counted: attempts that fall outside the window are ignored.
> The attempt window was introduced in Hadoop 2.6+ to support long-running
> applications, and it is quite useful for applications such as Spark
> Streaming and the Spark shell.
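
To illustrate the counting rule described above, here is a minimal Python sketch. It is a simplified model and not the YARN RM's actual code: the function name `should_retry`, its parameters, and the exact boundary handling are assumptions made for illustration only.

```python
from typing import Sequence

HOUR_MS = 3_600_000  # one hour in milliseconds


def should_retry(failure_times_ms: Sequence[int], now_ms: int,
                 max_attempts: int, window_ms: int = -1) -> bool:
    """Return True if another application attempt would still be allowed.

    Hypothetical model: with window_ms <= 0 every past failure counts
    (the behavior without an attempt window); otherwise only failures
    that happened within the last window_ms milliseconds count.
    """
    if window_ms > 0:
        # Ignore failures that fall outside the attempt window.
        counted = [t for t in failure_times_ms if now_ms - t <= window_ms]
    else:
        counted = list(failure_times_ms)
    return len(counted) < max_attempts


# A long-running job that failed at hours 0, 30, and 60, with max attempts = 2.
failures = [0, 30 * HOUR_MS, 60 * HOUR_MS]
now = 60 * HOUR_MS

without_window = should_retry(failures, now, max_attempts=2)
with_window = should_retry(failures, now, max_attempts=2, window_ms=HOUR_MS)
```

In this model, `without_window` is False because all three failures count against the limit, while `with_window` is True because only the most recent failure falls inside the one-hour window.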



