[
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618885#comment-14618885
]
Neelesh Srinivas Salian commented on SPARK-7736:
------------------------------------------------
My 2 cents:
For a YARN job to be marked as failed, the ApplicationMaster running the driver needs to fail.
Scenarios:
1) The ApplicationMaster fails once, YARN retries it, and the retry succeeds
if the exception has been handled correctly. This results in a successful
YARN job (assuming the child tasks (executors) also succeeded).
2) The retries also fail, and the YARN job fails completely.
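The two scenarios above can be sketched as a toy model. This is only an illustration of the reasoning, not YARN's actual logic; the function name final_yarn_status and the max_attempts parameter are hypothetical.

```python
def final_yarn_status(attempt_outcomes, max_attempts=2):
    """Toy model of the two scenarios (not YARN's real implementation).

    attempt_outcomes: list of booleans, True if that ApplicationMaster
                      attempt succeeded (hypothetical input).
    max_attempts:     hypothetical limit on how many attempts are made.
    """
    # Only the first max_attempts attempts are ever made.
    for succeeded in attempt_outcomes[:max_attempts]:
        if succeeded:
            # Scenario 1: some attempt succeeded, so the job is successful.
            return "SUCCEEDED"
    # Scenario 2: every allowed attempt failed.
    return "FAILED"
```

For example, final_yarn_status([False, True]) models scenario 1 and final_yarn_status([False, False]) models scenario 2.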
The Spark application needs to cause a failure in YARN for the job to be marked as a failure.
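As a minimal sketch of what "causing a failure" could look like on the Python side, the driver script can turn any uncaught exception into a non-zero exit code. The names run_driver and failing_job are hypothetical, and whether a non-zero exit from the Python process is actually reported to YARN as FAILED depends on the Spark version; this sketch does not verify that.

```python
import traceback


def run_driver(main):
    """Run main() and return an exit code: 0 on success, 1 on any exception."""
    try:
        main()
    except Exception:
        # Log the failure so it shows up in the driver's stderr.
        traceback.print_exc()
        return 1
    return 0


def failing_job():
    # Hypothetical stand-in for work done after the SparkContext is created.
    raise ValueError("job failed")


# In a real script one would end with something like:
#     sys.exit(run_driver(main))
```

Here run_driver(failing_job) returns 1, so a script that passes that value to sys.exit would terminate with a failing exit status.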
Moreover, the ApplicationMaster.java code in the Hadoop project should help:
/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
Reference:
[1]
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
So, I would say this is expected behavior.
Hope that helps.
Please add to this or correct me if needed.
> Exception not failing Python applications (in yarn cluster mode)
> ----------------------------------------------------------------
>
> Key: SPARK-7736
> URL: https://issues.apache.org/jira/browse/SPARK-7736
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Environment: Spark 1.3.1, Yarn 2.7.0, Ubuntu 14.04
> Reporter: Shay Rojansky
>
> It seems that exceptions thrown in Python spark apps after the SparkContext
> is instantiated don't cause the application to fail, at least in Yarn: the
> application is marked as SUCCEEDED.
> Note that any exception right before the SparkContext correctly places the
> application in FAILED state.