[ https://issues.apache.org/jira/browse/SPARK-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174507#comment-14174507 ]
Marcelo Vanzin commented on SPARK-3877:
---------------------------------------

[~tgraves] this can be seen as a subset of SPARK-2167, but as I mentioned on that bug, I don't think it's fixable for all cases. SparkSubmit executes user code, so it can only report errors when the user code does. For example, a job like this would report an error today:

{code}
val sc = ...
try {
  // do stuff
  if (somethingBad) throw new MyJobFailedException()
} finally {
  sc.stop()
}
{code}

But this one wouldn't, because the user code swallows the exception:

{code}
val sc = ...
try {
  // do stuff
  if (somethingBad) throw new MyJobFailedException()
} catch {
  case e: Exception => logError("Oops, something bad happened.", e)
} finally {
  sc.stop()
}
{code}

yarn-client mode will abruptly stop the SparkContext when the YARN app fails, but depending on how the user's {{main()}} handles errors, that still may not result in a non-zero exit status.

> The exit code of spark-submit is still 0 when a YARN application fails
> -----------------------------------------------------------------------
>
>                 Key: SPARK-3877
>                 URL: https://issues.apache.org/jira/browse/SPARK-3877
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>            Reporter: Shixiong Zhu
>            Priority: Minor
>              Labels: yarn
>
> When a YARN application fails (yarn-cluster mode), the exit code of
> spark-submit is still 0. This makes it hard to write automation scripts that
> run Spark jobs on YARN, because the failure cannot be detected in those
> scripts.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
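The automation-script failure mode the reporter describes can be sketched in shell. This is a minimal illustration, not Spark code: `fake_spark_submit` and `check_job` are hypothetical names standing in for an actual `spark-submit --master yarn-cluster` invocation, with the argument simulating the exit status spark-submit reports.

```shell
#!/bin/sh
# Sketch of the problem from the report: automation scripts branch on the
# exit status of spark-submit, so if it exits 0 for a failed YARN
# application (SPARK-3877), the failure branch is never taken.

fake_spark_submit() {
  # Hypothetical stand-in for:
  #   spark-submit --master yarn-cluster --class MyJob my-job.jar
  # Pass `true` or `false` to simulate the exit status it reports.
  "$@"
}

check_job() {
  if fake_spark_submit "$@"; then
    echo "job succeeded"
  else
    echo "job failed"
    return 1
  fi
}

check_job true          # prints "job succeeded"
check_job false || true # prints "job failed"; check_job itself returns 1
```

With the bug, the real spark-submit behaves like `fake_spark_submit true` even when the YARN application fails, which is why scripts like `check_job` cannot detect the failure.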