[jira] [Commented] (LIVY-712) EMR 5.23/5.27 - Livy does not recognise that Spark job failed
[ https://issues.apache.org/jira/browse/LIVY-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984437#comment-16984437 ] Michal Sankot commented on LIVY-712:

It seems the issue was present in EMR 5.23/5.27 (Hadoop libraries 2.8.5-amzn-3/2.8.5-amzn-4) and is no longer present in EMR 5.28 (Hadoop libraries 2.8.5-amzn-5). I'm thus closing the issue.

> EMR 5.23/5.27 - Livy does not recognise that Spark job failed
>
> Key: LIVY-712
> URL: https://issues.apache.org/jira/browse/LIVY-712
> Project: Livy
> Issue Type: Bug
> Components: API
> Affects Versions: 0.5.0, 0.6.0
> Environment: AWS EMR 5.23/5.27, Scala
> Reporter: Michal Sankot
> Priority: Major
> Labels: EMR, api, spark
>
> We've upgraded from AWS EMR 5.13 to 5.23 (Livy 0.4.0 -> 0.5.0, Spark 2.3.0 -> 2.4.0), and an issue appeared: when an exception is thrown during Spark job execution, Spark shuts down as if there were no problem and the job appears as Completed in EMR. So we're not notified when the system crashes. The same problem appears in EMR 5.27 (Livy 0.6.0, Spark 2.4.4).
> Is it something with Spark? Or a known issue with Livy?
> In the Livy logs I see that spark-submit exits with error code 1:
> {quote}{{05:34:59 WARN BatchSession$: spark-submit exited with code 1}}
> {quote}
> And then the Livy API states that the batch state is:
> {quote}{{"state": "success"}}
> {quote}
> How can it be made to work again?

-- This message was sent by Atlassian Jira (v8.3.4#803005)
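Until upgrading to an unaffected EMR release, a caller can cross-check the quoted log line against the reported state. A minimal client-side sketch of that guard (the `batch_failed` helper and its heuristic are illustrative, not part of Livy's API; the log text would come from Livy's GET /batches/{batchId}/log endpoint):

```python
# Client-side guard for the mismatch described above: Livy reports
# "state": "success" even though spark-submit exited with code 1.
# The helper name and heuristic are illustrative, not part of Livy.

def batch_failed(state: str, log_lines: list) -> bool:
    """Return True when the batch should be treated as failed."""
    if state != "success":
        # Livy's own terminal failure states.
        return state in ("dead", "error", "killed")
    # Workaround: even on "success", scan the session log
    # (GET /batches/{batchId}/log) for a nonzero spark-submit exit code.
    return any(
        "spark-submit exited with code" in line
        and not line.rstrip().endswith("code 0")
        for line in log_lines
    )
```

In the reported scenario, `batch_failed("success", ["05:34:59 WARN BatchSession$: spark-submit exited with code 1"])` returns True, flagging the batch despite the misreported state.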
[jira] [Commented] (LIVY-712) EMR 5.23/5.27 - Livy does not recognise that Spark job failed
[ https://issues.apache.org/jira/browse/LIVY-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982583#comment-16982583 ] Michal Sankot commented on LIVY-712:

After further investigation, it seems the problem is unrelated to Livy, and is instead an issue with the AWS-custom hadoop-* libraries used in EMR. I'll keep you posted with the results of further investigation.
[jira] [Commented] (LIVY-712) EMR 5.23/5.27 - Livy does not recognise that Spark job failed
[ https://issues.apache.org/jira/browse/LIVY-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981471#comment-16981471 ] Michal Sankot commented on LIVY-712:

Sure:
* create a Scala Spark job that throws a NullPointerException (to make sure the job fails)
* submit the job through Livy to AWS EMR 5.23/5.27
* check the state of the job execution through the Livy API

Instead of a failed state, it reports "state": "success". Are those steps sufficient to reproduce it, or would you need more detail?
[jira] [Commented] (LIVY-712) EMR 5.23/5.27 - Livy does not recognise that Spark job failed
[ https://issues.apache.org/jira/browse/LIVY-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979272#comment-16979272 ] Yiheng Wang commented on LIVY-712:

Can you provide a way to reproduce the issue?
[jira] [Commented] (LIVY-712) EMR 5.23/5.27 - Livy does not recognise that Spark job failed
[ https://issues.apache.org/jira/browse/LIVY-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979271#comment-16979271 ] Yiheng Wang commented on LIVY-712:

This code was changed in this patch: https://github.com/apache/incubator-livy/commit/ca4cad22968e1a2f88fa0ec262c1088812e3d251

[~jshao] Any suggestions about this?