[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2017-12-07 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282764#comment-16282764
 ] 

Marcelo Vanzin commented on SPARK-7736:
---

Make sure you are all running apps in cluster mode if you want to see the
proper status. I just ran a failing PySpark app in cluster mode to
double-check, and everything looks fine.

> Exception not failing Python applications (in yarn cluster mode)
> 
>
> Key: SPARK-7736
> URL: https://issues.apache.org/jira/browse/SPARK-7736
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
> Environment: Spark 1.3.1, Yarn 2.7.0, Ubuntu 14.04
>Reporter: Shay Rojansky
>Assignee: Marcelo Vanzin
> Fix For: 1.5.1, 1.6.0
>
>
> It seems that exceptions thrown in Python spark apps after the SparkContext 
> is instantiated don't cause the application to fail, at least in Yarn: the 
> application is marked as SUCCEEDED.
> Note that any exception right before the SparkContext correctly places the 
> application in FAILED state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2017-12-01 Thread Dmitriy Reshetnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275066#comment-16275066
 ] 

Dmitriy Reshetnikov commented on SPARK-7736:


Spark 2.2 is still facing this issue.
In my case, Azkaban executes a Spark job, and the finalStatus of this job in the
Resource Manager is SUCCESS in any case.




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2017-03-15 Thread Yash Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15927240#comment-15927240
 ] 

Yash Sharma commented on SPARK-7736:


This does not seem fixed. The application still completes with SUCCESS status 
even when an exception is thrown from the application.
Spark version: 2.0.2.




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-10-11 Thread Shay Rojansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952263#comment-14952263
 ] 

Shay Rojansky commented on SPARK-7736:
--

I have just tested this with Spark 1.5.1 on YARN 2.7.1 and the problem is still 
there: an exception thrown after the SparkContext has been created terminates 
the application, but YARN reports it as succeeded.




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-09-28 Thread Zsolt Tóth (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14910186#comment-14910186
 ] 

Zsolt Tóth commented on SPARK-7736:
---

Created SPARK-10851.




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-09-25 Thread Shivaram Venkataraman (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908389#comment-14908389
 ] 

Shivaram Venkataraman commented on SPARK-7736:
--

[~ztoth] Could you open a new JIRA for the SparkR problem?




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-09-25 Thread Zsolt Tóth (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908001#comment-14908001
 ] 

Zsolt Tóth commented on SPARK-7736:
---

As far as I can see, this is also a problem for SparkR applications in 
yarn-cluster mode. Is there an open JIRA for that?




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-08-17 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700451#comment-14700451
 ] 

Apache Spark commented on SPARK-7736:
-

User 'vanzin' has created a pull request for this issue:
https://github.com/apache/spark/pull/8258




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-08-03 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652429#comment-14652429
 ] 

Apache Spark commented on SPARK-7736:
-

User 'vanzin' has created a pull request for this issue:
https://github.com/apache/spark/pull/7751




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-07-10 Thread Shay Rojansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621832#comment-14621832
 ] 

Shay Rojansky commented on SPARK-7736:
--

Neelesh, I'm not sure I understood exactly what you're saying... I agree with 
Esben that, at the end of the day, if a Spark application fails (by throwing an 
exception), and does so on all YARN application attempts, then the YARN status 
of that application should definitely be FAILED...




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-07-09 Thread Esben S. Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620021#comment-14620021
 ] 

Esben S. Nielsen commented on SPARK-7736:
-

Thanks for the comment. I don't understand how it applies here, however, as 
both of the listed pyspark programs (in my understanding) should result in 
step 2) of your scenario:

p1) Unhandled exception raised before SparkContext initialization:
---
from pyspark import SparkContext
raise Exception('Fail')
sc = SparkContext(appName='raise_seen_by_yarn')
---
This results in an AM retry (2 AM tries in total, as per the YARN default) and 
the application's YARN status subsequently being marked FAILED. This is what I 
expect for an AM that is designed to fail.

p2) Unhandled exception raised after SparkContext initialization:
---
from pyspark import SparkContext
sc = SparkContext(appName='raise_not_seen_by_yarn')
raise Exception('Fail')
---
This results in the application being marked SUCCEEDED (1 AM try in total), 
which is not what I expect for an AM that is designed to fail.

I've looked through the Spark documentation for any special action needed to 
signal failure to YARN, but I haven't found anything. And looking at 
src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala : L118, 
where all sys.exit calls are considered successful termination regardless of 
exit code, I can't see a way to signal failure to YARN after SparkContext 
initialization.

Both p1 and p2 return a non-zero exit code when run with spark-submit 
--master yarn-client, which is what I would expect.
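The non-zero exit code behavior can be illustrated in plain Python (a generic sketch of the mechanism, not spark-submit itself): an unhandled exception in a child Python process surfaces as a non-zero exit code, which is what a client-mode launcher can observe.

```python
import subprocess
import sys

# A child process that raises an unhandled exception exits non-zero,
# so a launcher inspecting the exit code can detect the failure.
failing = subprocess.run(
    [sys.executable, "-c", "raise Exception('Fail')"],
    capture_output=True,
)
# A child process that finishes normally exits with code 0.
succeeding = subprocess.run(
    [sys.executable, "-c", "pass"],
    capture_output=True,
)
print(failing.returncode)     # non-zero (1 for an uncaught exception)
print(succeeding.returncode)  # 0
```

This is exactly the signal yarn-client mode has access to; in cluster mode the AM apparently discards it.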





[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-07-08 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618885#comment-14618885
 ] 

Neelesh Srinivas Salian commented on SPARK-7736:


My 2 cents:

For a YARN application to be marked failed, the ApplicationMaster running the 
driver needs to fail.

Scenario:
1) It fails once, YARN retries, and the retry succeeds if the exception has 
been handled correctly. This results in a successful YARN job (assuming the 
child tasks (executors) succeeded).
2) The retries fail and the YARN job fails completely.
You need the Spark application to cause a failure in YARN for it to be marked 
as a failure.
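The retry semantics described in the scenario can be condensed into a minimal, hypothetical Python function (an illustration of the behavior as described, not YARN's actual code): the final status is FAILED only if every AM attempt up to the retry limit fails.

```python
def final_status(attempt_results, max_attempts=2):
    """Hypothetical sketch: attempt_results is an iterable of booleans
    (True = that AM attempt succeeded); max_attempts mirrors the YARN
    default of 2 AM tries."""
    for i, ok in enumerate(attempt_results):
        if ok:
            # One successful attempt is enough for SUCCEEDED.
            return "SUCCEEDED"
        if i + 1 >= max_attempts:
            break
    return "FAILED"

print(final_status([False, True]))   # SUCCEEDED (the retry recovered)
print(final_status([False, False]))  # FAILED (all attempts failed)
```

The bug under discussion is that a Python exception after SparkContext initialization never registers as a failed attempt in the first place, so the retry machinery is never exercised.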

Moreover, the ApplicationMaster.java code at 
/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
 in the Hadoop project should help. 

Reference: 
[1] 
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
 

So, I would say this is expected behavior.
Hope that helps. 

Please add/correct me if needed.




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-07-08 Thread Esben S. Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618110#comment-14618110
 ] 

Esben S. Nielsen commented on SPARK-7736:
-

Platform: spark 1.3.0, CDH 5.4.1

To reproduce with pyspark:

---
from pyspark import SparkContext
with SparkContext(appName='raise_uncaught') as sc:
    raise Exception('Fail')
---
$ spark-submit --master yarn-cluster /path/to/my/pythonscript.py

This ends up with the following YARN status:
State: FINISHED
FinalStatus: SUCCEEDED
Diagnostics: Shutdown hook called before final status was reported.

If the exception is thrown before the SparkContext is initialized, the YARN 
status displays as expected:
---
from pyspark import SparkContext
raise Exception('Fail')
with SparkContext(appName='raise_caught') as sc:
    pass
---

This ends up with the following YARN status:
State: FAILED
FinalStatus: FAILED
Diagnostics: trace

It seems (from the Diagnostics message) that 
https://github.com/apache/spark/blob/19834fa9184f0365a160bcb54bcd33eaa87c70dc/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
 : L118 is hit when exceptions are raised after initializing the SparkContext. 
This also means applications are not retried when failures happen after 
SparkContext initialization.
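The failure mode described here can be sketched in plain Python (a hypothetical illustration, not Spark's actual Scala code): if the shutdown path reports a final status that defaults to SUCCEEDED, an unhandled exception in user code never changes what YARN sees; recording the failure before the status is reported is the essence of the eventual fix. The names `run_application`, `failing_main`, and `handle_failure` are invented for this sketch.

```python
def run_application(user_main, handle_failure):
    """Returns the final status the 'shutdown hook' would report to YARN."""
    final_status = "SUCCEEDED"  # default, as in the behavior described above
    try:
        user_main()
    except Exception:
        if handle_failure:
            # The essence of the fix: record the failure before the
            # shutdown hook reports the status to YARN.
            final_status = "FAILED"
        # Without the handler, the exception is effectively invisible to
        # the status-reporting path and the default SUCCEEDED survives.
    return final_status

def failing_main():
    raise Exception("Fail")

# Buggy behavior: the exception never reaches the reporter.
print(run_application(failing_main, handle_failure=False))  # SUCCEEDED
# Fixed behavior: the failure is recorded before reporting.
print(run_application(failing_main, handle_failure=True))   # FAILED
```

The "Shutdown hook called before final status was reported" diagnostic above matches the first case: the hook fires while the status still holds its default.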





[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-06-25 Thread Shay Rojansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14601923#comment-14601923
 ] 

Shay Rojansky commented on SPARK-7736:
--

The problem is simply with the YARN status for the application. If a Spark 
application throws an exception after having instantiated the SparkContext, the 
application obviously terminates but YARN lists the job as SUCCEEDED. This 
makes it hard for users to see what happened to their jobs in the YARN UI.

Let me know if this is still unclear.




[jira] [Commented] (SPARK-7736) Exception not failing Python applications (in yarn cluster mode)

2015-06-25 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14601914#comment-14601914
 ] 

Neelesh Srinivas Salian commented on SPARK-7736:


Could you add more context to the issue? 
What return value / output do you expect from the applications?


