[GitHub] spark pull request #20529: [SPARK-23350][SS]Bug fix for exception handling w...

2018-08-14 Thread yanlin-Lynn
Github user yanlin-Lynn closed the pull request at:

https://github.com/apache/spark/pull/20529


---




[GitHub] spark pull request #20529: [SPARK-23350][SS]Bug fix for exception handling w...

2018-02-09 Thread yanlin-Lynn
Github user yanlin-Lynn commented on a diff in the pull request:

https://github.com/apache/spark/pull/20529#discussion_r167173873
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala ---
@@ -92,12 +92,14 @@ case class WriteToDataSourceV2Exec(writer: DataSourceWriter, query: SparkPlan) e
         logInfo(s"Data source writer $writer committed.")
       }
     } catch {
-      case _: InterruptedException if writer.isInstanceOf[StreamWriter] =>
-        // Interruption is how continuous queries are ended, so accept and ignore the exception.
+      case _: SparkException if writer.isInstanceOf[StreamWriter] =>
--- End diff --

I found that when stopping a continuous processing application with Ctrl-C, it always throws a SparkException. What I am trying to fix is to suppress this exception when stopping a CP application. But this is indeed not a good way to fix it; I will try another way (one possible narrower guard is sketched after the stack trace below).
--- The SparkException is as follows ---
18/02/02 16:12:57 ERROR ContinuousExecution: Query yanlin-CP-job [id = 007f1f44-771a-4097-aaa3-28ae35c16dd9, runId = 3e1ab7c1-4d6f-475a-9d2c-45577643b0dd] terminated with error
org.apache.spark.SparkException: Writing job failed.
    at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2.scala:105)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$3$$anonfun$apply$1.apply(ContinuousExecution.scala:268)
    at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$3$$anonfun$apply$1.apply(ContinuousExecution.scala:268)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
    at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$3.apply(ContinuousExecution.scala:268)
    at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$3.apply(ContinuousExecution.scala:268)
    at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)
    at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
    at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runContinuous(ContinuousExecution.scala:266)
    at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runActivatedStream(ContinuousExecution.scala:90)
    at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189)
Caused by: org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:837)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:835)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
    at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:835)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1831)
    at org.apache.spark.util.EventLoop.stop(EventLoop.scala:83)
    at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1743)
    at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1924)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1923)
    at org.apache.spark.SparkContext$$anonfun$2.apply$mcV$sp(SparkContext.scala:572)
    at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
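
Not the fix itself, just a sketch of the narrower direction hinted at above: instead of swallowing every SparkException, walk the cause chain and only ignore the cancellation produced when the SparkContext is shut down (the "cancelled because SparkContext was shut down" message in the trace). The snippet is plain Scala with no Spark dependency; the helper name and the message check are my own assumptions, not Spark APIs.

```scala
import scala.annotation.tailrec

object NarrowCatchSketch {

  // Walk the cause chain looking for the cancellation message attached when
  // the SparkContext is shut down (see the stack trace above).
  @tailrec
  def isShutdownCancellation(t: Throwable): Boolean =
    if (t == null) false
    else if (Option(t.getMessage).exists(_.contains("cancelled because SparkContext was shut down"))) true
    else isShutdownCancellation(t.getCause)

  def main(args: Array[String]): Unit = {
    // Simulate the two shapes with plain RuntimeExceptions: a shutdown
    // cancellation and a genuine write failure.
    val shutdown = new RuntimeException("Writing job failed.",
      new RuntimeException("Job 0 cancelled because SparkContext was shut down"))
    val realFailure = new RuntimeException("Writing job failed.",
      new RuntimeException("Task failed while writing rows"))

    for (e <- Seq(shutdown, realFailure)) {
      try throw e
      catch {
        // Narrow guard: only the shutdown cancellation is ignored.
        case ex: RuntimeException if isShutdownCancellation(ex) =>
          println(s"Ignored expected shutdown cancellation: ${ex.getCause.getMessage}")
        case ex: RuntimeException =>
          println(s"Would rethrow real failure: ${ex.getCause.getMessage}")
      }
    }
  }
}
```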

[GitHub] spark pull request #20529: [SPARK-23350][SS]Bug fix for exception handling w...

2018-02-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/20529#discussion_r166857279
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala ---
@@ -92,12 +92,14 @@ case class WriteToDataSourceV2Exec(writer: DataSourceWriter, query: SparkPlan) e
         logInfo(s"Data source writer $writer committed.")
       }
     } catch {
-      case _: InterruptedException if writer.isInstanceOf[StreamWriter] =>
-        // Interruption is how continuous queries are ended, so accept and ignore the exception.
+      case _: SparkException if writer.isInstanceOf[StreamWriter] =>
--- End diff --

I agree with @srowen that `SparkException` swallows too much. Also, you changed both this line and the lines below; I'm not sure which one is your intended fix.


---




[GitHub] spark pull request #20529: [SPARK-23350][SS]Bug fix for exception handling w...

2018-02-07 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/20529#discussion_r166768784
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala ---
@@ -92,12 +92,14 @@ case class WriteToDataSourceV2Exec(writer: DataSourceWriter, query: SparkPlan) e
         logInfo(s"Data source writer $writer committed.")
       }
     } catch {
-      case _: InterruptedException if writer.isInstanceOf[StreamWriter] =>
-        // Interruption is how continuous queries are ended, so accept and ignore the exception.
+      case _: SparkException if writer.isInstanceOf[StreamWriter] =>
--- End diff --

This seems like it swallows too much now. Any error is interpreted as something to ignore. That doesn't sound right, nor is it consistent with the comment. I don't know this code well, though.


---




[GitHub] spark pull request #20529: [SPARK-23350][SS]Bug fix for exception handling w...

2018-02-07 Thread yanlin-Lynn
GitHub user yanlin-Lynn opened a pull request:

https://github.com/apache/spark/pull/20529

[SPARK-23350][SS]Bug fix for exception handling when stopping continu…

## What changes were proposed in this pull request?

A SparkException happens when stopping a continuous processing application with Ctrl-C in standalone mode. Stopping a continuous processing application throws a SparkException, not the InterruptedException expected by the code in [WriteToDataSourceV2](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala#L95).

## How was this patch tested?

manual tests
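
Not part of the patch: for context, a minimal sketch of the kind of continuous processing job used for the manual test. The rate source, console sink, and trigger interval here are illustrative assumptions. Killing the driver with Ctrl-C fires the SparkContext shutdown hook and produces the SparkException shown in the review discussion, whereas query.stop() ends the query through interruption.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object ContinuousStopRepro {
  def main(args: Array[String]): Unit = {
    // Master and deploy mode are expected to come from spark-submit in standalone mode.
    val spark = SparkSession.builder()
      .appName("continuous-stop-repro")
      .getOrCreate()

    val query = spark.readStream
      .format("rate")                        // built-in source that supports continuous mode
      .load()
      .writeStream
      .format("console")
      .trigger(Trigger.Continuous("1 second"))
      .start()

    // Pressing Ctrl-C here stops the SparkContext via the shutdown hook, the running
    // job is cancelled, and the writer sees a SparkException instead of the
    // InterruptedException that query.stop() would cause.
    query.awaitTermination()
  }
}
```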


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanlin-Lynn/spark SPARK-23350

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20529.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20529


commit f1106f33c583f9cba0fd31a5783e12ab211d7e81
Author: wangyanlin01 
Date:   2018-02-07T11:13:12Z

[SPARK-23350][SS]Bug fix for exception handling when stopping continuous 
processing application




---
