[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...

2018-02-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20602#discussion_r168060518
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
     //                               exception
     // UncheckedExecutionException - thrown by codes that cannot throw a checked
     //                               ExecutionException, such as BiFunction.apply
-    case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+    // SparkException - thrown if the interrupt happens in the middle of an RPC wait
--- End diff --

does it mean this issue has nothing to do with `WriteToDataSourceV2Exec`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...

2018-02-13 Thread jose-torres
Github user jose-torres commented on a diff in the pull request:

https://github.com/apache/spark/pull/20602#discussion_r167998592
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
 ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
 //  exception
 // UncheckedExecutionException - thrown by codes that cannot throw 
a checked
 //   ExecutionException, such as 
BiFunction.apply
-case e2 @ (_: UncheckedIOException | _: ExecutionException | _: 
UncheckedExecutionException)
+// SparkException - thrown if the interrupt happens in the middle 
of an RPC wait
+case e2 @ (_: UncheckedIOException |
+   _: ExecutionException |
+   _: UncheckedExecutionException |
+   _: SparkException)
--- End diff --

SGTM. I'll close this then.


---




[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...

2018-02-13 Thread jose-torres
Github user jose-torres closed the pull request at:

https://github.com/apache/spark/pull/20602


---




[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...

2018-02-13 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20602#discussion_r167995867
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
     //                               exception
     // UncheckedExecutionException - thrown by codes that cannot throw a checked
     //                               ExecutionException, such as BiFunction.apply
-    case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+    // SparkException - thrown if the interrupt happens in the middle of an RPC wait
+    case e2 @ (_: UncheckedIOException |
+               _: ExecutionException |
+               _: UncheckedExecutionException |
+               _: SparkException)
--- End diff --

Not really. But as you said, relying on a list of exceptions to be thrown is a bit fragile. Can't we rethink this and make a more long-term change, rather than a quick (and, it seems to me, a bit unsafe) fix? Currently this is only causing test flakiness, so I don't think there is a great need to fix it immediately.


---




[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...

2018-02-13 Thread jose-torres
Github user jose-torres commented on a diff in the pull request:

https://github.com/apache/spark/pull/20602#discussion_r167994087
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
     //                               exception
     // UncheckedExecutionException - thrown by codes that cannot throw a checked
     //                               ExecutionException, such as BiFunction.apply
-    case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+    // SparkException - thrown if the interrupt happens in the middle of an RPC wait
+    case e2 @ (_: UncheckedIOException |
+               _: ExecutionException |
+               _: UncheckedExecutionException |
+               _: SparkException)
--- End diff --

I agree with the worry, but I'm not sure I see a better solution.

Other alternatives I can think of are matching against the specific exception message string, or changing ThreadUtils.awaitResult() to throw a custom exception. Do you have any thoughts?
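
The custom-exception alternative mentioned above could be sketched as follows. This is only an illustration, not Spark code: `AwaitInterruptedException` and `SketchThreadUtils` are hypothetical names, and the real `ThreadUtils.awaitResult()` wraps failures in a `SparkException` instead.

```scala
import scala.concurrent.{Await, Awaitable}
import scala.concurrent.duration.Duration

// Hypothetical dedicated exception type: callers could match on this
// directly, instead of on a generic SparkException whose cause happens
// to be an InterruptedException.
class AwaitInterruptedException(cause: InterruptedException)
  extends RuntimeException("await interrupted", cause)

object SketchThreadUtils {
  // Sketch of an awaitResult() that rethrows interrupts as a distinct
  // exception type rather than a generic wrapper.
  def awaitResult[T](awaitable: Awaitable[T], atMost: Duration): T = {
    try {
      Await.result(awaitable, atMost)
    } catch {
      case ie: InterruptedException => throw new AwaitInterruptedException(ie)
    }
  }
}
```

With a dedicated type, the `StreamExecution` whitelist could match on `AwaitInterruptedException` alone, without the risk of swallowing unrelated `SparkException`s.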


---




[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...

2018-02-13 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20602#discussion_r167993164
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
     //                               exception
     // UncheckedExecutionException - thrown by codes that cannot throw a checked
     //                               ExecutionException, such as BiFunction.apply
-    case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+    // SparkException - thrown if the interrupt happens in the middle of an RPC wait
+    case e2 @ (_: UncheckedIOException |
+               _: ExecutionException |
+               _: UncheckedExecutionException |
+               _: SparkException)
--- End diff --

I am especially not sure about this one. A SparkException can be thrown in a great number of cases that aren't necessarily caused by the stop operation, so I don't think it is a good idea to add it.


---




[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...

2018-02-13 Thread jose-torres
GitHub user jose-torres opened a pull request:

https://github.com/apache/spark/pull/20602

[SPARK-23416][SS] handle streaming interrupts in ThreadUtils.awaitResult()

## What changes were proposed in this pull request?

StreamExecution.isInterruptedByStop() implements a whitelist of exceptions which indicate a benign stop() call. The DataSourceV2 write path introduces a new kind of exception, a SparkException caused by an InterruptedException, which must be added to this whitelist. (This exception comes from an interrupt in ThreadUtils.awaitResult().)
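
The whitelist check described above could be sketched as below. This is a simplified, self-contained illustration, not the actual Spark code: `SparkException` here is a local stand-in for `org.apache.spark.SparkException`, the `InterruptCheck` object is a hypothetical container, the real method also checks the stream's state, and Guava's `UncheckedExecutionException` is omitted from the match.

```scala
import java.io.UncheckedIOException
import java.util.concurrent.ExecutionException

// Local stand-in for org.apache.spark.SparkException, so the sketch is
// self-contained; the real class lives in spark-core.
class SparkException(message: String, cause: Throwable)
  extends Exception(message, cause)

object InterruptCheck {
  // Sketch: an exception counts as a benign stop() interrupt if it is an
  // InterruptedException itself, or one of the known wrapper types whose
  // cause chain bottoms out in an InterruptedException.
  def isInterruptedByStop(e: Throwable): Boolean = e match {
    case _: InterruptedException => true
    case e2 @ (_: UncheckedIOException |
               _: ExecutionException |
               _: SparkException) =>
      e2.getCause != null && isInterruptedByStop(e2.getCause)
    case _ => false
  }
}
```

Walking the cause chain rather than matching only the outer type is what lets the SparkException thrown from an interrupted ThreadUtils.awaitResult() be recognized as a benign stop.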

## How was this patch tested?

Existing unit tests. Unfortunately, the underlying flakiness is reasonably rare, so I don't have a good idea for how to test that this resolves it.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jose-torres/spark SPARK-23416

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20602.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20602


commit 24a621e1c30ca02436a6725c5646d216bf2d7118
Author: Jose Torres 
Date:   2018-02-13T19:50:01Z

add accommodation for ThreadUtils.awaitResult()




---
