[GitHub] spark pull request #20602: [SPARK-23416][SS] handle streaming interrupts in ...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20602#discussion_r168060518

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
           // exception
           // UncheckedExecutionException - thrown by codes that cannot throw a checked
           // ExecutionException, such as BiFunction.apply
-          case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+          // SparkException - thrown if the interrupt happens in the middle of an RPC wait
--- End diff --

    does it mean this issue has nothing to do with `WriteToDataSourceV2Exec`?

---

- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
- For additional commands, e-mail: reviews-h...@spark.apache.org
Github user jose-torres commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20602#discussion_r167998592

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
           // exception
           // UncheckedExecutionException - thrown by codes that cannot throw a checked
           // ExecutionException, such as BiFunction.apply
-          case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+          // SparkException - thrown if the interrupt happens in the middle of an RPC wait
+          case e2 @ (_: UncheckedIOException |
+              _: ExecutionException |
+              _: UncheckedExecutionException |
+              _: SparkException)
--- End diff --

    SGTM. I'll close this then.
Github user jose-torres closed the pull request at:

    https://github.com/apache/spark/pull/20602
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20602#discussion_r167995867

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
           // exception
           // UncheckedExecutionException - thrown by codes that cannot throw a checked
           // ExecutionException, such as BiFunction.apply
-          case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+          // SparkException - thrown if the interrupt happens in the middle of an RPC wait
+          case e2 @ (_: UncheckedIOException |
+              _: ExecutionException |
+              _: UncheckedExecutionException |
+              _: SparkException)
--- End diff --

    Not really. But as you said, relying on a list of exceptions to be thrown is a bit fragile. Can't we rethink this and make a longer-term change, rather than a quick (and, it seems to me, somewhat unsafe) fix? Currently this only causes test flakiness, so I don't think there is a great need to fix it immediately.
Github user jose-torres commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20602#discussion_r167994087

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
           // exception
           // UncheckedExecutionException - thrown by codes that cannot throw a checked
           // ExecutionException, such as BiFunction.apply
-          case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+          // SparkException - thrown if the interrupt happens in the middle of an RPC wait
+          case e2 @ (_: UncheckedIOException |
+              _: ExecutionException |
+              _: UncheckedExecutionException |
+              _: SparkException)
--- End diff --

    I agree with the worry, but I'm not sure I see a better solution. The other alternatives I can think of are matching against the specific exception message string, or changing ThreadUtils.awaitResult() to throw a custom exception. Do you have any thoughts?
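The "custom exception" alternative mentioned above can be sketched as follows. The code is Java rather than the Scala of the diff, and the names `AwaitSketch`, `RpcInterruptedException`, and `awaitOrRethrow` are hypothetical illustrations, not Spark APIs: an awaitResult-style helper translates an interrupt into a dedicated exception type, which callers can then match unambiguously instead of catching a generic SparkException.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public final class AwaitSketch {
    // Hypothetical dedicated type: catching this is unambiguous, unlike
    // catching a generic SparkException and hoping it came from an interrupt.
    public static final class RpcInterruptedException extends RuntimeException {
        public RpcInterruptedException(InterruptedException cause) { super(cause); }
    }

    // Wait on a future; translate an interrupt into the dedicated type.
    public static <T> T awaitOrRethrow(Future<T> f) {
        try {
            return f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new RpcInterruptedException(e);
        } catch (ExecutionException e) {
            // Unwrap the task's own failure, as awaitResult-style helpers do.
            throw new RuntimeException(e.getCause());
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitOrRethrow(CompletableFuture.completedFuture(42))); // prints 42
    }
}
```

With such a type, the `case e2 @ (...)` whitelist in StreamExecution could match the dedicated exception directly, avoiding the fragility of whitelisting SparkException as a whole.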
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20602#discussion_r167993164

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -369,7 +370,11 @@ abstract class StreamExecution(
           // exception
           // UncheckedExecutionException - thrown by codes that cannot throw a checked
           // ExecutionException, such as BiFunction.apply
-          case e2 @ (_: UncheckedIOException | _: ExecutionException | _: UncheckedExecutionException)
+          // SparkException - thrown if the interrupt happens in the middle of an RPC wait
+          case e2 @ (_: UncheckedIOException |
+              _: ExecutionException |
+              _: UncheckedExecutionException |
+              _: SparkException)
--- End diff --

    I am not sure especially about this one. SparkException can be thrown in a great number of cases which aren't necessarily caused by the stop operation. I don't think it is a good idea to add it.
GitHub user jose-torres opened a pull request:

    https://github.com/apache/spark/pull/20602

    [SPARK-23416][SS] handle streaming interrupts in ThreadUtils.awaitResult()

## What changes were proposed in this pull request?

StreamExecution.isInterruptedByStop() implements a whitelist of exceptions which indicate a benign stop() call. The DataSourceV2 write path introduces a new kind of exception, a SparkException caused by an InterruptedException, which must be added to this whitelist. (This exception comes from an interrupt in ThreadUtils.awaitResult().)

## How was this patch tested?

Existing unit tests. Unfortunately, the underlying flakiness is reasonably rare, so I don't have a good idea of how to test that this change resolves it.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jose-torres/spark SPARK-23416

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20602.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20602

commit 24a621e1c30ca02436a6725c5646d216bf2d7118
Author: Jose Torres
Date: 2018-02-13T19:50:01Z

    add accommodation for ThreadUtils.awaitResult()
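The whitelist check the description refers to comes down to inspecting an exception's cause chain to decide whether a failure is a benign result of stop() interrupting the stream thread. Here is a minimal, language-neutral sketch in Java; `BenignStopCheck` and `isInterruptCaused` are hypothetical names for illustration, not Spark's actual StreamExecution code.

```java
public final class BenignStopCheck {
    // Walk the cause chain: a SparkException thrown from an interrupted
    // RPC wait ultimately wraps an InterruptedException, which is what
    // makes it a candidate for whitelisting when the query was asked to stop.
    public static boolean isInterruptCaused(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof InterruptedException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for "SparkException caused by InterruptedException".
        RuntimeException fromRpcWait =
            new RuntimeException("Exception thrown in awaitResult",
                new InterruptedException());
        System.out.println(isInterruptCaused(fromRpcWait));            // prints true
        System.out.println(isInterruptCaused(new RuntimeException())); // prints false
    }
}
```

As the review discussion notes, matching on the outer exception type alone is fragile precisely because a SparkException does not always wrap an interrupt; checking the cause chain narrows the match to the interrupted-RPC-wait case.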