[GitHub] spark pull request #19407: [SPARK-21667][Streaming] ConsoleSink should not f...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/19407#discussion_r151838606 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala --- @@ -267,11 +267,12 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) { useTempCheckpointLocation = true, trigger = trigger) } else { - val (useTempCheckpointLocation, recoverFromCheckpointLocation) = + val recoverFromCheckpointLocation = true + val useTempCheckpointLocation = if (source == "console") { - (true, true) + true } else { - (false, true) + false --- End diff -- Do we really need it anymore since the `if` expression is just `source == "console"`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19407: [SPARK-21667][Streaming] ConsoleSink should not f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19407 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19407: [SPARK-21667][Streaming] ConsoleSink should not f...
Github user rekhajoshm commented on a diff in the pull request: https://github.com/apache/spark/pull/19407#discussion_r150330668 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala --- @@ -267,11 +267,12 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) { useTempCheckpointLocation = true, trigger = trigger) } else { - val (useTempCheckpointLocation, recoverFromCheckpointLocation) = + val recoverFromCheckpointLocation = true + val useTempCheckpointLocation = --- End diff -- done. thanks @zsxwing --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19407: [SPARK-21667][Streaming] ConsoleSink should not f...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/19407#discussion_r150314230 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala --- @@ -267,11 +267,12 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) { useTempCheckpointLocation = true, trigger = trigger) } else { - val (useTempCheckpointLocation, recoverFromCheckpointLocation) = + val recoverFromCheckpointLocation = true + val useTempCheckpointLocation = --- End diff -- nit: `val useTempCheckpointLocation = source == "console"` you can just also update the below statement to ``` df.sparkSession.sessionState.streamingQueryManager.startQuery( extraOptions.get("queryName"), extraOptions.get("checkpointLocation"), df, dataSource.createSink(outputMode), outputMode, useTempCheckpointLocation = source == "console", recoverFromCheckpointLocation = true, trigger = trigger) ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19407: [SPARK-21667][Streaming] ConsoleSink should not f...
Github user rekhajoshm commented on a diff in the pull request: https://github.com/apache/spark/pull/19407#discussion_r142023140 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala --- @@ -269,7 +269,7 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) { } else { val (useTempCheckpointLocation, recoverFromCheckpointLocation) = if (source == "console") { - (true, false) + (true, true) --- End diff -- Good point @jaceklaskowski updated.thanks --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19407: [SPARK-21667][Streaming] ConsoleSink should not f...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/19407#discussion_r142022819 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala --- @@ -269,7 +269,7 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) { } else { val (useTempCheckpointLocation, recoverFromCheckpointLocation) = if (source == "console") { - (true, false) + (true, true) --- End diff -- Is there any source that uses `recoverFromCheckpointLocation` disabled? What's the use case if any? Remove `recoverFromCheckpointLocation` here as it's always `true` and make it explicit. The JIRA issue is to fix the exception followed by cleaning the code that was needed in the past. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19407: [SPARK-21667][Streaming] ConsoleSink should not f...
GitHub user rekhajoshm opened a pull request: https://github.com/apache/spark/pull/19407 [SPARK-21667][Streaming] ConsoleSink should not fail streaming query with checkpointLocation option ## What changes were proposed in this pull request? Fix to allow recovery on console , avoid checkpoint exception ## How was this patch tested? existing tests manual tests [ Replicating error and seeing no checkpoint error after fix] You can merge this pull request into a Git repository by running: $ git pull https://github.com/rekhajoshm/spark SPARK-21667 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19407.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19407 commit e3677c9fa9697e0d34f9df52442085a6a481c9e9 Author: Rekha Joshi Date: 2015-05-05T23:10:08Z Merge pull request #1 from apache/master Pulling functionality from apache spark commit 106fd8eee8f6a6f7c67cfc64f57c1161f76d8f75 Author: Rekha Joshi Date: 2015-05-08T21:49:09Z Merge pull request #2 from apache/master pull latest from apache spark commit 0be142d6becba7c09c6eba0b8ea1efe83d649e8c Author: Rekha Joshi Date: 2015-06-22T00:08:08Z Merge pull request #3 from apache/master Pulling functionality from apache spark commit 6c6ee12fd733e3f9902e10faf92ccb78211245e3 Author: Rekha Joshi Date: 2015-09-17T01:03:09Z Merge pull request #4 from apache/master Pulling functionality from apache spark commit b123c601e459d1ad17511fd91dd304032154882a Author: Rekha Joshi Date: 2015-11-25T18:50:32Z Merge pull request #5 from apache/master pull request from apache/master commit c73c32aadd6066e631956923725a48d98a18777e Author: Rekha Joshi Date: 2016-03-18T19:13:51Z Merge pull request #6 from apache/master pull latest from apache spark commit 7dbf7320057978526635bed09dabc8cf8657a28a Author: Rekha Joshi Date: 2016-04-05T20:26:40Z Merge pull request #8 from apache/master pull latest from apache spark commit 5e9d71827f8e2e4d07027281b80e4e073e7fecd1 Author: Rekha Joshi Date: 2017-05-01T23:00:30Z Merge pull request #9 from apache/master Pull apache spark commit 63d99b3ce5f222d7126133170a373591f0ac67dd Author: Rekha Joshi Date: 2017-09-30T22:26:44Z Merge pull request #10 from apache/master pull latest apache spark commit 57e0e26474b66afd3bd54be061a5982836e28792 Author: rjoshi2 Date: 2017-10-01T06:57:12Z [SPARK-21667][Streaming] ConsoleSink should not fail streaming query with checkpointLocation option --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org