Re: [SS] Console sink not supporting recovering from checkpoint location? Why?
Hi Michael, That reflects my sentiments so well. Thanks for having confirmed my thoughts! https://issues.apache.org/jira/browse/SPARK-21667 Pozdrawiam, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski On Tue, Aug 8, 2017 at 12:37 AM, Michael Armbrust wrote: > I think there is really no good reason for this limitation. > > On Mon, Aug 7, 2017 at 2:58 AM, Jacek Laskowski wrote: >> >> Hi, >> >> While exploring checkpointing with kafka source and console sink I've >> got the exception: >> >> // today's build from the master >> scala> spark.version >> res8: String = 2.3.0-SNAPSHOT >> >> scala> val q = records. >> | writeStream. >> | format("console"). >> | option("truncate", false). >> | option("checkpointLocation", "/tmp/checkpoint"). // <-- >> checkpoint directory >> | trigger(Trigger.ProcessingTime(10.seconds)). >> | outputMode(OutputMode.Update). >> | start >> org.apache.spark.sql.AnalysisException: This query does not support >> recovering from checkpoint location. Delete /tmp/checkpoint/offsets to >> start over.; >> at >> org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:222) >> at >> org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:278) >> at >> org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:284) >> ... 61 elided >> >> The "trigger" is the change >> https://issues.apache.org/jira/browse/SPARK-16116 and this line in >> particular >> https://github.com/apache/spark/pull/13817/files#diff-d35e8fce09686073f81de598ed657de7R277. >> >> Why is this needed? I can't think of a use case where console sink >> could not recover from checkpoint location (since all the information >> is available). I'm lost on it and would appreciate some help (to >> recover :)) >> >> Pozdrawiam, >> Jacek Laskowski >> >> https://medium.com/@jaceklaskowski/ >> Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark >> Follow me at https://twitter.com/jaceklaskowski >> >> - >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org >> > - To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Re: [SS] Console sink not supporting recovering from checkpoint location? Why?
I think there is really no good reason for this limitation. On Mon, Aug 7, 2017 at 2:58 AM, Jacek Laskowski wrote: > Hi, > > While exploring checkpointing with kafka source and console sink I've > got the exception: > > // today's build from the master > scala> spark.version > res8: String = 2.3.0-SNAPSHOT > > scala> val q = records. > | writeStream. > | format("console"). > | option("truncate", false). > | option("checkpointLocation", "/tmp/checkpoint"). // <-- > checkpoint directory > | trigger(Trigger.ProcessingTime(10.seconds)). > | outputMode(OutputMode.Update). > | start > org.apache.spark.sql.AnalysisException: This query does not support > recovering from checkpoint location. Delete /tmp/checkpoint/offsets to > start over.; > at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery( > StreamingQueryManager.scala:222) > at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery( > StreamingQueryManager.scala:278) > at org.apache.spark.sql.streaming.DataStreamWriter. > start(DataStreamWriter.scala:284) > ... 61 elided > > The "trigger" is the change > https://issues.apache.org/jira/browse/SPARK-16116 and this line in > particular https://github.com/apache/spark/pull/13817/files#diff- > d35e8fce09686073f81de598ed657de7R277. > > Why is this needed? I can't think of a use case where console sink > could not recover from checkpoint location (since all the information > is available). I'm lost on it and would appreciate some help (to > recover :)) > > Pozdrawiam, > Jacek Laskowski > > https://medium.com/@jaceklaskowski/ > Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark > Follow me at https://twitter.com/jaceklaskowski > > - > To unsubscribe e-mail: user-unsubscr...@spark.apache.org > >
[SS] Console sink not supporting recovering from checkpoint location? Why?
Hi, While exploring checkpointing with kafka source and console sink I've got the exception: // today's build from the master scala> spark.version res8: String = 2.3.0-SNAPSHOT scala> val q = records. | writeStream. | format("console"). | option("truncate", false). | option("checkpointLocation", "/tmp/checkpoint"). // <-- checkpoint directory | trigger(Trigger.ProcessingTime(10.seconds)). | outputMode(OutputMode.Update). | start org.apache.spark.sql.AnalysisException: This query does not support recovering from checkpoint location. Delete /tmp/checkpoint/offsets to start over.; at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:222) at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:278) at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:284) ... 61 elided The "trigger" is the change https://issues.apache.org/jira/browse/SPARK-16116 and this line in particular https://github.com/apache/spark/pull/13817/files#diff-d35e8fce09686073f81de598ed657de7R277. Why is this needed? I can't think of a use case where console sink could not recover from checkpoint location (since all the information is available). I'm lost on it and would appreciate some help (to recover :)) Pozdrawiam, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski - To unsubscribe e-mail: user-unsubscr...@spark.apache.org