Re: [SS] Console sink not supporting recovering from checkpoint location? Why?

2017-08-08 Thread Jacek Laskowski
Hi Michael,

That reflects my sentiments so well. Thanks for having confirmed my thoughts!

https://issues.apache.org/jira/browse/SPARK-21667

Pozdrawiam,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Tue, Aug 8, 2017 at 12:37 AM, Michael Armbrust
 wrote:
> I think there is really no good reason for this limitation.
>
> On Mon, Aug 7, 2017 at 2:58 AM, Jacek Laskowski  wrote:
>>
>> Hi,
>>
>> While exploring checkpointing with kafka source and console sink I've
>> got the exception:
>>
>> // today's build from the master
>> scala> spark.version
>> res8: String = 2.3.0-SNAPSHOT
>>
>> scala> val q = records.
>>  |   writeStream.
>>  |   format("console").
>>  |   option("truncate", false).
>>  |   option("checkpointLocation", "/tmp/checkpoint"). // <--
>> checkpoint directory
>>  |   trigger(Trigger.ProcessingTime(10.seconds)).
>>  |   outputMode(OutputMode.Update).
>>  |   start
>> org.apache.spark.sql.AnalysisException: This query does not support
>> recovering from checkpoint location. Delete /tmp/checkpoint/offsets to
>> start over.;
>>   at
>> org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:222)
>>   at
>> org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:278)
>>   at
>> org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:284)
>>   ... 61 elided
>>
>> The "trigger" is the change
>> https://issues.apache.org/jira/browse/SPARK-16116 and this line in
>> particular
>> https://github.com/apache/spark/pull/13817/files#diff-d35e8fce09686073f81de598ed657de7R277.
>>
>> Why is this needed? I can't think of a use case where console sink
>> could not recover from checkpoint location (since all the information
>> is available). I'm lost on it and would appreciate some help (to
>> recover :))
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: [SS] Console sink not supporting recovering from checkpoint location? Why?

2017-08-07 Thread Michael Armbrust
I think there is really no good reason for this limitation.

On Mon, Aug 7, 2017 at 2:58 AM, Jacek Laskowski  wrote:

> Hi,
>
> While exploring checkpointing with kafka source and console sink I've
> got the exception:
>
> // today's build from the master
> scala> spark.version
> res8: String = 2.3.0-SNAPSHOT
>
> scala> val q = records.
>  |   writeStream.
>  |   format("console").
>  |   option("truncate", false).
>  |   option("checkpointLocation", "/tmp/checkpoint"). // <--
> checkpoint directory
>  |   trigger(Trigger.ProcessingTime(10.seconds)).
>  |   outputMode(OutputMode.Update).
>  |   start
> org.apache.spark.sql.AnalysisException: This query does not support
> recovering from checkpoint location. Delete /tmp/checkpoint/offsets to
> start over.;
>   at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(
> StreamingQueryManager.scala:222)
>   at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(
> StreamingQueryManager.scala:278)
>   at org.apache.spark.sql.streaming.DataStreamWriter.
> start(DataStreamWriter.scala:284)
>   ... 61 elided
>
> The "trigger" is the change
> https://issues.apache.org/jira/browse/SPARK-16116 and this line in
> particular https://github.com/apache/spark/pull/13817/files#diff-
> d35e8fce09686073f81de598ed657de7R277.
>
> Why is this needed? I can't think of a use case where console sink
> could not recover from checkpoint location (since all the information
> is available). I'm lost on it and would appreciate some help (to
> recover :))
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


[SS] Console sink not supporting recovering from checkpoint location? Why?

2017-08-07 Thread Jacek Laskowski
Hi,

While exploring checkpointing with kafka source and console sink I've
got the exception:

// today's build from the master
scala> spark.version
res8: String = 2.3.0-SNAPSHOT

scala> val q = records.
 |   writeStream.
 |   format("console").
 |   option("truncate", false).
 |   option("checkpointLocation", "/tmp/checkpoint"). // <--
checkpoint directory
 |   trigger(Trigger.ProcessingTime(10.seconds)).
 |   outputMode(OutputMode.Update).
 |   start
org.apache.spark.sql.AnalysisException: This query does not support
recovering from checkpoint location. Delete /tmp/checkpoint/offsets to
start over.;
  at 
org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:222)
  at 
org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:278)
  at 
org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:284)
  ... 61 elided

The "trigger" is the change
https://issues.apache.org/jira/browse/SPARK-16116 and this line in
particular 
https://github.com/apache/spark/pull/13817/files#diff-d35e8fce09686073f81de598ed657de7R277.

Why is this needed? I can't think of a use case where console sink
could not recover from checkpoint location (since all the information
is available). I'm lost on it and would appreciate some help (to
recover :))

Pozdrawiam,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org