Just to be clear, Spark checkpoints have nothing to do with ZooKeeper. They're stored in the filesystem you specify as the checkpoint directory.
On Sun, Dec 6, 2015 at 1:25 AM, manasdebashiskar <poorinsp...@gmail.com> wrote:

> When you enable checkpointing, your offsets get written in zookeeper. If
> your program dies or shuts down and is later restarted, the
> kafkadirectstream API knows where to start by looking at those offsets
> from zookeeper.
>
> This is as easy as it gets.
> However, if you are planning to re-use the same checkpoint folder among
> different Spark versions, that is currently not supported.
> In that case you might want to go for writing the offset and topic in your
> favorite database. Assuming that DB is highly available, you can later
> retrieve the previously processed offset and start from there.
>
> Take a look at the blog post by Cody (the guy who wrote kafkadirectstream):
> https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/streaming-KafkaUtils-createDirectStream-how-to-start-streming-from-checkpoints-tp25461p25597.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
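
For the "write the offset and topic in your favorite database" option mentioned above, here is a minimal sketch of the idea. It is not code from the blog post: it uses SQLite from the Python standard library as a stand-in for whatever database you actually run, and the table names and the helpers `open_store`, `load_offsets`, and `save_batch` are made up for illustration. The point it tries to show is that the results and the new offset are written in one transaction, so a batch that gets replayed after a restart is detected and skipped.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the (hypothetical) offset store and create its tables if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS offsets ("
        " topic TEXT NOT NULL,"
        " part INTEGER NOT NULL,"
        " until_offset INTEGER NOT NULL,"
        " PRIMARY KEY (topic, part))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " topic TEXT, part INTEGER, record_offset INTEGER, value TEXT)"
    )
    conn.commit()
    return conn


def load_offsets(conn):
    """Return {(topic, partition): next offset to read} for a restart."""
    rows = conn.execute("SELECT topic, part, until_offset FROM offsets")
    return {(topic, part): until for topic, part, until in rows}


def save_batch(conn, topic, partition, from_offset, values):
    """Store one partition's batch and its new offset atomically.

    Returns False, writing nothing, when the stored offset shows this
    batch does not start where the last one ended (for example a replay).
    """
    with conn:  # commits on success, rolls back if an exception escapes
        row = conn.execute(
            "SELECT until_offset FROM offsets WHERE topic = ? AND part = ?",
            (topic, partition),
        ).fetchone()
        # Assumption: a partition with no stored row accepts any start offset.
        if row is not None and row[0] != from_offset:
            return False
        conn.executemany(
            "INSERT INTO results VALUES (?, ?, ?, ?)",
            [(topic, partition, from_offset + i, v) for i, v in enumerate(values)],
        )
        conn.execute(
            "INSERT OR REPLACE INTO offsets (topic, part, until_offset)"
            " VALUES (?, ?, ?)",
            (topic, partition, from_offset + len(values)),
        )
    return True
```

On restart you would read the map back with `load_offsets` and hand it to the direct stream as its starting offsets (the `fromOffsets` argument of `createDirectStream`), instead of relying on the checkpoint directory.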