Re: Zero Data Loss in Spark with Kafka

Sudhir Babu Pothineni Tue, 23 Aug 2016 08:45:40 -0700

Saving offsets to ZooKeeper is the older approach. With checkpointing
enabled, Spark Streaming saves the offsets internally to the checkpoint
location (HDFS or whichever directory you configure).
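
To make the idea concrete, here is a minimal sketch in plain Python of what
"saving offsets to the checkpoint location" buys you. It is not Spark's
implementation or API: the function names, the JSON file layout, and the
in-memory `log` dict standing in for Kafka partitions are all assumptions
made for illustration. In a real job you would simply call
`StreamingContext.checkpoint(directory)` and let Spark do the bookkeeping.

```python
import json
import os
import tempfile


def load_offsets(checkpoint_path):
    """Return the last committed offsets, or {} on a first start."""
    if not os.path.exists(checkpoint_path):
        return {}
    with open(checkpoint_path) as f:
        return json.load(f)


def commit_offsets(checkpoint_path, offsets):
    """Write offsets atomically: temp file first, then rename over the old one."""
    directory = os.path.dirname(checkpoint_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(offsets, f)
    os.replace(tmp_path, checkpoint_path)


def run_batch(log, checkpoint_path, batch_size, process):
    """Process one batch per partition, then record how far we got.

    `log` is a hypothetical stand-in for Kafka: {partition_name: [records]}.
    """
    offsets = load_offsets(checkpoint_path)
    for partition, records in log.items():
        start = offsets.get(partition, 0)
        batch = records[start:start + batch_size]
        for record in batch:
            process(record)
        offsets[partition] = start + len(batch)
    # Offsets are committed only after the whole batch has been processed.
    commit_offsets(checkpoint_path, offsets)
    return offsets
```

Because the offsets are written only after the batch is processed, a crash
in between means the batch is replayed on restart rather than skipped. That
is at-least-once delivery, so the output operation should be idempotent if
duplicates matter.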


More details here:
http://spark.apache.org/docs/latest/streaming-kafka-integration.html

On Tue, Aug 23, 2016 at 10:30 AM, KhajaAsmath Mohammed <
mdkhajaasm...@gmail.com> wrote:

> Hi Experts,
>
> I am looking for information on how to achieve zero data loss when
> working with Kafka and Spark. I have searched online, and the blogs give
> different answers. Please let me know if anyone has ideas on this.
>
> Blog 1:
> https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
>
>
> Blog 2:
> http://aseigneurin.github.io/2016/05/07/spark-kafka-achieving-zero-data-loss.html
>
>
> Blog 1 simply describes a configuration change with a checkpoint
> directory, and Blog 2 gives details on how to save offsets to ZooKeeper.
> Can you please help me out with the right approach?
>
> Thanks,
> Asmath
