Hi Experts, I am looking for information on how to achieve zero data loss when working with Kafka and Spark. I have searched online, and the blogs give different answers. Please let me know if anyone has any ideas on this.
Blog 1: https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html
Blog 2: http://aseigneurin.github.io/2016/05/07/spark-kafka-achieving-zero-data-loss.html

Blog 1 simply describes a configuration change with a checkpoint directory, while Blog 2 gives details on how to save offsets to ZooKeeper. Can you please help me out with the right approach?

Thanks,
Asmath