Kafka failover with multiple data centers

nguyen duc Tuan Sun, 05 Mar 2017 18:52:23 -0800

Hi everyone,
We are deploying a Kafka cluster to ingest streaming data. Sometimes,
some of the nodes in the cluster run into trouble (a node dies, the Kafka
daemon is killed, ...). Recovering data in Kafka can be very slow: it takes
several hours to recover from a disaster. I saw a slide deck here suggesting
using multiple data centers (
https://www.slideshare.net/HadoopSummit/building-largescale-stream-infrastructures-across-multiple-data-centers-with-apache-kafka).
But I wonder, how can we detect the problem and switch between data centers
in Spark Streaming? Since Kafka 0.10.1 supports a timestamp index, how can
we seek to the right offsets?
Are there any open-source libraries out there that handle this
problem on the fly?
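
To make the timestamp question concrete, here is a minimal sketch of the
lookup being asked about. It is not a Kafka client: the in-memory index, the
function names and the safety margin are all hypothetical stand-ins. In the
real Java consumer (0.10.1 and later) the equivalent lookup is
KafkaConsumer.offsetsForTimes, which returns, per partition, the earliest
offset whose timestamp is greater than or equal to the requested one. The
sketch assumes offsets differ between the two clusters, so the last processed
timestamp (not the last offset) is what gets carried over on failover.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for one partition's time index in the standby
# cluster: (timestamp_ms, offset) pairs sorted by timestamp.
TimeIndex = List[Tuple[int, int]]


def offset_for_time(index: TimeIndex, target_ms: int) -> Optional[int]:
    """Return the earliest offset whose timestamp is >= target_ms.

    Returns None when every record in the index is older than target_ms.
    """
    timestamps = [ts for ts, _ in index]
    pos = bisect_left(timestamps, target_ms)
    return index[pos][1] if pos < len(index) else None


def failover_offsets(
    indexes: Dict[int, TimeIndex],
    last_processed_ms: int,
    safety_margin_ms: int = 60_000,  # assumed value, tune to mirroring lag
) -> Dict[int, Optional[int]]:
    """Choose a starting offset per partition in the standby cluster.

    The target time is rewound by safety_margin_ms so that records still in
    flight between data centers are not skipped.
    """
    target_ms = last_processed_ms - safety_margin_ms
    return {p: offset_for_time(idx, target_ms) for p, idx in indexes.items()}
```

Rewinding by a margin means some records are read twice after the switch, so
the downstream processing would need to tolerate duplicates.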
Thanks.
