I think the confusion comes from the fact that we are using mirroring to handle geographic distribution, not failover. If I understand correctly, what Oliver is asking for is something that gives fault tolerance, not something for distribution. I don't think that is really what mirroring does out of the box, though technically I suppose you could just reset the offsets, point the consumer at the new cluster, and have it start from "now".
I think it would be helpful to document our use case in the mirroring docs, since this is not the first time someone has asked about this.

-Jay

On Mon, Apr 23, 2012 at 10:38 AM, Joel Koshy <jjkosh...@gmail.com> wrote:

> Hi Oliver,
>
> > I was reading the mirroring guide and I wonder if it is required that the
> > mirror runs its own zookeeper?
> >
> > We have a zookeeper cluster running which is used by different
> > applications, so can we use that zookeeper cluster for the kafka source
> > and kafka mirror?
>
> You could have a single zookeeper cluster and use different namespaces for
> the source/target mirror. However, I don't think it is recommended to use a
> remote zookeeper (if you have a cross-DC setup) since that would
> potentially mean very high ZK latencies on one of your clusters.
>
> > What is the procedure if the kafka source server fails to switch the
> > applications to use the mirrored instance?
>
> I don't quite follow this question - can you clarify? The mirror cluster is
> pretty much a separate instance. There is no built-in automatic fail-over
> if your source cluster goes down.
>
> > Are there any backup best practices if we would not use mirroring?
>
> You can use RAID arrays for (local) data redundancy. You may also be
> interested in the (intra-DC) replication feature (KAFKA-50) that is
> currently being developed. I believe some folks on this list have also used
> plain rsync as an alternative to mirroring.
>
> Thanks,
>
> Joel