Thanks for sharing this, Vlad; this is a great read!

I am particularly interested in the last bullet point about one-to-one
mapping in MM, since you also mentioned that you use Kafka MM as the async
replication layer for your geo-replicated k-v store. One approach that we
are pursuing here to support active-active is to use an aggregate cluster
that mirrors from multiple local clusters in different data centers. But
this approach breaks the one-to-one mapping, since it requires multiple
sources to pipe into a single destination. How did you tackle this problem
at Turn if you are also using active-active?

Guozhang

On Tue, Mar 24, 2015 at 12:18 PM, vlad...@gmail.com <vlad...@gmail.com>
wrote:

> Dear all,
>
> I had a short discussion with Jay yesterday at the ACM meetup and he
> suggested writing an email regarding a few possible MirrorMaker
> improvements.
>
> At Turn, we have been using MirrorMaker for a few months now to
> asynchronously replicate our key/value store data between our datacenters.
> In a way, our system is similar to LinkedIn's Databus, but it uses Kafka
> clusters and MirrorMaker as its building blocks. Our overall message rate
> peaks at about 650K messages/sec and, when pushing data over high
> bandwidth-delay-product links, we have found some minor bottlenecks.
>
> The MirrorMaker process uses a standard consumer to pull data from a remote
> datacenter. This implies that it opens a single TCP connection to each of
> the remote brokers and muxes requests for different topics and partitions
> over this connection. While this is a good thing in terms of keeping the
> congestion window open, over long-RTT links with a rather high loss rate
> the congestion window will cap, in our case at just a few Mbps. Since the
> overall line bandwidth is much higher, this means that we have to start
> multiple MirrorMaker processes (somewhere in the hundreds) in order to
> fully use the line capacity. Being able to pool multiple TCP connections
> from a single consumer to a broker would remove this bottleneck.
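>
> (For intuition, this cap is what the classic TCP throughput bound of
> Mathis et al. predicts; the link numbers below are illustrative, not
> measured on our lines:
>
>   throughput <= (MSS / RTT) * (C / sqrt(p)),  with C ~ 1.22
>
>   e.g. MSS = 1460 bytes, RTT = 100 ms, loss p = 1%:
>   (1460 * 8 bit / 0.1 s) * (1.22 / sqrt(0.01)) ~ 117 kbit/s * 12.2
>   ~ 1.4 Mbps per connection
>
> which is why it takes hundreds of parallel connections to fill a
> multi-Gbps line.)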
>
> The standard consumer also uses the remote ZooKeeper in order to manage the
> consumer group. While consumer group management is moving closer to the
> brokers, it might make sense to move the group management to the local
> datacenter, since that would avoid using the long-distance connection for
> this purpose.
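>
> (To make the coupling concrete, here is a minimal sketch of how a
> high-level consumer inside MirrorMaker is wired up today; the host names
> and group id are made up. Group membership, rebalancing, and offset
> bookkeeping all follow zookeeper.connect, which currently has to point at
> the remote datacenter:
>
>   import java.util.Properties;
>   import kafka.consumer.Consumer;
>   import kafka.consumer.ConsumerConfig;
>   import kafka.javaapi.consumer.ConsumerConnector;
>
>   public class RemoteGroupConsumer {
>     public static void main(String[] args) {
>       Properties props = new Properties();
>       // all group management traffic crosses the WAN via this address
>       props.put("zookeeper.connect", "zk.remote-dc.example.com:2181");
>       props.put("group.id", "mirrormaker-group");
>       ConsumerConnector consumer =
>           Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
>       // ... create message streams and hand them to producer threads ...
>     }
>   }
>
> With local group management, only the fetches themselves would need to
> cross the long-distance link.)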
>
> Another possible improvement assumes a further constraint, namely that
> the number of partitions for a topic in both datacenters is the same. In
> my opinion, this is a sane constraint, since it preserves the Kafka
> ordering guarantee per partition, instead of a simpler guarantee per
> key. This kind of guarantee can be useful, for example, in a system that
> compares partition contents to reach eventual consistency using Merkle
> trees. If the number of partitions is equal, then offsets have the same
> meaning for the same partition in both clusters, since the data for both
> partitions is identical before the offset. This allows a simple consumer
> to just query the local broker and the remote broker for their current
> offsets and, in case the remote broker is ahead, copy the extra data to
> the local cluster. Since the consumer offsets are no longer bound to the
> specific partitioning of a single remote cluster, the consumer could
> pull from any number of remote clusters, BitTorrent-style, whenever
> their offsets are ahead of the local offset. The group management
> problem would then reduce to assigning local partitions to different
> MirrorMaker processes, so group management could also be done locally in
> this situation.
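>
> (Below is a rough sketch of such an offset-delta copier for a single
> partition, written against the newer Java client API for brevity; the
> topic name and broker addresses are made up, and error handling is
> elided. The key point is that, because the partition counts match, the
> local end offset is also a valid resume position in the remote partition:
>
>   import java.time.Duration;
>   import java.util.Collections;
>   import java.util.Properties;
>   import org.apache.kafka.clients.consumer.ConsumerRecord;
>   import org.apache.kafka.clients.consumer.KafkaConsumer;
>   import org.apache.kafka.clients.producer.KafkaProducer;
>   import org.apache.kafka.clients.producer.ProducerRecord;
>   import org.apache.kafka.common.TopicPartition;
>
>   public class OffsetDeltaCopier {
>
>     static Properties consumerCfg(String servers) {
>       Properties p = new Properties();
>       p.put("bootstrap.servers", servers);
>       p.put("key.deserializer",
>           "org.apache.kafka.common.serialization.ByteArrayDeserializer");
>       p.put("value.deserializer",
>           "org.apache.kafka.common.serialization.ByteArrayDeserializer");
>       return p;
>     }
>
>     public static void main(String[] args) {
>       TopicPartition tp = new TopicPartition("kv-updates", 0);
>
>       Properties producerCfg = new Properties();
>       producerCfg.put("bootstrap.servers", "local-broker:9092");
>       producerCfg.put("key.serializer",
>           "org.apache.kafka.common.serialization.ByteArraySerializer");
>       producerCfg.put("value.serializer",
>           "org.apache.kafka.common.serialization.ByteArraySerializer");
>       // one request in flight, so retries cannot reorder writes and
>       // break the offset-for-offset alignment between the clusters
>       producerCfg.put("max.in.flight.requests.per.connection", "1");
>
>       try (KafkaConsumer<byte[], byte[]> local =
>                new KafkaConsumer<>(consumerCfg("local-broker:9092"));
>            KafkaConsumer<byte[], byte[]> remote =
>                new KafkaConsumer<>(consumerCfg("remote-broker:9092"));
>            KafkaProducer<byte[], byte[]> producer =
>                new KafkaProducer<>(producerCfg)) {
>
>         long localEnd = local.endOffsets(Collections.singleton(tp)).get(tp);
>         long remoteEnd = remote.endOffsets(Collections.singleton(tp)).get(tp);
>         if (remoteEnd <= localEnd) return; // local already caught up
>
>         // everything below localEnd is identical in both clusters by
>         // assumption, so resume the remote read exactly there
>         remote.assign(Collections.singleton(tp));
>         remote.seek(tp, localEnd);
>
>         long next = localEnd;
>         while (next < remoteEnd) {
>           for (ConsumerRecord<byte[], byte[]> r :
>                    remote.poll(Duration.ofSeconds(1))) {
>             if (r.offset() >= remoteEnd) break;
>             // write to the same partition to keep offsets aligned
>             producer.send(new ProducerRecord<>(r.topic(), r.partition(),
>                                                r.key(), r.value()));
>             next = r.offset() + 1;
>           }
>         }
>       }
>     }
>   }
>
> Running this loop per local partition is all the mirroring logic that
> remains; which process owns which local partition then becomes a purely
> local assignment problem.)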
>
> Regards,
> Vlad
>
> PS: Sorry if this is a double posting! The original posting did not appear
> in the archives for a while.
>



-- 
-- Guozhang
