Re: MM2 for DR

benitocm Tue, 11 Feb 2020 22:05:27 -0800

Hi Ryanne,

Please could you elaborate a bit more about the active-active
recommendation?


Thanks in advance

On Mon, Feb 10, 2020 at 10:21 PM benitocm <benit...@gmail.com> wrote:

> Thanks very much for the response.
>
> Please could you elaborate a bit more about "I'd
> arc in that direction. Instead of migrating A->B->C->D..., active/active is
> more like having one big cluster".
>
> Another thing that I would like to share is that my consumers currently
> consume from only one topic, so introducing MM2 will impact them.
> Any suggestions in this regard would be greatly appreciated.
>
> Thanks in advance again!
>
>
> On Mon, Feb 10, 2020 at 9:40 PM Ryanne Dolan <ryannedo...@gmail.com>
> wrote:
>
>> Hello, sounds like you have this all figured out actually. A couple notes:
>>
>> > For now, we just need to handle DR requirements, i.e., we would not need
>> > active-active
>>
>> If your infrastructure is sufficiently advanced, active/active can be a lot
>> easier to manage than active/standby. If you are starting from scratch I'd
>> arc in that direction. Instead of migrating A->B->C->D..., active/active is
>> more like having one big cluster.
>>
>> > secondary.primary.topic1
>>
>> I'd recommend using regex subscriptions where possible, so that apps don't
>> need to worry about these potentially complex topic names.
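A regex subscription along these lines could be sketched as follows. This is a minimal illustration in Python of a pattern that matches "topic1" together with any replicated variant of it. It assumes the "." separator between cluster alias and topic name seen in the topic names in this thread; the pattern and the helper name `matching_topics` are hypothetical, not part of MM2.

```python
import re

# Hypothetical pattern: the local topic, optionally preceded by any number
# of "<cluster alias>." prefixes added by replication hops.
TOPIC1_PATTERN = re.compile(r"^(?:[^.]+\.)*topic1$")


def matching_topics(topics):
    """Return the topic names that a subscription with this pattern would cover."""
    return [t for t in topics if TOPIC1_PATTERN.match(t)]


if __name__ == "__main__":
    names = ["topic1", "primary.topic1", "secondary.primary.topic1",
             "topic10", "other.topic2"]
    for name in matching_topics(names):
        print(name)
```

The same pattern string could be handed to a consumer's pattern-based subscribe call, so the application does not need to list every prefixed topic name explicitly.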
>>
>> > An additional question. If the topic is compacted, i.e., the topic is
>> > kept forever, would switchover operations imply adding another prefix to
>> > the topic name?
>>
>> I think that's right. You could always clean things up manually, but
>> migrating between clusters a bunch of times would leave a trail of
>> replication hops.
>>
>> Also, you might look into implementing a custom ReplicationPolicy. For
>> example, you could squash "secondary.primary.topic1" into something shorter
>> if you like.
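To make the squashing idea concrete, here is a small sketch of the naming logic only. MM2's ReplicationPolicy is a Java interface, so this Python code is not a policy implementation. It contrasts the prefix-per-hop naming seen in this thread with one possible squashed scheme that keeps only the origin cluster's alias. The function names and the `known_aliases` parameter are hypothetical, and whether such a scheme interacts safely with MM2's handling of replicated topics is not checked here.

```python
SEPARATOR = "."  # assumed separator, as in the topic names in this thread


def default_remote_topic(source_alias, topic):
    """Prefix-per-hop naming: every replication hop adds one alias."""
    return f"{source_alias}{SEPARATOR}{topic}"


def squashed_remote_topic(source_alias, topic, known_aliases):
    """Hypothetical squashed naming: keep only the origin cluster's alias.

    A topic that already starts with a known cluster alias is left unchanged,
    so further hops do not lengthen the name.
    """
    first, sep, _rest = topic.partition(SEPARATOR)
    if sep and first in known_aliases:
        return topic
    return default_remote_topic(source_alias, topic)


if __name__ == "__main__":
    aliases = {"primary", "secondary", "primary-2"}
    print(default_remote_topic("secondary", "primary.topic1"))
    print(squashed_remote_topic("secondary", "primary.topic1", aliases))
```

With this scheme the name records where the data originated but not the route it took, which is the trade-off to weigh against the shorter names.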
>>
>> Ryanne
>>
>> On Mon, Feb 10, 2020 at 1:24 PM benitocm <benit...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > After having a look at the talk
>> >
>> > https://www.confluent.io/kafka-summit-lon19/disaster-recovery-with-mirrormaker-2-0
>> >
>> > and the KIP
>> >
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382
>> >
>> > I am trying to understand how I would use it in the setup that I have.
>> > For now, we just need to handle DR requirements, i.e., we would not need
>> > active-active.
>> >
>> > My requirements, more or less, are the following:
>> >
>> > 1) Currently, we have just one Kafka cluster, "primary", which all the
>> > producers produce to and all the consumers consume from.
>> > 2) In case "primary" crashes, we would need another Kafka cluster,
>> > "secondary", to which we would move all the producers and consumers and
>> > keep working.
>> > 3) Once "primary" is recovered, we would need to move back to it (as we
>> > were in #1).
>> >
>> > To fulfill #2, I have thought of setting up a new Kafka cluster,
>> > "secondary", and a replication procedure using MM2. However, it is not
>> > clear to me how to proceed.
>> >
>> > I will describe the high-level details so you can point out my
>> > misconceptions:
>> >
>> > A) Initial situation. As in the example in KIP-382, in the primary
>> > cluster we will have a local topic, "topic1", which the producers will
>> > produce to and the consumers will consume from. MM2 will create in the
>> > secondary the remote topic "primary.topic1", to which the local topic in
>> > the primary will be replicated. In addition, the consumer group
>> > information of the primary will also be replicated.
>> >
>> > B) The primary Kafka cluster is not available. Producers are moved to
>> > produce into a "topic1" that was manually created in the secondary. In
>> > addition, consumers need to connect to the secondary to consume from the
>> > local topic "topic1", where the producers are now producing, and from the
>> > remote topic "primary.topic1", where the producers were producing before,
>> > i.e., consumers will need to aggregate. This is because some consumers
>> > could have lag, so they will need to consume from both. In this
>> > situation, the local topic "topic1" in the secondary will receive new
>> > messages and will be consumed (its consumption information will also
>> > change), whereas the remote topic "primary.topic1" will not receive new
>> > messages but will still be consumed (its consumption information will
>> > change).
>> >
>> > At this point, my conclusion is that consumers need to consume from both
>> > topics (the new messages produced in the local topic and the old messages
>> > for consumers that had lag).
>> >
>> > C) The primary cluster is recovered (this is where things get
>> > complicated for me). In the talk, the new primary is renamed to
>> > "primary-2" and MM2 is configured for active-active replication.
>> > The result is the following. The secondary cluster will end up with a new
>> > remote topic ("primary-2.topic1") that will contain a replica of the new
>> > "topic1" created in the primary-2 cluster. The primary-2 cluster will
>> > have 3 topics: "topic1", a new topic to which producers will produce in
>> > the near future; "secondary.topic1", which contains the replica of the
>> > local topic "topic1" in the secondary; and "secondary.primary.topic1",
>> > which is "topic1" of the old primary (obtained through the secondary).
>> >
>> > D) Once all the replicas are in sync, producers and consumers will be
>> > moved to primary-2. Producers will produce to the local topic "topic1" of
>> > the primary-2 cluster. The consumers will connect to primary-2 to consume
>> > from "topic1" (new messages that come in), "secondary.topic1" (messages
>> > produced during the outage) and "secondary.primary.topic1" (old
>> > messages).
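The way topic names accumulate across switchovers can be sketched with a few lines of Python. This is only an illustration of the naming arithmetic described in steps A to D, under the assumptions that each cluster in the chain has its own local "topic1", that every topic is replicated to the next cluster, and that each hop prepends "<source alias>." to the name. The function name `topics_after_hops` is hypothetical.

```python
def topics_after_hops(clusters, topic, separator="."):
    """List the topic names visible on the last cluster of a migration chain.

    `clusters` is the ordered list of cluster aliases the workload has moved
    through, e.g. ["primary", "secondary", "primary-2"].
    """
    names = [topic]  # the local topic on the current cluster
    current = len(clusters) - 1
    # Walk back through earlier clusters; each one contributes a replica whose
    # name carries every hop it travelled, most recent hop first.
    for origin in range(current - 1, -1, -1):
        prefix = separator.join(reversed(clusters[origin:current]))
        names.append(f"{prefix}{separator}{topic}")
    return names


if __name__ == "__main__":
    for name in topics_after_hops(["primary", "secondary", "primary-2"], "topic1"):
        print(name)
```

For the chain primary, secondary, primary-2 this yields the three names listed in step D, and each further switchover adds one more name with one more prefix.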
>> >
>> > If topics have a retention time, e.g. 7 days, we could remove
>> > "secondary.primary.topic1" after a few days, leaving the situation as it
>> > was at the beginning. However, if another problem happens in the
>> > meantime, the number of topics could become difficult to handle.
>> >
>> > An additional question. If the topic is compacted, i.e., the topic is
>> > kept forever, would switchover operations imply adding another prefix to
>> > the topic name?
>> >
>> > I would appreciate some guidance with this.
>> >
>> > Regards
>> >
>>
>
