Hi Ryanne,

Please could you elaborate a bit more about the active-active recommendation?
Thanks in advance

On Mon, Feb 10, 2020 at 10:21 PM benitocm <benit...@gmail.com> wrote:

> Thanks very much for the response.
>
> Please could you elaborate a bit more about "I'd arc in that direction.
> Instead of migrating A->B->C->D..., active/active is more like having
> one big cluster".
>
> Another thing that I would like to share is that currently my consumers
> only consume from one topic, so introducing MM2 will impact them.
> Any suggestion in this regard would be greatly appreciated.
>
> Thanks in advance again!
>
> On Mon, Feb 10, 2020 at 9:40 PM Ryanne Dolan <ryannedo...@gmail.com>
> wrote:
>
>> Hello, sounds like you have this all figured out actually. A couple
>> notes:
>>
>> > For now, we just need to handle DR requirements, i.e., we would not
>> > need active-active
>>
>> If your infrastructure is sufficiently advanced, active/active can be a
>> lot easier to manage than active/standby. If you are starting from
>> scratch I'd arc in that direction. Instead of migrating A->B->C->D...,
>> active/active is more like having one big cluster.
>>
>> > secondary.primary.topic1
>>
>> I'd recommend using regex subscriptions where possible, so that apps
>> don't need to worry about these potentially complex topic names.
>>
>> > An additional question. If the topic is compacted, i.e., the topic is
>> > kept forever, would switchover operations imply adding an additional
>> > path in the topic name?
>>
>> I think that's right. You could always clean things up manually, but
>> migrating between clusters a bunch of times would leave a trail of
>> replication hops.
>>
>> Also, you might look into implementing a custom ReplicationPolicy. For
>> example, you could squash "secondary.primary.topic1" into something
>> shorter if you like.
>>
>> Ryanne
>>
>> On Mon, Feb 10, 2020 at 1:24 PM benitocm <benit...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > After having a look at the talk
>> > https://www.confluent.io/kafka-summit-lon19/disaster-recovery-with-mirrormaker-2-0
>> > and at
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382
>> > I am trying to understand how I would use it in the setup that I
>> > have. For now, we just need to handle DR requirements, i.e., we would
>> > not need active-active.
>> >
>> > My requirements, more or less, are the following:
>> >
>> > 1) Currently, we have just one Kafka cluster, "primary", where all the
>> > producers are producing to and where all the consumers are consuming
>> > from.
>> > 2) In case "primary" crashes, we would need to have another Kafka
>> > cluster, "secondary", where we will move all the producers and
>> > consumers and keep working.
>> > 3) Once "primary" is recovered, we would need to move back to it (as
>> > we were in #1).
>> >
>> > To fulfill #2, I have thought of having a new Kafka cluster
>> > "secondary" and setting up a replication procedure using MM2. However,
>> > it is not clear to me how to proceed.
>> >
>> > I will describe the high-level details so you guys can point out my
>> > misconceptions:
>> >
>> > A) Initial situation. As in the example of KIP-382, in the primary
>> > cluster we will have a local topic "topic1" where the producers will
>> > produce to and the consumers will consume from. MM2 will create in the
>> > secondary the remote topic "primary.topic1", where the local topic of
>> > the primary will be replicated. In addition, the consumer group
>> > information of the primary will also be replicated.
>> >
>> > B) Kafka primary cluster is not available. Producers are moved to
>> > produce into the "topic1" that was manually created in the secondary.
>> > In addition, consumers need to connect to secondary to consume from
>> > the local topic "topic1", where the producers are now producing, and
>> > from the remote topic "primary.topic1", where the producers were
>> > producing before, i.e., consumers will need to aggregate. This is
>> > because some consumers could have lag, so they will need to consume
>> > from both. In this situation, the local topic "topic1" in the
>> > secondary will receive new messages and will be consumed (its
>> > consumption information will also change), while the remote topic
>> > "primary.topic1" will not receive new messages but will still be
>> > consumed (its consumption information will change).
>> >
>> > At this point, my conclusion is that consumers need to consume from
>> > both topics (the new messages produced in the local topic and the old
>> > messages for consumers that had lag).
>> >
>> > C) Primary cluster is recovered (here is where things get complicated
>> > for me). In the talk, the new primary is renamed to primary-2 and MM2
>> > is configured for active-active replication.
>> > The result is the following. The secondary cluster will end up with a
>> > new remote topic ("primary-2.topic1") that will contain a replica of
>> > the new "topic1" created in the primary-2 cluster. The primary-2
>> > cluster will have 3 topics: "topic1", a new topic where producers will
>> > produce in the near future; "secondary.topic1", which contains the
>> > replica of the local topic "topic1" in the secondary; and
>> > "secondary.primary.topic1", which is "topic1" of the old primary
>> > (obtained through the secondary).
>> >
>> > D) Once all the replicas are in sync, producers and consumers will be
>> > moved to primary-2. Producers will produce to the local topic "topic1"
>> > of the primary-2 cluster.
>> > The consumers will connect to primary-2 to consume from "topic1" (new
>> > messages that come in), from "secondary.topic1" (messages produced
>> > during the outage) and from "secondary.primary.topic1" (old messages).
>> >
>> > If topics have a retention time, e.g. 7 days, we could remove
>> > "secondary.primary.topic1" after a few days, leaving the situation as
>> > it was at the beginning. However, if another problem happens in the
>> > middle, the number of topics could become a little difficult to
>> > handle.
>> >
>> > An additional question. If the topic is compacted, i.e., the topic is
>> > kept forever, would switchover operations imply adding an additional
>> > path in the topic name?
>> >
>> > I would appreciate some guidance with this.
>> >
>> > Regards
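
For what it's worth, the regex-subscription suggestion quoted above can be sketched roughly as follows. This is only an illustration in plain Python (standard library, no Kafka client), assuming the naming seen in this thread, where a replicated topic is the source cluster alias, a ".", and the original topic name. The names `remote_topic`, `TOPIC1_PATTERN` and `matching_topics` are hypothetical, and the pattern assumes cluster aliases contain no dots.

```python
import re

# Assumption: "." separates the source cluster alias from the topic name,
# as in "secondary.primary.topic1" from the thread.
SEPARATOR = "."


def remote_topic(source_alias, topic):
    """Name a topic would get after one replication hop (hypothetical helper)."""
    return f"{source_alias}{SEPARATOR}{topic}"


# Matches "topic1" itself and "topic1" behind any number of replication hops.
# Assumes cluster aliases never contain a dot.
TOPIC1_PATTERN = re.compile(r"^(?:[^.]+\.)*topic1$")


def matching_topics(topics):
    """Return the topics a pattern-based subscription would pick up."""
    return [t for t in topics if TOPIC1_PATTERN.match(t)]


# Topics present on primary-2 after the failback described in step C,
# plus two unrelated topics that must not match.
one_hop = remote_topic("secondary", "topic1")
two_hops = remote_topic("secondary", remote_topic("primary", "topic1"))
primary2_topics = ["topic1", one_hop, two_hops, "topic10", "other.topic2"]

subscribed = matching_topics(primary2_topics)
print(subscribed)
```

With a pattern like this the application does not need to know how many hops a topic has been through. In a Java consumer, an equivalent `java.util.regex.Pattern` would be passed to `KafkaConsumer.subscribe` instead of a fixed topic list.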