Hi Federico, I have only a few high-level questions.
1. There is some previous work in KIP-986: https://cwiki.apache.org/confluence/display/KAFKA/KIP-986%3A+Cross-Cluster+Replication. As I see it, your proposal seems more targeted than that one. I'm curious whether you considered it during the writing. It might be useful to compare them, as there are some differences even though the two KIPs try to achieve the same goal.

2. One of the advantages of MM2 is that it runs replication separately, so it doesn't put extra load on the brokers and it limits the blast radius of any failures. The KIP separates thread pools and resources inside the brokers, but it essentially says that replicators should scale vertically to handle the extra replication traffic. I think this creates brokers where failures have a bigger impact. My suggestion would be to consider replicator-only nodes. Nodes that handle only cross-cluster replication and the related client traffic would make it much easier to plan and scale a cluster. It would also possibly keep the broker internals mostly untouched, and we would segregate replication traffic from the normal brokers. We should be able to keep the benefits of the KIP (exact offset mapping) while also gaining some of the improvements brought by MM2 (a separate failure domain). There is a small illustrative sketch of the offset mapping point below, after my questions.

3. As you write, tiered storage and diskless topics are out of scope. While I agree with the latter, since that KIP is not yet implemented, I think the tiered storage part is missing. We would benefit from at least a high-level plan, to see that your proposal is sound and that no major refactoring would be needed when cross-cluster replication for tiered storage is designed.

4. Lastly, I think we shouldn't call KIP-1279 "cluster mirroring", as it is easily confused with the existing MirrorMaker terminology. Let's not overload that term. "Cluster-linking" or "replicator nodes" may sound better. What do you think?
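To make the "exact offset mapping" point in question 2 a bit more concrete: with MM2 today, a consumer group failing over to the target cluster has to translate its committed offsets through the checkpoint mechanism, for example via RemoteClusterUtils from connect-mirror-client. The rough sketch below shows that extra step; with byte-for-byte mirroring as proposed in the KIP, the committed offsets would already be valid on the target cluster and this step disappears. The bootstrap address, cluster alias and group id are of course just placeholders.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;

public class FailoverOffsetTranslation {

    public static void main(String[] args) throws Exception {
        // Client properties pointing at the *target* cluster, which hosts the
        // MM2 checkpoints replicated from the source cluster aliased "dc1".
        // The address, alias and group id below are placeholders.
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", "target-broker:9092");

        // MM2 failover today: translate the group's committed offsets from
        // source-cluster offsets into target-cluster offsets before seeking.
        // With exact offset mapping, this call would be unnecessary because
        // the committed offsets are the same on both clusters.
        Map<TopicPartition, OffsetAndMetadata> translated =
            RemoteClusterUtils.translateOffsets(props, "dc1", "my-group", Duration.ofSeconds(30));

        translated.forEach((tp, om) ->
            System.out.printf("%s -> seek to offset %d%n", tp, om.offset()));
    }
}
```

My point is only that dedicated replicator nodes could keep this simplification while still isolating replication traffic from the brokers serving producers and consumers.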
Best,
Viktor

On Sat, Feb 14, 2026 at 9:38 PM vaquar khan <[email protected]> wrote:

> Hi Fede,
>
> I reviewed the KIP-1279 proposal yesterday and corrected the KIP number. I now have time to share my very detailed observations. While I fully support the goal of removing the operational complexity of Kafka, the design appears to trade that complexity for risks to broker stability.
>
> By moving WAN replication into the broker's core runtime, we are effectively removing the failure domain isolation that MirrorMaker 2 provides. We risk coupling the stability of our production clusters to the instability of cross-datacenter networks. Before this KIP moves to a vote, I strongly recommend that you and the other authors address the following stability gaps. Without concrete answers here, the risk profile is likely too high for mission-critical deployments.
>
> 1. The Thundering Herd and Memory Isolation Risk
> In the current architecture, MirrorMaker 2 (MM2) Connect workers provide a physical failure domain through a separate JVM heap. This isolates the broker from the memory pressure and Garbage Collection (GC) impact caused by replication surges. In this proposal, that pressure hits the broker's core runtime directly.
> The Gap: We need simulation data for a sustained link outage (e.g., 6 hours on a 10 Gbps link). When 5,000 partitions resume fetching, does the resulting backfill I/O and heap pressure cause GC pauses that push P99 Produce latency on the target cluster over 10 ms? We must ensure that a massive catch-up phase does not starve the broker's Request Handler threads or destabilize the JVM.
>
> 2. Blast Radius (Poison Pill Problem)
> The Gap: If a source broker sends a malformed batch (e.g., bit rot), does it crash the entire broker process? In MM2, this kills a single task. We need confirmation that exceptions are isolated to the replication thread pool and will not trigger a node-wide panic.
>
> 3. Control Plane Saturation
> The Gap: How does the system handle a "link flap" event where 50,000 partitions transition states rapidly? We need to verify that the resulting flood of metadata updates will not block the Controller from processing critical ISR changes for local topics.
>
> 4. Transactional Integrity
> "Byte-for-byte" replication copies transaction markers but not the Coordinator's state (PIDs).
> The Gap: How does the destination broker validate an aborted transaction without the source PID? We should avoid creating "zombie" transactions that look valid but cannot be authoritatively managed.
>
> 5. Infinite Loop Prevention
> Since byte-for-byte replication precludes injecting lineage headers (e.g., dc-source), we lose the standard mechanism for detecting loops in mesh topologies (A→B→A).
> The Gap: Relying solely on topic naming conventions is operationally fragile. What is the deterministic mechanism to prevent infinite recursion?
>
> 6. Data Divergence and Epoch Reconciliation
> The current proposal explicitly excludes support for unclean leader election because there is no mechanism for a "shared leader epoch" between clusters.
> The Gap: Without epoch reconciliation, if the source cluster experiences an unclean election, the source and destination logs will diverge. If an operator later attempts a failback (reverse mirroring), the clusters will contain inconsistent data for the same offsets, leading to potential silent data corruption or permanent replication failure.
>
> 7. Tiered Storage Operational Gaps
> The design states that Tiered Storage is not initially supported and that a mirror follower encountering an OffsetMovedToTieredStorageException will simply mark the partition as FAILED.
> The Gap: For mission-critical clusters using Tiered Storage for long-term retention, this creates an operational cliff. Mirroring will fail as soon as the source cluster offloads data to remote storage. We need a roadmap for how native mirroring will eventually interact with tiered segments without failing the partition.
>
> 8. Transactional State and PID Mapping
> While the KIP proposes a deterministic formula for rewriting Producer IDs, calculated as destinationProducerId = sourceProducerId + 2, it does not replicate the transaction_state metadata.
> The Gap: How does the destination broker authoritatively validate or expire hanging transactions if the source PID state is rewritten but the transaction coordinator state is missing? We risk a scenario where consumers encounter zombie transactions that can never be decided on the destination cluster.
>
> This is a big change to how our system is built. We need to make sure it does not create a weak link that could bring the whole system down or introduce a new single point of failure.
>
> Regards,
> Viquar Khan
> LinkedIn - https://www.linkedin.com/in/vaquar-khan-b695577/
> Book - https://us.amazon.com/stores/Vaquar-Khan/author/B0DMJCG9W6?ref=ap_rdr&shoppingPortalEnabled=true
> GitBook - https://vaquarkhan.github.io/microservices-recipes-a-free-gitbook/
> Stack - https://stackoverflow.com/users/4812170/vaquar-khan
> GitHub - https://github.com/vaquarkhan
>
> On Sat, 14 Feb 2026 at 01:18, Federico Valeri <[email protected]> wrote:
>
> > Hi, we would like to start a discussion thread about KIP-1279: Cluster Mirroring.
> >
> > Cluster Mirroring is a new Kafka feature that enables native, broker-level topic replication across clusters. Unlike MirrorMaker 2 (which runs as an external Connect-based tool), Cluster Mirroring is built into the broker itself, allowing tighter integration with the controller, coordinator, and partition lifecycle.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1279%3A+Cluster+Mirroring
> >
> > There are a few missing bits, but most of the design is there, so we think it is the right time to involve the community and get feedback. Please help us validate our approach.
> >
> > Thanks
> > Fede
