geric created KAFKA-19607:
-----------------------------

             Summary: MirrorMaker2 Offset Replication Issue
                 Key: KAFKA-19607
                 URL: https://issues.apache.org/jira/browse/KAFKA-19607
             Project: Kafka
          Issue Type: Bug
          Components: mirrormaker
    Affects Versions: 4.0.0
            Reporter: geric

I am using *Apache Kafka 4.0* with *MirrorMaker 2* to link the primary cluster (*clusterA*) to the secondary cluster (*clusterB*). The secondary cluster will not have any producers or consumers until a disaster recovery event occurs, at which point all producers and consumers will switch to it.

*Setup:*
* Dedicated standalone MirrorMaker 2 node
* {{IdentityReplicationPolicy}} (no topic renaming)
* No clients connected to the secondary cluster under normal operation

*MirrorMaker 2 config:*

{{# Cluster aliases
clusters = clusterA, clusterB

# Bootstrap servers
clusterA.bootstrap.servers = serverA-kafka-1:9092
clusterB.bootstrap.servers = serverB-kafka-1:9092

# Replication policy
replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy

# Offset/Checkpoint sync
emit.checkpoints.enabled=true
emit.checkpoints.interval.seconds=5
sync.group.offsets.enabled=true
sync.group.offsets.interval.seconds=5
offset.lag.max=10
refresh.topics.interval.seconds=5}}

----
h3. Test results:

# *Produce 300 messages while MirrorMaker is running*
*Expected:* Topic offset synced within a minute
*Result:* ✅ Passed
# *Consume 100 messages while MirrorMaker is running, then terminate the consumer*
*Expected:* Consumer offset synced
*Result:* ❌ Failed: offset is not synced to clusterB
# *Restart MirrorMaker after test #2*
*Expected:* Consumer offset synced
*Result:* ✅ Passed
# *Repeat test #2: consume 100 messages while MirrorMaker is running, then terminate the consumer*
*Expected:* Consumer offset synced
*Result:* ❌ Failed: offset is not synced to clusterB
# *Restart MirrorMaker after test #4*
*Expected:* Consumer offset synced
*Result:* ❌ Failed: offset is not synced to clusterB
# *Consume messages but keep the consumer running*
*Expected:* Offset synced
*Result:* ✅ Passed

----
h3. Problem:

Consumer offsets appear to sync only in these cases:
# When MirrorMaker is restarted and the consumer group offset does *not* already exist in the secondary cluster (initial sync), or
# When the consumer is still connected at the time of the sync, *or* when the consumer has reached the end of the topic (i.e., consumed all available messages).

However, if the consumer exits immediately after consuming some messages (but *before reaching the end of the topic*), the committed offset is *never synced* to the target cluster, even after MirrorMaker is restarted (test #5).

----
h3. Additional Context / Related Issues

This problem seems related to an open discussion on the Apache Kafka mailing list:

*MirrorCheckpointConnector does not replicate final batch of offsets*
[https://lists.apache.org/thread/dxn9jyotl00f7ov541299cd8tlcl1z00]
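
For anyone reproducing this, here is a minimal sketch of one way to check whether a group's committed offset has reached the secondary cluster. It assumes the text output of {{kafka-consumer-groups.sh --describe --group <group>}} has been captured from each cluster, and it locates columns by the header row rather than by fixed position. The group name, topic name, and offsets in the sample strings are hypothetical (they only mimic the state after test #4), and the helper names are mine. Comparing raw offsets across clusters is itself an assumption: it only makes sense when source and target offsets line up, as they should here because the topic is mirrored from the start under {{IdentityReplicationPolicy}}.

```python
from typing import Dict, Optional, Tuple

TopicPartition = Tuple[str, int]
Offsets = Dict[TopicPartition, int]


def parse_describe_output(text: str) -> Offsets:
    """Parse `kafka-consumer-groups.sh --describe` text into
    {(topic, partition): committed offset}.

    Columns are located via the header row. Rows whose CURRENT-OFFSET
    is not a number (for example "-") are skipped.
    """
    offsets: Offsets = {}
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "CURRENT-OFFSET" in fields:
            header = fields
            continue
        if header is None:
            continue  # informational lines printed before the table
        row = dict(zip(header, fields))
        current = row.get("CURRENT-OFFSET", "-")
        if not current.isdigit():
            continue
        offsets[(row["TOPIC"], int(row["PARTITION"]))] = int(current)
    return offsets


def unsynced_partitions(
    source: Offsets, target: Offsets
) -> Dict[TopicPartition, Tuple[int, Optional[int]]]:
    """Return partitions whose committed offset on the target cluster is
    missing or behind the source, mapped to (source_offset, target_offset)."""
    behind = {}
    for tp, src_offset in sorted(source.items()):
        tgt_offset = target.get(tp)
        if tgt_offset is None or tgt_offset < src_offset:
            behind[tp] = (src_offset, tgt_offset)
    return behind


# Hypothetical captures, shaped like the state after test #4:
# clusterA has committed offset 200, clusterB is still at 100.
SOURCE_SAMPLE = """
GROUP     TOPIC       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST  CLIENT-ID
my-group  test-topic  0          200             300             100  -            -     -
"""

TARGET_SAMPLE = """
GROUP     TOPIC       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST  CLIENT-ID
my-group  test-topic  0          100             300             200  -            -     -
"""

if __name__ == "__main__":
    source = parse_describe_output(SOURCE_SAMPLE)
    target = parse_describe_output(TARGET_SAMPLE)
    for (topic, partition), (src, tgt) in unsynced_partitions(source, target).items():
        print(f"{topic}-{partition}: source={src} target={tgt}")
```

Running the comparison a few checkpoint intervals after the consumer exits should show whether the target ever catches up; an empty result would mean the offsets match.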