Athanasios Fanos created KAFKA-10857:
----------------------------------------
Summary: Mirror Maker 2 - replication not working when deploying multiple instances
Key: KAFKA-10857
URL: https://issues.apache.org/jira/browse/KAFKA-10857
Project: Kafka
Issue Type: Bug
Components: KafkaConnect, mirrormaker
Affects Versions: 2.5.1, 2.6.0
Reporter: Athanasios Fanos

We believe we are experiencing a bug when deploying Mirror Maker 2 in distributed mode in our environments. Replication does not work consistently after initial deployment and does not start working even after some time (24h+).

*Environment & replication set-up*
* 2 regions, each with a separate Kafka cluster (let's call them Region A and Region B)
* 3 instances of Mirror Maker are deployed at the same time in Region B with the same configuration
* Replication is set up to be bi-directional (regionA->regionB & regionB->regionA)

*Container Version*
Observed with both {{confluentinc/cp-kafka:5.5.1}} & {{confluentinc/cp-kafka:6.0.1}}

*Mirror Maker 2 configuration*
{code:java}
clusters=regionA,regionB
regionA.bootstrap.servers=regionA-kafka:9092
regionB.bootstrap.servers=regionB-kafka:9092
regionA->regionB.enabled=true
regionA->regionB.topics=testTopic
regionB->regionA.enabled=true
regionB->regionA.topics=testTopic
sync.topic.acls.enabled=false
tasks.max=9
{code}

*Observed behavior*
* After deploying the 3 Mirror Maker instances (at the same time), replication for 1 or both mirrors does not work
** If we scale down to a single instance of Mirror Maker and wait for about 5 minutes (refresh.topics.interval.seconds?), replication starts working.
After this, scaling up to 3 correctly distributes the load between the deployed instances.

*Expected behavior*
* Replication should work for all configured mirrors when running in distributed mode
* When starting multiple instances of Mirror Maker at the same time, replication should work; a 1-by-1 rollout should not be required

*Additional details*
* When replication is not working, we observe that in the internal config topics from Mirror Maker the partitions are not assigned to the tasks, e.g. {{task.assigned.partitions}} is not set at all under the properties object.

*Workaround*
* As a workaround, we start Mirror Maker instances 1 by 1 with some delay between each instance. This allows the first instance to set up the configuration in the internal topics correctly. Doing this seems to ensure that replication works as expected.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)