Athanasios Fanos created KAFKA-10857:
----------------------------------------
Summary: Mirror Maker 2 - replication not working when deploying
multiple instances
Key: KAFKA-10857
URL: https://issues.apache.org/jira/browse/KAFKA-10857
Project: Kafka
Issue Type: Bug
Components: KafkaConnect, mirrormaker
Affects Versions: 2.5.1, 2.6.0
Reporter: Athanasios Fanos
We believe we are experiencing a bug when deploying Mirror Maker 2 in
distributed mode in our environments. After the initial deployment, replication
does not work reliably, and it does not recover on its own even after 24 hours
or more.
*Environment & replication set-up*
* 2 regions, each with its own Kafka cluster (let's call them Region A and
Region B)
* 3 instances of Mirror Maker are deployed at the same time in Region B with
the same configuration
* Replication is set up to be bi-directional (regionA->regionB &
regionB->regionA)
*Container Version*
Observed with both {{confluentinc/cp-kafka:5.5.1}} &
{{confluentinc/cp-kafka:6.0.1}}
*Mirror Maker 2 configuration*
{code:java}
clusters=regionA,regionB
regionA.bootstrap.servers=regionA-kafka:9092
regionB.bootstrap.servers=regionB-kafka:9092
regionA->regionB.enabled=true
regionA->regionB.topics=testTopic
regionB->regionA.enabled=true
regionB->regionA.topics=testTopic
sync.topic.acls.enabled=false
tasks.max=9
{code}
*Observed behavior*
* After deploying the 3 Mirror Maker instances (at the same time), replication
does not work for one or both mirrors
** If we scale down to a single instance of Mirror Maker and wait for about 5
minutes ({{refresh.topics.interval.seconds}}?), replication starts working.
After this, scaling up to 3 instances correctly distributes the load between
the deployed instances
*Expected behavior*
* Replication should work for all configured mirrors when running in
distributed mode
* When starting multiple instances of Mirror Maker at the same time,
replication should work; a one-by-one rollout should not be required
*Additional details*
* When replication is not working, we observe that in Mirror Maker's internal
config topics the partitions are not assigned to the tasks, e.g.
{{task.assigned.partitions}} is not set at all under the properties object.
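The check described above can be sketched in Python. This is a minimal sketch, not part of Mirror Maker: it assumes the records of the internal config topic have already been read into (key, value) string pairs, that task configs use keys of the form {{task-<connector>-<n>}} with a JSON value holding a {{properties}} object, and that the source connector is named {{MirrorSourceConnector}}. The function name and its parameters are hypothetical.

```python
import json


def tasks_missing_assignment(records, connector="MirrorSourceConnector"):
    """Return keys of task configs that have no task.assigned.partitions.

    `records` is an iterable of (key, value) string pairs already read from
    the internal config topic. The key layout "task-<connector>-<n>" and the
    connector name are assumptions, not taken from the report.
    """
    prefix = "task-" + connector + "-"
    missing = []
    for key, value in records:
        # Skip connector configs, other connectors' tasks, and tombstones.
        if not key.startswith(prefix) or value is None:
            continue
        properties = json.loads(value).get("properties") or {}
        # An absent or empty assignment means the task has nothing to replicate.
        if not properties.get("task.assigned.partitions"):
            missing.append(key)
    return missing
```

Any key returned by this function corresponds to a source task that was created without partitions, which matches the state we see when replication is not working.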
*Workaround*
* As a workaround, we start the Mirror Maker instances one by one, with some
delay between each instance. This allows the first instance to set up the
configuration in the internal topics correctly. Doing this seems to ensure that
replication works as expected.
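The staggered start can be sketched as follows. This is only an illustration of the workaround, assuming the instances can be launched as local processes; in a container deployment the same idea would be applied by the orchestrator instead. The function name, the injectable {{run}} and {{sleep}} parameters, and the delay value are hypothetical.

```python
import subprocess
import time


def staggered_start(commands, delay_seconds, run=subprocess.Popen, sleep=time.sleep):
    """Start each command in order, pausing between consecutive starts.

    `run` and `sleep` are injectable so the ordering can be exercised
    without launching real processes. The delay is an assumption: the
    report only says "some delay".
    """
    started = []
    for index, command in enumerate(commands):
        if index > 0:
            # Give the previous instance time to write its configuration
            # to the internal topics before the next one joins.
            sleep(delay_seconds)
        started.append(run(command))
    return started
```

No delay is applied before the first instance, so only the later instances wait.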