stuart created KAFKA-12659:
------------------------------

             Summary: Mirrormaker 2 - seeking to wrong offsets on restart
                 Key: KAFKA-12659
                 URL: https://issues.apache.org/jira/browse/KAFKA-12659
             Project: Kafka
          Issue Type: Bug
          Components: mirrormaker
    Affects Versions: 2.7.0
         Environment: Docker container based on openjdk11:alpine-slim, running 
on Amazon ECS
            Reporter: stuart
         Attachments: partitions.png

We are running a dedicated MirrorMaker 2 (MM2) cluster with three tasks, and 
have been trialing it for a few weeks on a single topic. It had been going 
fine, so we attempted to add a second topic, changing the MM2 config file from 

topics = sports

to 

topics = sports|translations 

 

We noticed the following day that replication of the new topic was not 
working. Reading online, it seems others have had similar issues, perhaps 
related to the config stored in the internal mm2-configs topic not being 
refreshed from the file. Following the recommendations in one such thread, we 
stopped the tasks for 10 minutes, and eventually the new topic started 
replicating.

However, we also noticed later that MM2 had started re-replicating about 5 
million records from earlier that day (from the original topic), which was 
concerning. A few hours later I restarted the MM2 tasks and the same thing 
happened: it started re-replicating the same old messages.

Looking into the mm2-offsets-{source}.internal topic, I could see that the 
records which track offsets had switched partitions. For example, the records 
for the sports-7 topic-partition went from being written to partition 5 (in 
mm2-offsets) to partition 8. The same occurred for other partitions (most but 
not all).

Following the task restarts, I can see in the MM2 logs that MM2 always seeks 
to offset 42741034 for sports-7. This value matches the oldest offset record 
on mm2-offsets partition 5, so it looks like MM2 is ignoring the more recent 
offset records on partition 8 and therefore not seeking to the correct latest 
offsets.

This also appears to affect compaction of the internal offsets topic: while 
the older records on partition 8 for the sports-7 key are being cleaned up, 
the even older records for that same key on partition 5 are not.
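
To make the observation above concrete, here is a minimal Python sketch of a 
check for keys whose offset records are spread over more than one partition. 
It assumes the records have already been read from the offsets topic into 
plain tuples; the tuple layout and the function names are hypothetical. Only 
the key sports-7, partitions 5 and 8, and offset 42741034 come from this 
report; the other values are made up for illustration. Log compaction works 
per partition, keeping the latest record for each key within that partition, 
which would be consistent with the stale record on partition 5 surviving.

```python
from collections import defaultdict

# Each tuple is (partition, position_in_partition, key, source_offset).
# Only the key, partitions 5 and 8, and offset 42741034 come from the report;
# the two newer offsets are invented for illustration.
records = [
    (5, 0, "sports-7", 42741034),
    (8, 0, "sports-7", 47000000),
    (8, 1, "sports-7", 47500000),
]


def latest_per_partition(records):
    """Return {key: {partition: source_offset}} with the last record per (key, partition).

    This mirrors what per-partition compaction would eventually retain.
    """
    latest = defaultdict(dict)
    # Process records in log order within each partition so later ones win.
    for partition, _pos, key, source_offset in sorted(records, key=lambda r: (r[0], r[1])):
        latest[key][partition] = source_offset
    return dict(latest)


def split_keys(records):
    """Return only the keys whose records live on more than one partition."""
    return {k: v for k, v in latest_per_partition(records).items() if len(v) > 1}


print(split_keys(records))
```

For the sample data, sports-7 is reported with one surviving record on 
partition 5 (the stale 42741034) and one on partition 8.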

I can't be certain that introducing the second topic into the MM2 config was 
the trigger for this change in partitioning behaviour. I am not sure why it 
would be, unless adding more topics to the topic replication list caused MM2 
to automatically scale the number of partitions on the 
mm2-offsets-{source}.internal topic, which I guess might affect partitioning 
behaviour. It was, however, the only noteworthy thing that we consciously 
changed within the same rough timeframe.
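
As a rough illustration of why a change in partition count could move a key, 
here is a small Python sketch of hash-modulo partitioning. Kafka's default 
partitioner hashes the key bytes with murmur2 and takes the result modulo the 
partition count; zlib.crc32 is used below purely as a stand-in hash, so the 
partition numbers it prints will not match a real cluster. The keys and the 
partition counts (25 and 50) are hypothetical, and whether the partition count 
actually changed in this incident is unconfirmed.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition as hash(key) modulo the partition count.

    zlib.crc32 is a stand-in hash; Kafka's default partitioner uses murmur2.
    """
    return zlib.crc32(key) % num_partitions


# Hypothetical keys; the real offset keys are serialized by Kafka Connect.
keys = [f"sports-{i}".encode() for i in range(10)]

for key in keys:
    before = partition_for(key, 25)  # hypothetical original partition count
    after = partition_for(key, 50)   # hypothetical count after scaling
    status = "moved" if before != after else "same"
    print(key.decode(), before, after, status)
```

With any hash of this kind, the same key can land on a different partition 
once the partition count changes, which would match the "most but not all" 
pattern described earlier if the count did change.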

Attached is a screenshot (partitions.png) to help illustrate the issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
