Greg Harris created KAFKA-16641:
-----------------------------------

             Summary: MM2 offset translation should interpolate between sparse 
OffsetSyncs
                 Key: KAFKA-16641
                 URL: https://issues.apache.org/jira/browse/KAFKA-16641
             Project: Kafka
          Issue Type: Improvement
          Components: mirrormaker
            Reporter: Greg Harris


Right now, the OffsetSyncStore keeps a sparse offset store, with exponential 
spacing between syncs. This can leave large gaps in translation, where offsets 
are translated much more conservatively than necessary.

The dominant way to use MirrorMaker2 is in a "single writer" fashion, where the 
target topic is written to by only one MirrorMaker2 instance. When a source 
topic without gaps is replicated this way, contiguous blocks of offsets are 
preserved. For example:

Say that MM2 mirrors 100 records, and emits two syncs (written as 
upstream:downstream): 0:100 and 100:200. Subtracting one sync from the other 
shows that the upstream and downstream offsets both advanced by 100, so we can 
assume that 50:150 is also a valid translation. If the source topic has gaps, 
or goes through a restart, we should expect a discontinuity in the offset 
syncs, like 0:100 and 100:250 or 0:100 and 100:150.
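
As a rough sketch of the check described above (not the actual OffsetSyncStore 
code): the class, the Sync type, and the interpolate method below are 
hypothetical names invented for illustration. The sketch interpolates only 
when both sides of the two syncs advanced by the same amount, and otherwise 
returns empty so that a caller could fall back to the existing conservative 
translation.

```java
import java.util.OptionalLong;

class OffsetSyncInterpolation {

    /** Hypothetical value type: one upstream:downstream pair, e.g. 0:100. */
    static final class Sync {
        final long upstream;
        final long downstream;

        Sync(long upstream, long downstream) {
            this.upstream = upstream;
            this.downstream = downstream;
        }
    }

    /**
     * Translates upstreamOffset using two neighbouring syncs, but only when the
     * upstream and downstream offsets advanced by the same amount between them.
     * Returns empty when the offset lies outside the two syncs or when the
     * syncs show a discontinuity (assumed here to mean gaps or a restart).
     */
    static OptionalLong interpolate(Sync older, Sync newer, long upstreamOffset) {
        if (upstreamOffset < older.upstream || upstreamOffset > newer.upstream) {
            return OptionalLong.empty();
        }
        long upstreamStep = newer.upstream - older.upstream;
        long downstreamStep = newer.downstream - older.downstream;
        if (upstreamStep != downstreamStep) {
            return OptionalLong.empty(); // discontinuity: do not interpolate
        }
        return OptionalLong.of(older.downstream + (upstreamOffset - older.upstream));
    }

    public static void main(String[] args) {
        Sync first = new Sync(0, 100);
        // Contiguous block 0:100 and 100:200, so 50 translates to 150.
        System.out.println(interpolate(first, new Sync(100, 200), 50));
        // Discontinuities 0:100/100:250 and 0:100/100:150 are not interpolated.
        System.out.println(interpolate(first, new Sync(100, 250), 50));
        System.out.println(interpolate(first, new Sync(100, 150), 50));
    }
}
```

Note that equal steps do not prove the block is contiguous: a source topic 
with gaps could produce matching differences by coincidence, which is the 
mis-translation risk for pathological cases.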

This may allow us to restore much of the offset translation precision that was 
lost for simple contiguous topics, without additional memory usage, but at the 
risk of mis-translating offsets in some pathological situations where the 
source topic has gaps. This could be enabled unconditionally, or made opt-in 
via a configuration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
