[
https://issues.apache.org/jira/browse/KAFKA-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765215#comment-16765215
]
ASF GitHub Bot commented on KAFKA-7897:
---------------------------------------
hachikuji commented on pull request #6253: KAFKA-7897; Do not write epoch start
offset for older message format versions
URL: https://github.com/apache/kafka/pull/6253
When an older message format is in use, we should disable the leader epoch
cache so that we resort to truncation by high watermark. Previously we updated
the cache for all versions when a broker became leader for a partition. This
can cause large and unnecessary truncations after leader changes because we
relied on the presence of _any_ cached epoch in order to tell whether to use
the improved truncation logic possible with the OffsetsForLeaderEpoch API.
Note that this is a simpler fix than the one merged to trunk in #6232, since
the branches have diverged significantly. Rather than removing the epoch cache
file, we guard usage of the cache with the record version.
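
To illustrate the guard described above, here is a minimal, self-contained sketch. It is not the code from this pull request: `EpochCacheGuard`, `PartitionLog`, `LeaderEpochCache` and every method name are hypothetical stand-ins. The sketch assumes that only record version (magic) 2 and above carries a leader epoch, and that the cache may already hold entries written before a format downgrade.

```java
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch; these types are illustrative and are not Kafka's classes.
final class EpochCacheGuard {

    // Assumption: record versions (magic) 0 and 1 carry no leader epoch; 2 does.
    static final byte MAGIC_V1 = 1;
    static final byte MAGIC_V2 = 2;

    enum TruncationStrategy { HIGH_WATERMARK, OFFSETS_FOR_LEADER_EPOCH }

    // Minimal stand-in for the leader epoch cache: epoch -> start offset.
    static final class LeaderEpochCache {
        private final TreeMap<Integer, Long> startOffsets = new TreeMap<>();

        void assign(int epoch, long startOffset) {
            startOffsets.putIfAbsent(epoch, startOffset);
        }

        Optional<Integer> latestEpoch() {
            return startOffsets.isEmpty()
                    ? Optional.empty()
                    : Optional.of(startOffsets.lastKey());
        }
    }

    static final class PartitionLog {
        private final byte recordVersion;
        private final LeaderEpochCache cache;

        PartitionLog(byte recordVersion, LeaderEpochCache cache) {
            this.recordVersion = recordVersion;
            this.cache = cache;
        }

        private boolean supportsLeaderEpoch() {
            return recordVersion >= MAGIC_V2;
        }

        // Called when this broker becomes leader: write the epoch start
        // offset only if the message format can carry leader epochs.
        void onBecomeLeader(int leaderEpoch, long logEndOffset) {
            if (supportsLeaderEpoch()) {
                cache.assign(leaderEpoch, logEndOffset);
            }
        }

        // Reads are guarded as well, so stale entries left in the cache
        // are ignored while an older format is in use.
        Optional<Integer> latestEpoch() {
            return supportsLeaderEpoch() ? cache.latestEpoch() : Optional.empty();
        }

        // A follower would truncate by leader epoch only when a usable
        // epoch exists, and otherwise fall back to the high watermark.
        TruncationStrategy truncationStrategy() {
            return latestEpoch().isPresent()
                    ? TruncationStrategy.OFFSETS_FOR_LEADER_EPOCH
                    : TruncationStrategy.HIGH_WATERMARK;
        }
    }
}
```

Guarding both the write path and the read path means the cache contents never need to be deleted: with an older format the cache is simply treated as empty, which is the behavior the last sentence above describes.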
### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
> Invalid use of epoch cache with old message format versions
> -----------------------------------------------------------
>
> Key: KAFKA-7897
> URL: https://issues.apache.org/jira/browse/KAFKA-7897
> Project: Kafka
> Issue Type: Bug
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Major
>
> Message format downgrades are not supported, but they generally work as long
> as brokers and clients can still parse both message formats. After
> a downgrade, the truncation logic should revert to using the high watermark,
> but currently we use the existence of any cached epoch as the sole
> prerequisite for using OffsetsForLeaderEpoch. This causes a massive
> truncation after startup, which in turn forces re-replication.
> I think our options to fix this are to either 1) clear the cache when we
> notice a downgrade, or 2) forbid downgrades and raise an error.