[ 
https://issues.apache.org/jira/browse/KAFKA-17249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francois Visconte updated KAFKA-17249:
--------------------------------------
    Affects Version/s: 3.9.0

> Failures when building remote log aux state can make the leader epoch cache 
> inconsistent
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-17249
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17249
>             Project: Kafka
>          Issue Type: Bug
>          Components: Tiered-Storage
>    Affects Versions: 3.8.0, 3.7.1, 3.9.0
>            Reporter: Kyle Phelps
>            Priority: Major
>
> When a follower has to run `buildRemoteLogAuxState`, it truncates the local 
> log and then attempts to rebuild the leader epoch cache from the checkpoint 
> in remote storage. However, if this rebuild fails and the broker is 
> restarted, the cache is missing the entries associated with remote segments.
> Reproduction steps:
>  # Take an existing tiered storage partition and move the latest index file 
> out of remote storage so that it is inaccessible.
>  # Stop one of the follower brokers, delete the partition's local data.
>  # Restart the follower - it should fail to build the aux state.
>  # Restart the follower again. Since the log's offsets have been updated, it 
> can now successfully fetch and join the ISR.
>  # Promote the follower to the leader.
> In this scenario the leader becomes unable to serve tiered fetch requests. 
> I _think_ the root of the problem here is that the leader epoch cache isn't 
> recovering the epoch data for remote segments.
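
The suspected failure mode can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's actual leader epoch cache: the `EpochCache` class, its methods, and all offsets and epochs below are hypothetical and chosen for illustration. It assumes remote segments cover offsets 0..199 and the local log starts at offset 200.

```python
from bisect import bisect_right


class EpochCache:
    """Toy model (hypothetical, not Kafka code): (epoch, start_offset) entries in ascending order."""

    def __init__(self):
        self.entries = []  # list of (epoch, start_offset)

    def assign(self, epoch, start_offset):
        # Record that `epoch` begins at `start_offset`.
        self.entries.append((epoch, start_offset))

    def epoch_for_offset(self, offset):
        # Return the epoch whose range covers `offset`, or None if no entry covers it.
        starts = [start for _, start in self.entries]
        i = bisect_right(starts, offset) - 1
        return self.entries[i][0] if i >= 0 else None


# Follower whose cache was rebuilt from the remote checkpoint (made-up history).
healthy = EpochCache()
healthy.assign(0, 0)
healthy.assign(1, 100)
healthy.assign(2, 250)

# Follower whose aux-state build failed: after the restart it only learns
# epochs from the offsets it fetches, starting at the local log start (200).
broken = EpochCache()
broken.assign(1, 200)
broken.assign(2, 250)

print(healthy.epoch_for_offset(150))
print(broken.epoch_for_offset(150))
```

In this model, offset 150 lies in the remote range. The cache rebuilt from the checkpoint can map it to an epoch, while the cache that only saw local data cannot. That matches the reported symptom that such a broker, once promoted to leader, is unable to serve tiered fetch requests.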



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
