[ 
https://issues.apache.org/jira/browse/KAFKA-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-13141.
------------------------------------
      Reviewer: Jason Gustafson
    Resolution: Fixed

> Leader should not update follower fetch offset if diverging epoch is present
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-13141
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13141
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Jason Gustafson
>            Assignee: Rajini Sivaram
>            Priority: Blocker
>             Fix For: 3.0.0, 2.7.2, 2.8.1
>
>
> In 2.7, we began doing fetcher truncation piggybacked on the Fetch protocol 
> instead of using the old OffsetsForLeaderEpoch API. When truncation is 
> detected, we return a `divergingEpoch` field in the Fetch response, but we do 
> not set an error code. The sender of the Fetch request is expected to check 
> whether the diverging epoch is present and truncate accordingly.
> All of this works correctly in the fetcher implementation, but the problem is 
> that the logic to update the follower fetch position on the leader does not 
> take into account the diverging epoch present in the response. This means the 
> fetch offsets can be updated incorrectly, which can lead to either log 
> divergence or the loss of committed data.
> For example, we hit the following case with 3 replicas. Leader 1 is elected 
> in epoch 1 with an end offset of 100. The followers are at offset 101.
> Broker 1: (Leader) Epoch 1 from offset 100
> Broker 2: (Follower) Epoch 1 from offset 101
> Broker 3: (Follower) Epoch 1 from offset 101
> Broker 1 receives fetches from 2 and 3 at offset 101. The leader detects the 
> divergence and returns a diverging epoch in the fetch response. Nevertheless, 
> the fetch positions for both followers are updated to 101 and the high 
> watermark is advanced.
> After brokers 2 and 3 had truncated to offset 100, broker 1 experienced a 
> network partition of some kind and was kicked from the ISR. This caused 
> broker 2 to get elected, which resulted in the following state at the start 
> of epoch 2.
> Broker 1: (Follower) Epoch 2 from offset 101
> Broker 2: (Leader) Epoch 2 from offset 100
> Broker 3: (Follower) Epoch 2 from offset 100
> Broker 2 was then able to write a new entry at offset 100, and the old record, 
> which may have been exposed to consumers, was deleted by broker 1.
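
The scenario above can be reproduced with a small model. This is only a sketch under simplifying assumptions, not Kafka's actual code: the `Leader` class, its fields, and the `skip_update_on_divergence` flag are hypothetical names, divergence is approximated as "the follower's fetch offset is beyond the leader's log end offset", and the high watermark is taken as the minimum of the leader's end offset and the recorded follower fetch offsets.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Leader:
    """Hypothetical, heavily simplified model of a partition leader."""
    log_end_offset: int
    high_watermark: int
    follower_fetch_offsets: Dict[int, int]
    skip_update_on_divergence: bool = True  # True models the fixed behaviour

    def append(self) -> int:
        """Append one record and return the offset it was written at."""
        offset = self.log_end_offset
        self.log_end_offset += 1
        self._maybe_advance_high_watermark()
        return offset

    def handle_fetch(self, follower_id: int, fetch_offset: int) -> Optional[int]:
        """Return the offset the follower should truncate to, or None."""
        diverging_end_offset = None
        # Simplifying assumption: a fetch beyond the leader's end offset
        # means the follower's log has diverged from the leader's.
        if fetch_offset > self.log_end_offset:
            diverging_end_offset = self.log_end_offset
        if diverging_end_offset is not None and self.skip_update_on_divergence:
            # Do not record a fetch position that the follower is about
            # to truncate away.
            return diverging_end_offset
        self.follower_fetch_offsets[follower_id] = fetch_offset
        self._maybe_advance_high_watermark()
        return diverging_end_offset

    def _maybe_advance_high_watermark(self) -> None:
        candidate = min([self.log_end_offset, *self.follower_fetch_offsets.values()])
        # The high watermark never moves backwards.
        self.high_watermark = max(self.high_watermark, candidate)


def run(skip_update_on_divergence: bool) -> Leader:
    # Leader 1 starts with end offset 100; followers 2 and 3 fetch at 101.
    leader = Leader(100, 100, {2: 0, 3: 0}, skip_update_on_divergence)
    for follower_id in (2, 3):
        leader.handle_fetch(follower_id, 101)
    leader.append()  # the leader writes a record at offset 100
    return leader


if __name__ == "__main__":
    print("without guard:", run(False).high_watermark)
    print("with guard:   ", run(True).high_watermark)
```

Without the guard, the model records both followers at offset 101, so the record the leader later writes at offset 100 is treated as committed even though no follower holds it. With the guard, the follower positions are left untouched and the high watermark stays at 100 until the followers truncate and fetch again.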



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
