[ https://issues.apache.org/jira/browse/KAFKA-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajini Sivaram resolved KAFKA-13141.
------------------------------------
    Reviewer: Jason Gustafson
    Resolution: Fixed

> Leader should not update follower fetch offset if diverging epoch is present
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-13141
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13141
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Jason Gustafson
>            Assignee: Rajini Sivaram
>            Priority: Blocker
>             Fix For: 3.0.0, 2.7.2, 2.8.1
>
>
> In 2.7, we began doing fetcher truncation piggybacked on the Fetch protocol
> instead of using the old OffsetsForLeaderEpoch API. When truncation is
> detected, we return a `divergingEpoch` field in the Fetch response, but we do
> not set an error code. The sender is expected to check if the diverging epoch
> is present and truncate accordingly.
> All of this works correctly in the fetcher implementation, but the problem is
> that the logic to update the follower fetch position on the leader does not
> take into account the diverging epoch present in the response. This means the
> fetch offsets can be updated incorrectly, which can lead to either log
> divergence or the loss of committed data.
> For example, we hit the following case with 3 replicas. Leader 1 is elected
> in epoch 1 with an end offset of 100. The followers are at offset 101
> Broker 1: (Leader) Epoch 1 from offset 100
> Broker 2: (Follower) Epoch 1 from offset 101
> Broker 3: (Follower) Epoch 1 from offset 101
> Broker 1 receives fetches from 2 and 3 at offset 101. The leader detects the
> divergence and returns a diverging epoch in the fetch state. Nevertheless,
> the fetch positions for both followers are updated to 101 and the high
> watermark is advanced.
> After brokers 2 and 3 had truncated to offset 100, broker 1 experienced a
> network partition of some kind and was kicked from the ISR.
> This caused broker 2 to get elected, which resulted in the following state
> at the start of epoch 2.
> Broker 1: (Follower) Epoch 2 from offset 101
> Broker 2: (Leader) Epoch 2 from offset 100
> Broker 3: (Follower) Epoch 2 from offset 100
> Broker 2 was then able to write a new entry at offset 100 and the old record
> which may have been exposed to consumers was deleted by broker 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
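
For readers following the description above, here is a rough sketch of the leader-side bookkeeping at issue. It is not Kafka's code: the `Leader` and `FetchResponse` classes, the `skip_update_on_divergence` flag, and the starting follower position of 90 are all hypothetical, and divergence is reduced to "the follower fetches beyond the leader's log end", which is a simplification of the real epoch-based check. The high watermark is modelled as the minimum of the leader's end offset and the recorded follower positions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FetchResponse:
    # Hypothetical stand-in for one partition's Fetch response.
    # Set only when the leader detects divergence; no error code is involved.
    diverging_end_offset: Optional[int] = None


@dataclass
class Leader:
    log_end_offset: int
    # True models the behaviour the issue title asks for; False models the bug.
    skip_update_on_divergence: bool
    # Last fetch position the leader has recorded for each follower.
    follower_offsets: Dict[int, int] = field(default_factory=dict)

    def handle_fetch(self, follower_id: int, fetch_offset: int) -> FetchResponse:
        response = FetchResponse()
        # Assumption: a fetch beyond the leader's log end counts as divergence.
        if fetch_offset > self.log_end_offset:
            response.diverging_end_offset = self.log_end_offset
        diverged = response.diverging_end_offset is not None
        # The guard: leave the recorded position alone when divergence is reported.
        if not (diverged and self.skip_update_on_divergence):
            self.follower_offsets[follower_id] = fetch_offset
        return response

    def high_watermark(self) -> int:
        # Simplified: minimum over the leader and every recorded follower position.
        return min([self.log_end_offset, *self.follower_offsets.values()])


# Scenario from the issue: leader at end offset 100, followers fetch at 101.
# The prior follower position of 90 is made up for illustration.
buggy = Leader(100, skip_update_on_divergence=False, follower_offsets={2: 90, 3: 90})
fixed = Leader(100, skip_update_on_divergence=True, follower_offsets={2: 90, 3: 90})

for leader in (buggy, fixed):
    for follower in (2, 3):
        leader.handle_fetch(follower, 101)

print(buggy.follower_offsets, buggy.high_watermark())
print(fixed.follower_offsets, fixed.high_watermark())

# After truncating to 100, the followers fetch again and the leader catches up.
for follower in (2, 3):
    fixed.handle_fetch(follower, 100)
print(fixed.follower_offsets, fixed.high_watermark())
```

In this toy model the unguarded leader records 101 for both followers and advances the high watermark on the strength of log contents it has just declared divergent, while the guarded leader holds its positions until the followers have truncated and fetched again.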