Calvin Liu created KAFKA-15221:
----------------------------------

             Summary: Potential race condition between requests from rebooted 
followers
                 Key: KAFKA-15221
                 URL: https://issues.apache.org/jira/browse/KAFKA-15221
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 3.5.0
            Reporter: Calvin Liu
            Assignee: Calvin Liu
             Fix For: 3.6.0, 3.5.1


When the leader processes a fetch request, it does not acquire a lock when
updating the replica fetch state. As a result, fetch requests sent by the same
follower before and after a reboot can race with each other.

T0: Broker 1 sends a fetch request to broker 0 (the leader). At this moment,
broker 1 is not in the ISR.

T1: Broker 1 crashes.

T2: Broker 1 comes back online and receives a new broker epoch. It then sends a
new fetch request.

T3: Broker 0 receives the old fetch request and decides to expand the ISR.

T4: Right before broker 0 starts to fill in the AlterPartition request, the new
fetch request arrives and overwrites the fetch state. Broker 0 then uses the
new broker epoch in the AlterPartition request.

In this way, the AlterPartition request can get around the broker epoch check
from KIP-903 and wrongly add broker 1 to the ISR.
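
The interleaving above can be reproduced in a small, deterministic sketch.
This is not Kafka code: FetchStateRace, FetchState, onFetch, and both
expand methods are hypothetical names, and the leader's per-follower state
is reduced to a single unlocked slot. The second method shows one possible
mitigation (read the state once and use that snapshot for both the decision
and the request); it is not necessarily the fix that was adopted.

```java
// Hypothetical, simplified model of the race; none of these names are Kafka's.
class FetchStateRace {
    /** Immutable per-follower state kept by the leader (illustrative only). */
    static final class FetchState {
        final long brokerEpoch;
        final long fetchOffset;

        FetchState(long brokerEpoch, long fetchOffset) {
            this.brokerEpoch = brokerEpoch;
            this.fetchOffset = fetchOffset;
        }
    }

    // Assumed log end offset of the leader, used as the "caught up" threshold.
    static final long LEADER_END_OFFSET = 100L;

    // Slot for broker 1, overwritten by every fetch without any lock.
    static volatile FetchState broker1State;

    static void onFetch(long brokerEpoch, long fetchOffset) {
        broker1State = new FetchState(brokerEpoch, fetchOffset);
    }

    /** Racy path: returns the broker epoch placed in the AlterPartition request. */
    static long racyExpandIsr(long oldEpoch, long newEpoch) {
        onFetch(oldEpoch, LEADER_END_OFFSET);             // T3: old fetch is processed
        boolean caughtUp =
            broker1State.fetchOffset >= LEADER_END_OFFSET; // leader decides to expand
        onFetch(newEpoch, 0L);                            // T4: new fetch overwrites state
        return caughtUp ? broker1State.brokerEpoch : -1L; // second read sees the new epoch
    }

    /** Snapshot path: decision and epoch come from the same read of the state. */
    static long snapshotExpandIsr(long oldEpoch, long newEpoch) {
        onFetch(oldEpoch, LEADER_END_OFFSET);             // T3: old fetch is processed
        FetchState snapshot = broker1State;               // single read of the slot
        boolean caughtUp = snapshot.fetchOffset >= LEADER_END_OFFSET;
        onFetch(newEpoch, 0L);                            // T4: overwrite no longer matters
        return caughtUp ? snapshot.brokerEpoch : -1L;     // epoch of the fetch that was judged
    }

    public static void main(String[] args) {
        System.out.println("racy epoch     = " + racyExpandIsr(1L, 2L));
        System.out.println("snapshot epoch = " + snapshotExpandIsr(1L, 2L));
    }
}
```

With old epoch 1 and new epoch 2, the racy path builds the request with epoch
2, which matches the broker's current epoch and so would pass an epoch
comparison. The snapshot path builds it with epoch 1, which a stale-epoch check
could reject.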



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
