[ https://issues.apache.org/jira/browse/KAFKA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Gustafson resolved KAFKA-7128. ------------------------------------ Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 1.1.2 > Lagging high watermark can lead to committed data loss after ISR expansion > -------------------------------------------------------------------------- > > Key: KAFKA-7128 > URL: https://issues.apache.org/jira/browse/KAFKA-7128 > Project: Kafka > Issue Type: Bug > Reporter: Jason Gustafson > Assignee: Anna Povzner > Priority: Major > Fix For: 1.1.2, 2.0.1, 2.1.0 > > > Some model checking exposed a weakness in the ISR expansion logic. We know > that the high watermark can go backwards after a leader failover, but we may > not have known that this can lead to the loss of committed data. > Say we have three replicas: r1, r2, and r3. Initially, the ISR consists of > (r1, r2) and the leader is r1. r3 is a new replica which has not begun > fetching. The data up to offset 10 has been committed to the ISR. Here is the > initial state: > State 1 > ISR: (r1, r2) > Leader: r1 > r1: [hw=10, leo=10] > r2: [hw=5, leo=10] > r3: [hw=0, leo=0] > Replica 1 then initiates shutdown (or fails) and leaves the ISR, which makes > r2 the new leader. The high watermark is still lagging r1. > State 2 > ISR: (r2) > Leader: r2 > r1 (offline): [hw=10, leo=10] > r2: [hw=5, leo=10] > r3: [hw=0, leo=0] > Replica 3 then catch up to the high watermark on r2 and joins the ISR. > Perhaps it's high watermark is lagging behind r2, but this is unimportant. > State 3 > ISR: (r2, r3) > Leader: r2 > r1 (offline): [hw=10, leo=10] > r2: [hw=5, leo=10] > r3: [hw=0, leo=5] > Now r2 fails and r3 is elected leader and is the only member of the ISR. The > committed data from offsets 5 to 10 has been lost. > State 4 > ISR: (r3) > Leader: r3 > r1 (offline): [hw=10, leo=10] > r2 (offline): [hw=5, leo=10] > r3: [hw=0, leo=5] > The bug is the fact that we allowed r3 into the ISR after the local high > watermark had been reached. Since the follower does not know the true high > watermark for the previous leader's epoch, it should not allow a replica to > join the ISR until it has caught up to an offset within its own epoch. > Note this is related to > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-207%3A+Offsets+returned+by+ListOffsetsResponse+should+be+monotonically+increasing+even+during+a+partition+leader+change] -- This message was sent by Atlassian JIRA (v7.6.3#76005)