[ https://issues.apache.org/jira/browse/KAFKA-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Gustafson resolved KAFKA-6880.
------------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 2.2.0)
                   2.1.0

This was fixed in KAFKA-7395.

> Zombie replicas must be fenced
> ------------------------------
>
>                 Key: KAFKA-6880
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6880
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>              Labels: needs-kip
>             Fix For: 2.1.0
>
>
> Let's say we have three replicas for a partition: 1, 2, and 3.
> In epoch 0, broker 1 is the leader and writes up to offset 50. Broker 2 
> replicates up to offset 50, but broker 3 is a little behind at offset 40. The 
> high watermark is 40. 
> Suppose that broker 2 has a ZooKeeper session expiration, but fails to 
> detect it or to reestablish a session (e.g. due to a bug like KAFKA-6879), 
> and continues fetching from broker 1.
> For whatever reason, broker 3 is elected the leader for epoch 1 beginning at 
> offset 40. Broker 1 detects the leader change and truncates its log to offset 
> 40. Some new data is appended up to offset 60, which is fully replicated to 
> broker 1. Broker 2 continues fetching from broker 1 at offset 50, but gets 
> NOT_LEADER_FOR_PARTITION errors, which are retriable, so broker 2 keeps 
> retrying.
> After some time, broker 1 becomes the leader again for epoch 2, which 
> begins at offset 60. Broker 2 has not exhausted its retries and is now able 
> to fetch at offset 50 and append the last 10 records to catch up. 
> However, because it did not observe the leader changes, it never saw the 
> need to truncate its log. Hence offsets 40-49 still reflect the uncommitted 
> changes from epoch 0. Neither KIP-101 nor KIP-279 can fix this because the 
> tail of the log is correct.
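>
> To make the divergence concrete, here is a minimal, self-contained Java 
> trace of the scenario, modeling each log entry as an (epoch, offset) pair. 
> All class and method names are illustrative, not actual Kafka code:
>
>     import java.util.*;
>
>     public class ZombieDivergence {
>         record Entry(int epoch, long offset) {}
>
>         static void append(List<Entry> log, int epoch, long from, long to) {
>             for (long o = from; o < to; o++) log.add(new Entry(epoch, o));
>         }
>
>         public static void main(String[] args) {
>             // Epoch 0: broker 1 leads; zombie broker 2 replicates to 50.
>             List<Entry> zombie = new ArrayList<>();
>             append(zombie, 0, 0, 50);
>             // Epoch 1 begins at offset 40, so the committed log carries
>             // epoch 0 up to offset 40 and epoch 1 from 40 to 60.
>             List<Entry> leader = new ArrayList<>();
>             append(leader, 0, 0, 40);
>             append(leader, 1, 40, 60);
>             // Epoch 2: broker 1 leads again from offset 60. The zombie's
>             // retried fetch at offset 50 now succeeds, copying 50..59.
>             for (Entry e : leader) if (e.offset() >= 50) zombie.add(e);
>             // The tails match, so tail-based truncation finds nothing...
>             System.out.println(zombie.get(59).equals(leader.get(59)));
>             // ...but offsets 40-49 diverge silently: epoch 0 vs. epoch 1.
>             System.out.println(zombie.get(40) + " vs " + leader.get(40));
>         }
>     }
>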
> The basic problem is that zombie replicas are not fenced properly by the 
> leader epoch. We actually observed a sequence roughly like this after a 
> broker had partially deadlocked from KAFKA-6879. We should add the leader 
> epoch to fetch requests so that the leader can fence the zombie replicas.
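>
> Concretely, the leader-side check could look like the sketch below 
> (hypothetical names and error codes): compare the epoch carried in the 
> fetch request against the leader's current epoch, and reject stale fetchers 
> instead of serving them.
>
>     enum Errors { NONE, FENCED_LEADER_EPOCH, UNKNOWN_LEADER_EPOCH }
>
>     final class Partition {
>         private final int leaderEpoch;
>         Partition(int leaderEpoch) { this.leaderEpoch = leaderEpoch; }
>
>         Errors validateFetchEpoch(int requestEpoch) {
>             if (requestEpoch < leaderEpoch)
>                 return Errors.FENCED_LEADER_EPOCH;   // zombie: reject fetch
>             if (requestEpoch > leaderEpoch)
>                 return Errors.UNKNOWN_LEADER_EPOCH;  // this leader is stale
>             return Errors.NONE;                      // epochs agree: serve
>         }
>     }
>
> In the scenario above, broker 2's fetch in epoch 2 would still carry epoch 
> 0 and be fenced rather than served, so it could never append the divergent 
> records at offsets 50-59 without first refreshing metadata and truncating.
>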
> A related problem is that we currently allow such zombie replicas to be added 
> to the ISR even if they are in an offline state. Since the controller will 
> never elect them, being part of the ISR does not give the intended 
> availability guarantee. This would also be fixed by verifying the replica's 
> leader epoch in fetch requests.
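>
> The same check could also gate ISR expansion. A minimal sketch, assuming 
> the leader tracks its current epoch and high watermark (names are 
> illustrative):
>
>     // Only followers fetching in the current leader epoch and caught up
>     // to the high watermark may (re)enter the ISR.
>     static boolean shouldExpandIsr(int leaderEpoch, long highWatermark,
>                                    int followerEpoch, long followerLogEnd) {
>         return followerEpoch == leaderEpoch
>             && followerLogEnd >= highWatermark;
>     }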



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
