[ https://issues.apache.org/jira/browse/KAFKA-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Gustafson resolved KAFKA-6880. ------------------------------------ Resolution: Fixed Fix Version/s: (was: 2.2.0) 2.1.0 This was fixed in KAFKA-7395 > Zombie replicas must be fenced > ------------------------------ > > Key: KAFKA-6880 > URL: https://issues.apache.org/jira/browse/KAFKA-6880 > Project: Kafka > Issue Type: Improvement > Reporter: Jason Gustafson > Assignee: Jason Gustafson > Priority: Major > Labels: needs-kip > Fix For: 2.1.0 > > > Let's say we have three replicas for a partition: 1, 2 ,and 3. > In epoch 0, broker 1 is the leader and writes up to offset 50. Broker 2 > replicates up to offset 50, but broker 3 is a little behind at offset 40. The > high watermark is 40. > Suppose that broker 2 has a zk session expiration event, but fails to detect > it or fails to reestablish a session (e.g. due to a bug like KAFKA-6879), and > it continues fetching from broker 1. > For whatever reason, broker 3 is elected the leader for epoch 1 beginning at > offset 40. Broker 1 detects the leader change and truncates its log to offset > 40. Some new data is appended up to offset 60, which is fully replicated to > broker 1. Broker 2 continues fetching from broker 1 at offset 50, but gets > NOT_LEADER_FOR_PARTITION errors, which is retriable and hence broker 2 will > retry. > After some time, broker 1 becomes the leader again for epoch 2. Broker 1 > begins at offset 60. Broker 2 has not exhausted retries and is now able to > fetch at offset 50 and append the last 10 records in order to catch up. > However, because it did not observed the leader changes, it never saw the > need to truncate its log. Hence offsets 40-49 still reflect the uncommitted > changes from epoch 0. Neither KIP-101 nor KIP-279 can fix this because the > tail of the log is correct. > The basic problem is that zombie replicas are not fenced properly by the > leader epoch. We actually observed a sequence roughly like this after a > broker had partially deadlocked from KAFKA-6879. We should add the leader > epoch to fetch requests so that the leader can fence the zombie replicas. > A related problem is that we currently allow such zombie replicas to be added > to the ISR even if they are in an offline state. The problem is that the > controller will never elect them, so being part of the ISR does not give the > availability guarantee that is intended. This would also be fixed by > verifying replica leader epoch in fetch requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)