[ https://issues.apache.org/jira/browse/KAFKA-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dong Lin updated KAFKA-4485:
----------------------------
    Description:
As of the current implementation, we will exclude a follower from the ISR if the begin offset of the FetchRequest from this follower is always smaller than the logEndOffset of the leader for more than replicaLagTimeMaxMs. Also, we will add a follower to the ISR if the begin offset of the FetchRequest from this follower is equal to or larger than the high watermark of this partition.

This is problematic for the following reasons:

1) The criteria for ISR membership are inconsistent between maybeExpandIsr() and maybeShrinkIsr(). A follower may be repeatedly removed from and added to the ISR (e.g. in the scenario described below).

2) A follower may be removed from the ISR even if its fetch rate can keep up with the produce rate. Suppose a producer keeps producing a lot of small requests at a high request rate but low byte rate (e.g. many mirror makers), and the follower is always able to read all the available data at the time the leader receives its fetch request. However, the begin offset of the fetch request will always be smaller than the logEndOffset of the leader. Thus the follower will be removed from the ISR after replicaLagTimeMaxMs.

The solution to the problem is the following:

A follower should be in the ISR if the begin offset of its FetchRequest >= max(high watermark of partition, log end offset of leader at the time the leader receives the previous FetchRequest). The follower should be removed from the ISR if this criterion is not met for more than replicaLagTimeMaxMs. Note that we are comparing the begin offset of the FetchRequest with the log end offset of the leader at the time the leader receives the previous FetchRequest as an approximate way to compare the end offset of the fetched data with the log end offset of the leader. This is because we cannot easily know the end offset of the fetched data at the time the broker receives the fetch request.

This solution makes the following guarantees:

1) If a follower is in the ISR, then its log end offset >= high watermark of partition at least sometime in the last replicaLagTimeMaxMs.
2) If a follower is not in the ISR, then the end offset of its FetchRequest has been unable to catch up with the log end offset of the leader for more than replicaLagTimeMaxMs. Either the follower is in the bootstrap phase, or the follower's average fetch rate is smaller than the average produce rate into the partition for the last replicaLagTimeMaxMs.

  was:
As of the current implementation, we will exclude a follower from the ISR if the begin offset of the FetchRequest from this follower is always smaller than the logEndOffset of the leader for more than replicaLagTimeMaxMs. Also, we will add a follower to the ISR if the begin offset of the FetchRequest from this follower is equal to or larger than the high watermark of this partition.

This is problematic for the following reasons:

1) The criteria for ISR membership are inconsistent between maybeExpandIsr() and maybeShrinkIsr(). A follower may be repeatedly removed from and added to the ISR (e.g. in the scenario described below).

2) A follower may be removed from the ISR even if its fetch rate can keep up with the produce rate. Suppose a producer keeps producing a lot of small requests at a high request rate but low byte rate, and the fetch request is always able to read all the available data at the time the leader receives it. However, the begin offset of the fetch request will always be smaller than the logEndOffset of the leader. Thus the follower will be removed from the ISR.

The solution to the problem is the following:

A follower should be in the ISR if the begin offset of its FetchRequest >= max(high watermark of partition, log end offset of leader at the time the leader receives the previous FetchRequest). The follower should be removed from the ISR if this criterion is not met for more than replicaLagTimeMaxMs.

This solution makes the following guarantees:

1) If a follower is in the ISR, then its log end offset >= high watermark of partition at least sometime in the last replicaLagTimeMaxMs.

2) If a follower is not in the ISR, then the end offset of its FetchRequest has been unable to catch up with the log end offset of the leader for more than replicaLagTimeMaxMs.
Either the follower is in the bootstrap phase, or the follower's average fetch rate < produce rate into the partition for more than replicaLagTimeMaxMs.


> Follower should be in the isr if its FetchRequest has fetched up to the logEndOffset of leader
> ----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4485
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4485
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> As of the current implementation, we will exclude a follower from the ISR if
> the begin offset of the FetchRequest from this follower is always smaller
> than the logEndOffset of the leader for more than replicaLagTimeMaxMs.
> Also, we will add a follower to the ISR if the begin offset of the
> FetchRequest from this follower is equal to or larger than the high
> watermark of this partition.
> This is problematic for the following reasons:
> 1) The criteria for ISR membership are inconsistent between maybeExpandIsr()
> and maybeShrinkIsr(). A follower may be repeatedly removed from and added to
> the ISR (e.g. in the scenario described below).
> 2) A follower may be removed from the ISR even if its fetch rate can keep up
> with the produce rate. Suppose a producer keeps producing a lot of small
> requests at a high request rate but low byte rate (e.g. many mirror makers),
> and the follower is always able to read all the available data at the time
> the leader receives its fetch request. However, the begin offset of the
> fetch request will always be smaller than the logEndOffset of the leader.
> Thus the follower will be removed from the ISR after replicaLagTimeMaxMs.
> The solution to the problem is the following:
> A follower should be in the ISR if the begin offset of its FetchRequest >=
> max(high watermark of partition, log end offset of leader at the time the
> leader receives the previous FetchRequest). The follower should be removed
> from the ISR if this criterion is not met for more than replicaLagTimeMaxMs.
> Note that we are comparing the begin offset of the FetchRequest with the log
> end offset of the leader at the time the leader receives the previous
> FetchRequest as an approximate way to compare the end offset of the fetched
> data with the log end offset of the leader. This is because we cannot easily
> know the end offset of the fetched data at the time the broker receives the
> fetch request.
> This solution makes the following guarantees:
> 1) If a follower is in the ISR, then its log end offset >= high watermark of
> partition at least sometime in the last replicaLagTimeMaxMs.
> 2) If a follower is not in the ISR, then the end offset of its FetchRequest
> has been unable to catch up with the log end offset of the leader for more
> than replicaLagTimeMaxMs. Either the follower is in the bootstrap phase, or
> the follower's average fetch rate is smaller than the average produce rate
> into the partition for the last replicaLagTimeMaxMs.
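
For illustration only, the proposed criterion could be sketched in Python roughly as follows. This is not the Kafka implementation: FollowerState, on_fetch_request and should_be_in_isr are hypothetical names, and treating the "previous" leader log end offset as 0 before the first FetchRequest is an assumption made to keep the sketch small.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FollowerState:
    """Leader-side bookkeeping for one follower (hypothetical names)."""
    # Leader log end offset recorded when the previous FetchRequest arrived.
    # Assumption: 0 before the first request, so the first check reduces to
    # "begin offset >= high watermark".
    leo_at_previous_fetch: int = 0
    # Last time (ms) a FetchRequest from this follower met the criterion.
    last_caught_up_ms: Optional[int] = None


def on_fetch_request(state: FollowerState, fetch_offset: int,
                     high_watermark: int, leader_leo: int,
                     now_ms: int) -> bool:
    """Apply the proposed check when the leader receives a FetchRequest.

    Returns True if the begin offset of this request is
    >= max(high watermark, leader log end offset at the previous request).
    """
    caught_up = fetch_offset >= max(high_watermark,
                                    state.leo_at_previous_fetch)
    if caught_up:
        state.last_caught_up_ms = now_ms
    # Remember the current leader log end offset for the next comparison.
    state.leo_at_previous_fetch = leader_leo
    return caught_up


def should_be_in_isr(state: FollowerState, now_ms: int,
                     replica_lag_time_max_ms: int) -> bool:
    """The follower stays in the ISR unless the criterion has gone unmet
    for more than replica_lag_time_max_ms."""
    if state.last_caught_up_ms is None:
        return False
    return now_ms - state.last_caught_up_ms <= replica_lag_time_max_ms
```

In the small-request scenario from the description, a follower that fetches from offset 103 while the leader has already advanced to 106 still meets the criterion as long as the leader was at 103 when the previous request arrived, so the same check can serve both the expand and the shrink decision.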