Jun Rao created KAFKA-7837:
------------------------------
Summary: maybeShrinkIsr may not reflect OfflinePartitions immediately
Key: KAFKA-7837
URL: https://issues.apache.org/jira/browse/KAFKA-7837
Project: Kafka
Issue Type: Improvement
Reporter: Jun Rao
When a partition is marked offline because of a failed disk, the leader is
supposed to stop shrinking its ISR. In ReplicaManager.maybeShrinkIsr(), we
iterate through all non-offline partitions and shrink the ISR of each where
needed. If an ISR needs to shrink, we have to write the new ISR to ZK, which
can take some time. During this window, additional partitions could be marked
offline, but the iterator will not pick up the change, since it only reflects
the state at the time it was created. This can cause all in-sync followers to
be dropped from the ISR unnecessarily, which prevents a clean leader election.
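
The stale-iterator window described above can be illustrated with a small,
self-contained Java sketch. It is not Kafka's code: the class IsrShrinkSketch,
its fields, and the slowZkWrite callback are hypothetical stand-ins, and the
re-check shown is only one possible mitigation, not necessarily the fix that
should be adopted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the broker's partition bookkeeping (not Kafka's classes).
class IsrShrinkSketch {
    // Partitions currently marked offline, e.g. after a disk failure.
    final Set<String> offline = ConcurrentHashMap.newKeySet();
    final List<String> allPartitions;
    // Records which partitions had their ISR shrunk, in order.
    final List<String> shrunk = new ArrayList<>();

    IsrShrinkSketch(List<String> allPartitions) {
        this.allPartitions = new ArrayList<>(allPartitions);
    }

    // Problematic pattern: offline status is evaluated once, before the slow loop.
    void shrinkFromSnapshot(Runnable slowZkWrite) {
        List<String> snapshot = new ArrayList<>();
        for (String p : allPartitions) {
            if (!offline.contains(p)) {
                snapshot.add(p);
            }
        }
        for (String p : snapshot) {
            slowZkWrite.run(); // partitions may be marked offline during this call
            shrunk.add(p);     // p is shrunk even if it went offline meanwhile
        }
    }

    // Assumed mitigation: re-check offline status right before each shrink.
    void shrinkWithRecheck(Runnable slowZkWrite) {
        for (String p : allPartitions) {
            if (offline.contains(p)) {
                continue; // skip partitions that went offline since the loop began
            }
            slowZkWrite.run();
            shrunk.add(p);
        }
    }
}
```

If the callback marks "p1" offline while "p0" is being written, the snapshot
variant still shrinks "p1", while the re-check variant skips it. The re-check
only narrows the window: a partition can still go offline between the check
and its own write, so closing the window fully would need additional
synchronization.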