Manikumar created KAFKA-9594:
--------------------------------
Summary: speed up the processing of LeaderAndIsrRequest
Key: KAFKA-9594
URL: https://issues.apache.org/jira/browse/KAFKA-9594
Project: Kafka
Issue Type: Improvement
Reporter: Jun Rao
Assignee: Manikumar
Fix For: 2.6.0
Observations from [~junrao]
Currently, Partition.makeFollower() holds a write lock on leaderIsrUpdateLock.
Partition.doAppendRecordsToFollowerOrFutureReplica() holds a read lock on
leaderIsrUpdateLock. So, if there is an ongoing log append on the follower, the
makeFollower() call will be delayed. This path is a bit different when serving
the Partition.makeLeader() call. Before we make a call on
Partition.makeLeader(), we first remove the follower from the
replicaFetcherThread. So, the makeLeader() call won't be delayed because of
log append. This means that when we change one follower to become leader and
another follower to follow the new leader during a controlled shutdown, the
makeLeader() call typically completes faster than the makeFollower() call,
which can delay the follower fetching from the new leader and cause the ISR to
shrink.
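
The contention described above can be reproduced in isolation. The following is
a minimal sketch, not Kafka code: the class LockContentionSketch and its method
are hypothetical, and the static field only borrows the name leaderIsrUpdateLock
to stand in for the lock in Partition. It shows that a write-lock attempt (the
makeFollower() side) cannot succeed while another thread holds the read lock
(the log append side).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone sketch; not part of Kafka.
class LockContentionSketch {
    // Stand-in for Partition.leaderIsrUpdateLock.
    static final ReentrantReadWriteLock leaderIsrUpdateLock = new ReentrantReadWriteLock();

    // Returns true if the write lock could NOT be taken while a simulated
    // follower append was holding the read lock.
    static boolean writeLockBlockedDuringAppend() {
        CountDownLatch appendStarted = new CountDownLatch(1);
        CountDownLatch finishAppend = new CountDownLatch(1);

        // Simulates doAppendRecordsToFollowerOrFutureReplica(): holds the
        // read lock for the duration of the append.
        Thread appender = new Thread(() -> {
            leaderIsrUpdateLock.readLock().lock();
            try {
                appendStarted.countDown();
                finishAppend.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                leaderIsrUpdateLock.readLock().unlock();
            }
        });
        appender.start();

        try {
            appendStarted.await();
            // Simulates makeFollower(): needs the write lock. tryLock() fails
            // immediately because a reader on another thread holds the lock.
            boolean acquired = leaderIsrUpdateLock.writeLock().tryLock();
            if (acquired) {
                leaderIsrUpdateLock.writeLock().unlock();
            }
            finishAppend.countDown();
            appender.join();
            return !acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In the real broker makeFollower() would block on lock() rather than fail fast;
tryLock() is used here only to make the delay observable without timing.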
The only reason that Partition.doAppendRecordsToFollowerOrFutureReplica()
needs to hold a read lock on leaderIsrUpdateLock is for
Partition.maybeReplaceCurrentWithFutureReplica() to pause the log append while
checking if the log dir could be replaced. We could potentially add a separate
lock (something like futureLogLock) that's synced between
maybeReplaceCurrentWithFutureReplica() and
doAppendRecordsToFollowerOrFutureReplica(). Then,
doAppendRecordsToFollowerOrFutureReplica() doesn't need to hold the lock on
leaderIsrUpdateLock.
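
One way the proposal could look is sketched below. This is an illustration under
assumptions, not the actual patch: the class FutureLogLockSketch, the method
tryMakeFollower(), the Runnable parameter, and the use of ReentrantLock and
tryLock() are all hypothetical; only the lock names futureLogLock and
leaderIsrUpdateLock and the two Partition method names come from the
description above. The append path takes only futureLogLock, so a concurrent
write-lock attempt on leaderIsrUpdateLock is no longer delayed, while the
replace check is still excluded during an append.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone sketch; not part of Kafka.
class FutureLogLockSketch {
    final ReentrantReadWriteLock leaderIsrUpdateLock = new ReentrantReadWriteLock();
    final ReentrantLock futureLogLock = new ReentrantLock(); // proposed separate lock

    // Append path: holds only futureLogLock, not leaderIsrUpdateLock.
    void doAppendRecordsToFollowerOrFutureReplica(Runnable append) {
        futureLogLock.lock();
        try {
            append.run();
        } finally {
            futureLogLock.unlock();
        }
    }

    // Replace check: excluded while an append holds futureLogLock.
    // Returns false here instead of blocking, to keep the demo deterministic.
    boolean maybeReplaceCurrentWithFutureReplica() {
        if (!futureLogLock.tryLock()) return false;
        try {
            return true; // the log dir check would happen here
        } finally {
            futureLogLock.unlock();
        }
    }

    // Stand-in for makeFollower(): only needs the write lock on leaderIsrUpdateLock.
    boolean tryMakeFollower() {
        if (!leaderIsrUpdateLock.writeLock().tryLock()) return false;
        try {
            return true; // leader/ISR state would be updated here
        } finally {
            leaderIsrUpdateLock.writeLock().unlock();
        }
    }

    // Runs both calls while an append is in progress.
    // Returns {makeFollower succeeded, replace check succeeded}.
    static boolean[] demo() {
        FutureLogLockSketch p = new FutureLogLockSketch();
        CountDownLatch appendStarted = new CountDownLatch(1);
        CountDownLatch finishAppend = new CountDownLatch(1);
        Thread appender = new Thread(() -> p.doAppendRecordsToFollowerOrFutureReplica(() -> {
            appendStarted.countDown();
            try {
                finishAppend.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        appender.start();
        try {
            appendStarted.await();
            boolean followerOk = p.tryMakeFollower();
            boolean replaceOk = p.maybeReplaceCurrentWithFutureReplica();
            finishAppend.countDown();
            appender.join();
            return new boolean[] {followerOk, replaceOk};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

With an append in flight, demo() reports that the makeFollower() stand-in
succeeds and the replace check is held off, which is the behaviour the proposal
is after. A real implementation would presumably block in
maybeReplaceCurrentWithFutureReplica() instead of returning false.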