junrao commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-681109748
@chia7712 : Thanks for the reply. I like your overall idea and I think it can be used to solve the problem completely in a simpler way. 1. Instead of at `Partition`, we collect all pending delayed check operations in a queue in ReplicaManager. All callers to ReplicaManager.appendRecords() are expected to take up to 1 item from that queue and check the completeness for all affected partitions, without holding any conflicting locks. 2. Most callers to ReplicaManager.appendRecords() are from KafkaApis. We can just add the logic to check the ReplicaManager queue at the end of KafkaApis.handle(), at which point, no conflicting locks will be held. 3. Another potentially caller is the expiration thread in a purgatory. SystemTimer always runs the expiration logic in a separate thread and DelayedOperation.onExpiration() is always called without holding any conflicting lock. So, for those delayed operations using ReplicaManager.appendRecords(), we can pass down a flag to DelayedOperation so that at the end of onExpiration, we check the ReplicaManager queue if the flag is set. 4. We keep `DelayedJoin` as it is, still passing in the group lock to DelayedOperation to avoid deadlocks due to two levels of locking. 5. We can still get rid of the `tryLock` logic in DelayedOperation for simplification since there is no opportunity for deadlock. What do you think? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org