junrao commented on pull request #6915: URL: https://github.com/apache/kafka/pull/6915#issuecomment-626423986
@chia7712: Yes, using a separate thread pool to check completeness in Purgatory is another option. It adds its own complexity, though: (1) we have to size and monitor the pool properly; (2) we have to decide whether to block the call (to create back pressure) if the pool is full; (3) we probably have to factor its usage into quotas. So we should probably only do this if it's truly needed.

To me, the only problem we have now is the replicaManager.appendRecords() call from GroupMetadataManager.appendForGroup(). This is the only place where the caller can hold a lock different from the one used in the callback. Solving this may not be that hard: we can probably just add a flag to replicaManager.appendRecords() that disables the call to Partition.tryCompleteDelayedRequests(). GroupMetadataManager can then make a separate tryCompleteDelayedRequests() call without holding the group lock.
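The flag-based idea could look roughly like the sketch below. It is a simplified, self-contained Java model, not Kafka's actual code: the class `DeferredCompletionSketch`, the boolean parameter `completeDelayedRequests`, the static `groupLock`, and the `events` list are all hypothetical names introduced for illustration, and where the group lock is actually acquired in Kafka is an assumption here. The methods only mimic the roles of replicaManager.appendRecords(), Partition.tryCompleteDelayedRequests(), and GroupMetadataManager.appendForGroup().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class DeferredCompletionSketch {
    // Hypothetical stand-in for the group lock held by the caller.
    static final ReentrantLock groupLock = new ReentrantLock();
    // Records what happened, and whether the group lock was held at that point.
    static final List<String> events = new ArrayList<>();

    // Stand-in for Partition.tryCompleteDelayedRequests(): in the real system this
    // may run callbacks that take other locks, so it should not run under groupLock.
    static void tryCompleteDelayedRequests() {
        events.add("complete(groupLockHeld=" + groupLock.isHeldByCurrentThread() + ")");
    }

    // Stand-in for replicaManager.appendRecords() with the proposed (hypothetical) flag.
    // When the flag is false, the append skips the completion check entirely.
    static void appendRecords(String record, boolean completeDelayedRequests) {
        events.add("append(" + record + ")");
        if (completeDelayedRequests) {
            tryCompleteDelayedRequests();
        }
    }

    // Stand-in for GroupMetadataManager.appendForGroup(): append under the group lock
    // with completion disabled, then run the completion check after releasing the lock.
    static void appendForGroup(String record) {
        groupLock.lock();
        try {
            appendRecords(record, false);
        } finally {
            groupLock.unlock();
        }
        tryCompleteDelayedRequests();
    }

    public static void main(String[] args) {
        appendForGroup("offset-commit");
        System.out.println(events);
        // prints [append(offset-commit), complete(groupLockHeld=false)]
    }
}
```

The point of the sketch is only the ordering: the completion check moves out of the locked region, so any callback it triggers never runs while the group lock is held. Other callers would keep passing `true` and retain the existing behavior.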