[
https://issues.apache.org/jira/browse/KAFKA-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sriharsha Chintalapani resolved KAFKA-6431.
-------------------------------------------
Resolution: Fixed
> Lock contention in Purgatory
> ----------------------------
>
> Key: KAFKA-6431
> URL: https://issues.apache.org/jira/browse/KAFKA-6431
> Project: Kafka
> Issue Type: Improvement
> Components: core, purgatory
> Reporter: Ying Zheng
> Assignee: Ying Zheng
> Priority: Minor
> Fix For: 2.2.0
>
>
> Purgatory is the data structure in the Kafka broker that manages delayed
> operations. A ConcurrentHashMap (Kafka Pool) maps each operation key to the
> list of operations (a ConcurrentLinkedQueue) that are interested in that key.
> When an operation is completed or expired, it is removed from the list
> (ConcurrentLinkedQueue); when the list becomes empty, the list itself is
> removed from the ConcurrentHashMap. This second step has to be protected by a
> lock, to avoid adding new operations into a list that is being removed. This
> is currently done with a globally shared ReentrantReadWriteLock: all read
> operations on the purgatory have to acquire the read lock, and the
> list-removal operation needs the write lock.
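>
> For illustration, here is a minimal sketch of this single-lock layout, using hypothetical class and method names rather than the actual Kafka code:
> {code:java}
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ConcurrentLinkedQueue;
> import java.util.concurrent.locks.ReentrantReadWriteLock;
>
> // Simplified model of the structure described above: one map from key to
> // watcher list, guarded by a single, globally shared read-write lock.
> class SingleLockPurgatory<K, Op> {
>     private final ConcurrentHashMap<K, ConcurrentLinkedQueue<Op>> watchers =
>             new ConcurrentHashMap<>();
>     private final ReentrantReadWriteLock removeLock = new ReentrantReadWriteLock();
>
>     // Watching a key only needs the shared (read) lock.
>     void watch(K key, Op op) {
>         removeLock.readLock().lock();
>         try {
>             watchers.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(op);
>         } finally {
>             removeLock.readLock().unlock();
>         }
>     }
>
>     // Removing an emptied list needs the exclusive (write) lock, which
>     // blocks every other reader and writer in the whole purgatory.
>     void removeIfEmpty(K key) {
>         removeLock.writeLock().lock();
>         try {
>             ConcurrentLinkedQueue<Op> queue = watchers.get(key);
>             if (queue != null && queue.isEmpty())
>                 watchers.remove(key);
>         } finally {
>             removeLock.writeLock().unlock();
>         }
>     }
> }
> {code}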
> Our profiling results show that the Kafka broker spends a nontrivial amount
> of time on this read-write lock.
> The problem is exacerbated when there is a large number of short-lived
> operations. For example, when we are doing sync produce operations (acks=all),
> a DelayedProduce operation is added and then removed for each message. If the
> QPS of the topic is not high, it is very likely that, by the time the operation
> is completed and removed, the list for that key (a topic partition) has also
> become empty and has to be removed while holding the write lock. This blocks
> all read / write operations on the entire purgatory for a while. As there are
> tens of I/O threads accessing the purgatory concurrently, this shared lock can
> easily become a bottleneck.
> In fact, we only need to prevent concurrent reads / writes on the same key;
> operations on different keys do not conflict with each other.
> I suggest sharding the purgatory into smaller partitions and locking each
> partition independently.
> Assuming there are 10 I/O threads actively accessing the purgatory, sharding
> it into 512 partitions makes the probability of two or more threads hitting
> the same partition at the same time about 2% (for any given thread, the chance
> that at least one of the other nine threads is touching the same partition is
> roughly 1 - (511/512)^9, i.e. about 1.7%). We can also use ReentrantLock
> instead of ReentrantReadWriteLock: when reads do not greatly outnumber writes,
> ReentrantLock has lower overhead than ReentrantReadWriteLock.
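>
> As an illustration of the proposal, here is a hypothetical sketch (not the actual patch) of a sharded purgatory where each partition is guarded by its own ReentrantLock:
> {code:java}
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ConcurrentLinkedQueue;
> import java.util.concurrent.locks.ReentrantLock;
>
> // The watcher map is split into independent shards, each with its own lock,
> // so removing an empty list only blocks operations that hash to the same shard.
> class ShardedPurgatory<K, Op> {
>     private static final int NUM_SHARDS = 512;
>
>     private final ConcurrentHashMap<K, ConcurrentLinkedQueue<Op>>[] shards;
>     private final ReentrantLock[] locks;
>
>     @SuppressWarnings("unchecked")
>     ShardedPurgatory() {
>         shards = new ConcurrentHashMap[NUM_SHARDS];
>         locks = new ReentrantLock[NUM_SHARDS];
>         for (int i = 0; i < NUM_SHARDS; i++) {
>             shards[i] = new ConcurrentHashMap<>();
>             locks[i] = new ReentrantLock();
>         }
>     }
>
>     // Map an operation key to one of the shards.
>     private int shardOf(K key) {
>         return (key.hashCode() & 0x7fffffff) % NUM_SHARDS;
>     }
>
>     void watch(K key, Op op) {
>         int s = shardOf(key);
>         locks[s].lock();
>         try {
>             shards[s].computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(op);
>         } finally {
>             locks[s].unlock();
>         }
>     }
>
>     void removeIfEmpty(K key) {
>         int s = shardOf(key);
>         locks[s].lock();
>         try {
>             ConcurrentLinkedQueue<Op> queue = shards[s].get(key);
>             if (queue != null && queue.isEmpty())
>                 shards[s].remove(key);
>         } finally {
>             locks[s].unlock();
>         }
>     }
> }
> {code}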
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)