[
https://issues.apache.org/jira/browse/AMQ-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333650#comment-14333650
]
Kevin Burton commented on AMQ-5609:
-----------------------------------
I've implemented a fix for this and will release it today.
My fix was per discussion on the list.
The design takes the inactiveDestinationsPurgeLock in RegionBroker and breaks
it up into finer-grained locks keyed on the ActiveMQDestination.
This way two destinations with different physical names usually don't acquire
the same lock. This makes purge WAY more concurrent.
I've implemented a ChunkedGranularReentrantReadWriteLock which internally uses
1024 ReentrantReadWriteLocks so there's a 1/1024 chance that a given queue will
use the same lock as another queue.
This isn't perfect, but in practice it should be MORE than enough concurrency
to cover the 80% case.
A more correct model would be to acquire a lock per queue name but this opens
up some complex concurrency issues regarding cleaning up
ReentrantReadWriteLocks for queues as they disappear. For now we're using a
simple interface so if someone wants to improve upon this in the future they
can just change the implementation.
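For illustration, here is a minimal sketch of the lock-striping idea described
above. It is not the code from the patch: the class name StripedReadWriteLock
and the method lockFor are hypothetical, and it assumes the key (for example
an ActiveMQDestination) has a stable hashCode.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Hypothetical sketch of a striped read/write lock. Keys are hashed onto a
 * fixed array of ReentrantReadWriteLocks, so no per-key lock ever has to be
 * created or cleaned up.
 */
final class StripedReadWriteLock {

    private final ReentrantReadWriteLock[] stripes;

    StripedReadWriteLock(int stripeCount) {
        if (stripeCount <= 0) {
            throw new IllegalArgumentException("stripeCount must be positive");
        }
        stripes = new ReentrantReadWriteLock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantReadWriteLock();
        }
    }

    /** Maps a key to a stripe index; floorMod keeps negative hashes in range. */
    int indexFor(Object key) {
        return Math.floorMod(key.hashCode(), stripes.length);
    }

    /** Returns the lock shared by every key that hashes to the same stripe. */
    ReentrantReadWriteLock lockFor(Object key) {
        return stripes[indexFor(key)];
    }
}
```

Under these assumptions, the purge thread would take
lockFor(destination).writeLock() around the GC of a single destination, while
addConsumer and addProducer would take lockFor(destination).readLock(). Two
destinations only block each other when they land on the same stripe.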
> ActiveMQ locks out new consumers/producers during queue GC
> ----------------------------------------------------------
>
> Key: AMQ-5609
> URL: https://issues.apache.org/jira/browse/AMQ-5609
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.10.1, 5.11.0
> Reporter: Kevin Burton
>
> ActiveMQ supports a feature where it can GC a queue that is inactive, i.e. no
> messages and no consumers.
> However, there’s a bug where purgeInactiveDestinations in
> org.apache.activemq.broker.region.RegionBroker takes a read/write lock
> (inactiveDestinationsPurgeLock) which is held during the entire queue GC.
> Each individual queue GC takes about 100ms with a disk-backed queue and 10ms
> with a memory-backed (non-persistent) queue. If you have thousands of them to
> GC at once, the inactiveDestinationsPurgeLock is held the entire time, which
> can last from 60 seconds to 5 minutes (and is essentially unbounded).
> A read lock is also taken on it in addConsumer and addProducer, so when a
> new consumer or producer tries to connect, it is blocked until queue GC
> completes.
> Existing producers/consumers work JUST fine.
> The lock MUST be held on each queue because if it isn’t, there’s a race where
> a queue is flagged to be GCd, then a producer comes in and writes a new
> message, then the background thread deletes the queue it marked as GCable
> even though it now holds the newly produced message. This would result in
> data loss.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)