[
https://issues.apache.org/activemq/browse/AMQ-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dejan Bosanac resolved AMQ-3028.
--------------------------------
Resolution: Fixed
Fix Version/s: 5.5.0
Assignee: Dejan Bosanac
Fixed with svn revision 1038566
I didn't make the LRU cache synchronized in general; I just synchronized the usage of pageCache.
Let us know if it helps with your scenario.
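The fix described above synchronizes at the call site rather than inside the cache itself. A minimal sketch of that pattern, assuming an LRUCache built on an access-ordered LinkedHashMap (the class and method names here are illustrative, not the actual PageFile code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: leave the LRU cache unsynchronized, but guard every use of
// the pageCache with a lock on the cache object. Names (PageFileSketch,
// getPage, putPage) are hypothetical, not the real PageFile API.
public class PageFileSketch {

    // Minimal LRU cache in the style of the KahaDB LRUCache: an
    // access-ordered LinkedHashMap that evicts its eldest entry.
    static class LRUCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        LRUCache(int maxEntries) {
            super(16, 0.75f, true); // accessOrder = true
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    private final LRUCache<Long, String> pageCache = new LRUCache<>(10000);

    // All callers go through these methods, so concurrent readers and
    // writers can no longer corrupt the LinkedHashMap's internal links
    // (note that even get() is a structural change in access order).
    String getPage(long pageId) {
        synchronized (pageCache) {
            return pageCache.get(pageId);
        }
    }

    void putPage(long pageId, String page) {
        synchronized (pageCache) {
            pageCache.put(pageId, page);
        }
    }
}
```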
> ActiveMQ broker processing slows with consumption from large store
> ------------------------------------------------------------------
>
> Key: AMQ-3028
> URL: https://issues.apache.org/activemq/browse/AMQ-3028
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.4.1
> Environment: CentOS 5.5, Sun JDK 1.6.0_21-b06 64 bit, ActiveMQ 5.4.1,
> AMD Athlon(tm) II X2 B22, local disk
> Reporter: Arthur Naseef
> Assignee: Dejan Bosanac
> Priority: Critical
> Fix For: 5.5.0
>
> Attachments: LRUCache.patch
>
>
> In scalability tests, this problem occurred. I have tested a workaround that
> appears to work. I would gladly submit a fix, but would like some guidance
> on the most appropriate solution.
> Here's the summary. Many more details are available upon request.
> Root cause:
> - Believed to be simultaneous access to LRUCache objects which are not
> thread-safe (PageFile's pageCache)
> Workaround:
> - Synchronize the LRUCache on all access methods (get, put, remove)
> The symptoms are as follows:
> 1. Message rates run fairly constant until a point in time when they
> degrade rather quickly
> 2. After a while (about 15 minutes), the message rates drop to the floor -
> with large numbers of seconds with 0 records passing
> 3. Using VisualVM or JConsole, note that memory use grows continuously
> 4. When message rates drop to the floor, the VM is spending the vast
> majority of its time performing garbage collection
> 5. Heap dumps show that LRUCache objects (the pageCache members of
> PageFile objects) far exceed their configured limits.
> The default limit of 10,000 was used, yet a size of over 170,000
> entries was reached.
> 6. No producer flow control occurred (did not see the flow control log
> message)
> Test scenario used to reproduce:
> - Fast producers (limited to <= 1000 msgs/sec)
> -- using transactions
> -- 10 msg per transaction
> -- message content size 177 bytes
> - Slow consumers (limited to <= 10 msg/sec)
> -- auto-acknowledge mode; not transacted
> - 10 Queues
> -- 1 producer per queue
> -- 1 consumer per queue
> - Producers, Consumers, and Broker all running on different systems, and
> all on the same system (in different test runs).
> Note that disk space was not an issue - there was always plenty of disk space
> available.
> One other interesting note - once a large database of records was stored in
> KahaDB, only running consumers, this problem still occurred.
> This issue sounds like it may be related to AMQ-1764 and AMQ-2721. The root
> cause sounds the same as AMQ-2290 - unsynchronized access to LRUCache.
> The most straightforward solution is to modify all LRUCache objects
> (org.apache.kahadb.util.LRUCache, org.apache.activemq.util.LRUCache, ...) to
> be concurrent. Another is to create concurrent versions (perhaps
> ConcurrentLRUCache) and make use of those at least in PageFile.pageCache.
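The reporter's second proposal, a concurrent LRU cache type, can be sketched by wrapping an access-ordered LinkedHashMap behind a synchronized view, so that get/put/remove are individually thread-safe. The name ConcurrentLRUCacheSketch and this API are hypothetical; this is not the ActiveMQ implementation:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a "ConcurrentLRUCache": an access-ordered
// LinkedHashMap with size-based eviction, wrapped by
// Collections.synchronizedMap so every operation takes the same lock.
public class ConcurrentLRUCacheSketch {

    public static <K, V> Map<K, V> create(final int maxEntries) {
        return Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) { // accessOrder = true
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    // Evict the least-recently-used entry past the limit.
                    return size() > maxEntries;
                }
            });
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = create(2);
        cache.put(1, "a");
        cache.put(2, "b");
        cache.get(1);      // touch 1 so it becomes most recently used
        cache.put(3, "c"); // evicts 2, the least-recently-used entry
        assert cache.containsKey(1) && cache.containsKey(3)
            && !cache.containsKey(2);
    }
}
```

Note that iteration over a synchronizedMap view still requires an explicit synchronized block, so this only covers the single-call accesses (get, put, remove) mentioned in the workaround.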
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.