[
https://issues.apache.org/jira/browse/AMQ-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Protasio updated AMQ-7091:
-------------------------------
Attachment: InactiveDurableSubscriberTest.java
> O(n) Memory consumption when broker has inactive durable subscribes causing
> OOM
> -------------------------------------------------------------------------------
>
> Key: AMQ-7091
> URL: https://issues.apache.org/jira/browse/AMQ-7091
> Project: ActiveMQ
> Issue Type: Bug
> Components: KahaDB
> Affects Versions: 5.15.7
> Reporter: Alan Protasio
> Priority: Major
> Attachments: After.png, Before.png,
> InactiveDurableSubscriberTest.java, memoryAllocation.jpg
>
>
> Hi :D
> One of our brokers was bouncing indefinitely due to OOM even though the load
> (TPS) was pretty low.
> From a memory dump I could see that almost 90% of the memory was being used
> by the
> [messageReferences|https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L2368]
> TreeMap, which keeps track of which messages have already been acked by all
> subscribers so that they can be deleted.
> This is a problem because, if the broker has an inactive durable subscriber,
> the memory footprint increases proportionally (O(n)) to the number of
> messages sent to the topic in question, causing the broker to die of OOM
> sooner or later (the high memory footprint continues even after a restart).
> You can find attached (memoryAllocation.jpg) a screen shot showing my broker
> using 90% of the memory to keep track of those messages, making it barely
> usable.
> Looking at the code, I was able to change messageReferences to use a
> BTreeIndex:
> - final TreeMap<Long, Long> messageReferences = new TreeMap<>();
> + BTreeIndex<Long, Long> messageReferences;
> With this change, the memory allocation of the broker stabilized and the
> broker no longer ran out of memory.
> Attached you can find the code that I used to reproduce this scenario, as
> well as the memory utilization (heap and GC graphs) before and after this
> change.
> Before the change the broker died within 5 minutes and I could send 480000
> messages. After the change the broker was still pretty healthy after 5
> minutes and I could send 2265000 messages to the topic (almost 5x more, due
> to the high GC pauses before the change).
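
The growth pattern described in the quoted report can be illustrated with a small standalone sketch. It is not the KahaDB code: the class ReferenceGrowthSketch and its methods are hypothetical, and it assumes only what the report states, namely that an in-memory TreeMap<Long, Long> holds one entry per message until every durable subscriber has acked it. With one subscriber that never acks, the map gains one entry per published message.

```java
import java.util.TreeMap;

// Hypothetical stand-in for the per-destination tracking described in the report.
final class ReferenceGrowthSketch {
    // Assumed layout: message sequence id -> number of subscribers that have not acked yet.
    private final TreeMap<Long, Long> messageReferences = new TreeMap<>();
    private final long subscriberCount;
    private long nextSequence = 0;

    ReferenceGrowthSketch(long subscriberCount) {
        this.subscriberCount = subscriberCount;
    }

    // Publishing a message adds one entry that waits for every subscriber.
    long publish() {
        long sequence = nextSequence++;
        messageReferences.put(sequence, subscriberCount);
        return sequence;
    }

    // One subscriber acks; the entry is dropped only when nobody is left waiting.
    void ack(long sequence) {
        Long remaining = messageReferences.get(sequence);
        if (remaining == null) {
            return; // already fully acked
        }
        if (remaining <= 1) {
            messageReferences.remove(sequence);
        } else {
            messageReferences.put(sequence, remaining - 1);
        }
    }

    int trackedEntries() {
        return messageReferences.size();
    }

    public static void main(String[] args) {
        // Two durable subscribers: one active, one inactive (it never acks).
        ReferenceGrowthSketch topic = new ReferenceGrowthSketch(2);
        for (int i = 0; i < 100_000; i++) {
            long sequence = topic.publish();
            topic.ack(sequence); // only the active subscriber acks
        }
        // Every message is still referenced by the inactive subscriber.
        System.out.println("tracked entries: " + topic.trackedEntries());
    }
}
```

In this sketch the on-heap entry count equals the number of unacked messages, which is the O(n) behaviour the report attributes to the TreeMap. A page-backed index such as the proposed BTreeIndex would keep most of that state on disk instead, at the cost of extra I/O.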
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)