Alan Protasio created AMQ-7091:
----------------------------------
Summary: O(n) Memory consumption when broker has inactive durable
subscribes causing OOM
Key: AMQ-7091
URL: https://issues.apache.org/jira/browse/AMQ-7091
Project: ActiveMQ
Issue Type: Bug
Components: KahaDB
Affects Versions: 5.15.7
Reporter: Alan Protasio
Attachments: memoryAllocation.jpg
Hi :D
One of our brokers was bouncing indefinitely due to OOM even though the load
(TPS) was pretty low.
Looking at a memory dump, I could see that almost 90% of the memory was being
used by the
[messageReferences|https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L2368]
TreeMap, which keeps track of which messages have already been acked by all
subscribers so that they can be deleted.
This seems to be a problem: if the broker has an inactive durable subscriber,
the memory footprint increases proportionally (O(n)) with the number of
messages sent to the topic in question, causing the broker to die from OOM
sooner or later (the high memory footprint persists even after a restart).
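To illustrate the growth pattern, here is a minimal sketch. It is a simplified model and not the actual KahaDB code: the class name, the trackedAfter method and the "one active plus one inactive subscriber" setup are all hypothetical, and it assumes the map goes from message sequence id to the number of subscribers that still have to ack the message.

```java
import java.util.TreeMap;

public class MessageReferenceGrowth {

    // Hypothetical model: returns how many entries are still tracked
    // after `messages` messages have been sent to the topic.
    static int trackedAfter(int messages, boolean hasInactiveSubscriber) {
        // Assumed layout: sequence id -> number of subscribers that
        // still have to ack the message.
        TreeMap<Long, Long> messageReferences = new TreeMap<>();

        for (long seq = 0; seq < messages; seq++) {
            // One active subscriber, plus optionally one inactive one.
            long pendingAcks = hasInactiveSubscriber ? 2L : 1L;
            messageReferences.put(seq, pendingAcks);

            // The active subscriber acks right away.
            long remaining = messageReferences.get(seq) - 1;
            if (remaining == 0) {
                // Acked by everyone: the entry can be dropped.
                messageReferences.remove(seq);
            } else {
                // The inactive subscriber never acks, so the entry stays.
                messageReferences.put(seq, remaining);
            }
        }
        return messageReferences.size();
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 10_000, 100_000}) {
            System.out.println(n + " messages: "
                    + trackedAfter(n, false) + " entries without an inactive subscriber, "
                    + trackedAfter(n, true) + " entries with one");
        }
    }
}
```

In this model the map stays empty while every subscriber acks, and holds one entry per message sent as soon as a single subscriber stops acking, which is the linear growth described above.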
You can find attached (memoryAllocation.jpg) a screen shot showing my broker
using 90% of the memory to keep track of those messages, making it barely
usable.
Looking at the code, I was able to change messageReferences to use a
BTreeIndex:
- final TreeMap<Long, Long> messageReferences = new TreeMap<>();
+ BTreeIndex<Long, Long> messageReferences;
With this change, the memory allocation of the broker stabilized and the
broker no longer ran out of memory.
Attached you can see the code that I used to reproduce this scenario, as well
as the memory utilization (heap and GC graphs) before and after this change.
Before the change the broker died within 5 minutes, by which point I had sent
480,000 messages. After the change the broker was still pretty healthy after
5 minutes and I could send 2,265,000 messages to the topic (almost 5x more,
as the unpatched broker was slowed down by high GC pauses).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)