Jean-Pascal Briquet created ARTEMIS-5446:
--------------------------------------------
Summary: Memory leak on Artemis backup node with HA replication
policy and Zookeeper quorum
Key: ARTEMIS-5446
URL: https://issues.apache.org/jira/browse/ARTEMIS-5446
Project: ActiveMQ Artemis
Issue Type: Bug
Components: Broker
Affects Versions: 2.36.0
Reporter: Jean-Pascal Briquet
Attachments: image-2025-04-24-11-16-45-740.png,
image-2025-04-24-11-20-25-598.png, image-2025-04-24-11-24-45-343.png
*Description:*
Backup nodes may encounter OOM errors when the primary becomes unavailable and
the backup transitions to live.
This issue impacts HA, as the backup node may become unresponsive and block
until the node is restarted by an operator.
*Analysis:*
Upon analyzing a heap dump of a backup node, it appears that PostOfficeImpl
instances accumulate on the heap each time the primary node
is restarted.
These PostOfficeImpl objects (and related objects like QueueImpl, DivertImpl,
ClusterConnectionImpl, ...) are not removed by the GC.
In large configurations (1500 queues or more), this can fill up the heap memory
quickly.
I tried to trace back the source of the problem, but it goes a bit far into
Artemis internals.
Once this state is reached, OOM errors occur randomly, with varying stack traces:
{code:java}
Caused by: java.lang.OutOfMemoryError: Java heap space{code}
*Reproduction Scenario:*
* Start a primary/backup pair.
* Primary node is live and the backup node is synchronized
* Capture a JVM heap dump (at this stage only a single PostOfficeImpl
instance exists on the heap)
Repeat the following steps multiple times:
* Stop the primary node
* Wait for the backup to become live
* Start the primary node
* The backup gives the lead back to the primary
* Wait for the primary to become live
After several cycles, capture a final JVM heap dump. You will observe multiple
PostOfficeImpl instances lingering in the heap.
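To track the instance count between cycles without taking a full heap dump each time, a class histogram from the JDK's jmap tool can be used instead. The following is only a minimal sketch: the HistogramCount class and its countInstances method are hypothetical helpers written for this illustration, not part of Artemis, and the default class name assumes that PostOfficeImpl lives in the org.apache.activemq.artemis.core.postoffice.impl package.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Artemis): reads "jmap -histo" output
// and reports how many instances of one class are on the heap.
public class HistogramCount {

    // Assumed fully qualified name of the class reported in this issue.
    static final String DEFAULT_CLASS =
            "org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl";

    // Returns the instance count listed for className, or 0 if it is absent.
    static long countInstances(List<String> histogramLines, String className) {
        for (String line : histogramLines) {
            String[] cols = line.trim().split("\\s+");
            // Data rows look like: "<rank>: <instances> <bytes> <class name> [(module)]"
            if (cols.length >= 4 && cols[0].endsWith(":") && cols[3].equals(className)) {
                return Long.parseLong(cols[1]);
            }
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        String className = args.length > 0 ? args[0] : DEFAULT_CLASS;
        // Read the histogram text from standard input.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            List<String> lines = in.lines().collect(Collectors.toList());
            System.out.println(className + ": " + countInstances(lines, className));
        }
    }
}
```

On Java 11 or later this could be run against the backup node's process as `jmap -histo:live <pid> | java HistogramCount.java`. The `live` option forces a full GC before counting, so only instances that are still reachable are reported. If the behavior described above holds, the count should start at 1 and grow after each stop/start cycle of the primary.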
*Example:*
!image-2025-04-24-11-16-45-740.png|width=503,height=175!
Another example with a large number of queues in the configuration (see retained
heap size).
!image-2025-04-24-11-20-25-598.png|width=749,height=214!
Each instance of PostOfficeImpl has a retained size of 970MB.
!image-2025-04-24-11-24-45-343.png|width=557,height=256!