[
https://issues.apache.org/jira/browse/ARTEMIS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Justin Bertram resolved ARTEMIS-5027.
-------------------------------------
Resolution: Duplicate
> Bug Report: Memory Leak in Artemis MQ when spokes disconnect
> ------------------------------------------------------------
>
> Key: ARTEMIS-5027
> URL: https://issues.apache.org/jira/browse/ARTEMIS-5027
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.37.0
> Environment: Oracle Linux Server release 9.4, 4CPU cores, 16GB of RAM
> (max JVM 10GB)
> Reporter: Dragan Jankovic
> Priority: Major
> Attachments: G1oldGen.png, Heamspace.png, JConsole.png
>
>
> *Environment Details:*
> * *Artemis Broker:* Version 2.37
> *Issue Description:* The setup is a hub-spokes layout with one central
> Artemis (hub) and many Artemis brokers connecting to it (spokes). The brokers
> are connected using core bridges between queues on the spokes and queues on
> the hub. There are 10 core bridges from spoke-to-hub and 10 core bridges from
> hub-to-spoke, totalling 20 connections per spoke. There are 200 spokes in
> this test.
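> A spoke-to-hub core bridge of the kind described above is configured in each
> spoke's broker.xml roughly as follows (a minimal sketch; the queue, address,
> and connector names are hypothetical placeholders, not taken from the actual
> deployment):
> {code:xml}
> <!-- broker.xml on a spoke: one of the ten spoke-to-hub core bridges.
>      "hub-connector", "orders" and "orders.hub" are hypothetical names. -->
> <bridge name="spoke-to-hub-orders">
>    <queue-name>orders</queue-name>
>    <forwarding-address>orders.hub</forwarding-address>
>    <reconnect-attempts>-1</reconnect-attempts>
>    <static-connectors>
>       <connector-ref>hub-connector</connector-ref>
>    </static-connectors>
> </bridge>
> {code}
> The hub-to-spoke bridges are defined symmetrically on the hub, which is why a
> spoke restart cycle exercises connection setup and teardown on both sides.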
> When an Artemis spoke broker (a broker that connects to the monitored hub
> broker) is either forcibly terminated (killed) or
> gracefully stopped and then started again, we observe a significant increase
> in memory usage within the hub Artemis broker. The memory consumption
> increases by approximately 200MB per restarted spoke broker. This indicates a
> resource/memory leak.
> *Fault scenario:* After the spoke broker is restarted, the memory allocated
> by the hub Artemis broker continues to grow without being released. This
> increase in memory usage persists, potentially leading to memory exhaustion
> over time, which could destabilize the entire system. The heap dump suggests
> that the resource leak occurs around connections initiated in the
> hub-to-spoke direction, but this has not yet been confirmed.
> *Technical Details:*
> * *Observations:*
> * A heap memory dump was taken and analyzed.
> * The issue appears to originate from the
> org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl class
> within the Artemis broker codebase.
> * This class seems to fail to release resources properly when the client
> broker is terminated, likely due to unreleased connections or buffers.
> *Affected version:*
> * The issue is present in *Artemis 2.37*.
> *Steps to Reproduce:*
> # Start Artemis spoke brokers and a hub Artemis broker using the specified
> versions.
> # Wait for them to establish all the core bridge connections.
> # Forcefully terminate (kill) or gracefully stop the Artemis spoke broker.
> # Start the spoke broker again and see it re-establish the connections.
> # Monitor the memory usage of the hub Artemis broker over time.
> # Observe the continuous increase in memory usage.
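> Steps 5 and 6 can be performed with standard JDK tooling against the hub
> broker's JVM; a ClientSessionFactoryImpl instance count that keeps growing
> across restart cycles points at the leak described here. This is a hedged
> sketch, assuming the broker was started via the standard Artemis boot class:
> {code}
> # Hypothetical monitoring sketch, not part of the original report.
> HUB_PID=$(pgrep -f 'org.apache.activemq.artemis.boot.Artemis' | head -1)
>
> # Count live ClientSessionFactoryImpl instances in the hub JVM:
> jmap -histo:live "$HUB_PID" | grep ClientSessionFactoryImpl
>
> # Take a heap dump for offline analysis (e.g. Eclipse MAT)
> # after several spoke restarts:
> jcmd "$HUB_PID" GC.heap_dump /tmp/hub-heap.hprof
> {code}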
> *Additional Information:*
> We have created a memory dump from such a hub broker with around 450 spokes
> after exhausting about 5GB of heap.
> * *Memory Dump Report:*
> * 144,733 instances of
> org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl, loaded
> by java.net.URLClassLoader @ 0x6c81acd70, occupy 4,535,785,712 (85.38%) bytes.
> * Most of these instances are referenced from one instance of
> java.util.HashMap$Node[], loaded by <system class loader>, which occupies
> 141,584 (0.00%) bytes. This instance is referenced by
> org.apache.activemq.artemis.core.server.cluster.ClusterManager @ 0x6c1ed4b60,
> loaded by java.net.URLClassLoader @ 0x6c81acd70.
> * The thread
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread
> @ 0x6c2c1c340 activemq-failure-check-thread has a local variable or
> reference to
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl @
> 0x6c2c1c910, which is on the shortest path to java.util.HashMap$Node[8192] @
> 0x710f30780.
> * The thread
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread
> @ 0x6c2c1c340 activemq-failure-check-thread keeps local variables with a
> total size of 960 (0.00%) bytes.
> * The stack trace of this thread is available and includes details of
> involved local variables.
> *Heap dump usage:*
> The increase in heap memory is marked by rectangles in the attached pictures.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information, visit: https://activemq.apache.org/contact