[
https://issues.apache.org/jira/browse/ARTEMIS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Justin Bertram resolved ARTEMIS-5027.
-------------------------------------
Resolution: Duplicate
> Bug Report: Memory Leak in Artemis MQ when spokes disconnect
> ------------------------------------------------------------
>
> Key: ARTEMIS-5027
> URL: https://issues.apache.org/jira/browse/ARTEMIS-5027
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.37.0
> Environment: Oracle Linux Server release 9.4, 4CPU cores, 16GB of RAM
> (max JVM 10GB)
> Reporter: Dragan Jankovic
> Priority: Major
> Attachments: G1oldGen.png, Heamspace.png, JConsole.png
>
>
> *Environment Details:*
> * *Artemis Broker:* Version 2.37
> *Issue Description:* The setup is a hub-spokes layout with one central
> Artemis (hub) and many Artemis brokers connecting to it (spokes). The brokers
> are connected using core bridges between queues on the spokes and queues on
> the hub. There are 10 core bridges from spoke-to-hub and 10 core bridges from
> hub-to-spoke, totalling 20 connections per spoke. There are 200 spokes in
> this test.
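> A spoke-to-hub core bridge of the kind described above is configured in each
> spoke's broker.xml roughly as follows (a minimal sketch; the queue, address,
> and connector names are hypothetical placeholders, not taken from the actual
> deployment):
> {code:xml}
> <!-- broker.xml on a spoke: one of the ten spoke-to-hub core bridges.
>      "hub-connector", "orders" and "orders.hub" are hypothetical names. -->
> <bridge name="spoke-to-hub-orders">
>    <queue-name>orders</queue-name>
>    <forwarding-address>orders.hub</forwarding-address>
>    <reconnect-attempts>-1</reconnect-attempts>
>    <static-connectors>
>       <connector-ref>hub-connector</connector-ref>
>    </static-connectors>
> </bridge>
> {code}
> The hub-to-spoke bridges are defined symmetrically on the hub, which is why a
> spoke restart cycle exercises connection setup and teardown on both sides.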
> When an Artemis spoke broker (a broker that connects to the monitored hub
> broker) is either forcibly terminated (killed) or
> gracefully stopped and then started again, we observe a significant increase
> in memory usage within the hub Artemis broker. The memory consumption
> increases by approximately 200MB per restarted spoke broker. This indicates a
> resource/memory leak.
> *Fault scenario:* After the spoke broker is restarted, the memory allocated
> by the hub Artemis broker continues to grow without being released. This
> increase in memory usage persists, potentially leading to memory exhaustion
> over time, which could destabilize the entire system. The heap dump suggests
> that the resource leak occurs around connections initiated in the
> hub-to-spoke direction, but this has not yet been confirmed.
> *Technical Details:*
> * *Observations:*
> * A heap memory dump was taken and analyzed.
> * The issue appears to originate from the
> org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl class
> within the Artemis broker codebase.
> * This class seems to fail to release resources properly when the client
> broker is terminated, likely due to unreleased connections or buffers.
> *Affected version:*
> * The issue is present in *Artemis 2.37*.
> *Steps to Reproduce:*
> # Start Artemis spoke brokers and a hub Artemis broker using the specified
> versions.
> # Wait for them to establish all the core bridge connections.
> # Forcefully terminate (kill) or gracefully stop the Artemis spoke broker.
> # Start the spoke broker again and see it re-establish the connections.
> # Monitor the memory usage of the hub Artemis broker over time.
> # Observe the continuous increase in memory usage.
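> Steps 5 and 6 can be performed with standard JDK tooling against the hub
> broker's JVM; a ClientSessionFactoryImpl instance count that keeps growing
> across restart cycles points at the leak described here. This is a hedged
> sketch, assuming the broker was started via the standard Artemis boot class:
> {code}
> # Hypothetical monitoring sketch, not part of the original report.
> HUB_PID=$(pgrep -f 'org.apache.activemq.artemis.boot.Artemis' | head -1)
>
> # Count live ClientSessionFactoryImpl instances in the hub JVM:
> jmap -histo:live "$HUB_PID" | grep ClientSessionFactoryImpl
>
> # Take a heap dump for offline analysis (e.g. Eclipse MAT)
> # after several spoke restarts:
> jcmd "$HUB_PID" GC.heap_dump /tmp/hub-heap.hprof
> {code}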
> *Additional Information:*
> We have created a memory dump from such a hub broker with around 450 spokes
> after exhausting about 5GB of heap.
> * *Memory Dump Report:*
> * 144,733 instances of
> org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl, loaded
> by java.net.URLClassLoader @ 0x6c81acd70, occupy 4,535,785,712 (85.38%) bytes.
> * Most of these instances are referenced from one instance of
> java.util.HashMap$Node[], loaded by <system class loader>, which occupies
> 141,584 (0.00%) bytes. This instance is referenced by
> org.apache.activemq.artemis.core.server.cluster.ClusterManager @ 0x6c1ed4b60,
> loaded by java.net.URLClassLoader @ 0x6c81acd70.
> * The thread
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread
> @ 0x6c2c1c340 activemq-failure-check-thread has a local variable or
> reference to
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl @
> 0x6c2c1c910, which is on the shortest path to java.util.HashMap$Node[8192] @
> 0x710f30780.
> * The thread
> org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl$FailureCheckAndFlushThread
> @ 0x6c2c1c340 activemq-failure-check-thread keeps local variables with a
> total size of 960 (0.00%) bytes.
> * The stack trace of this thread is available and includes details of
> involved local variables.
> *Heap dump usage:*
> The increase in heap memory is marked by rectangles in the attached pictures.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information, visit: https://activemq.apache.org/contact