Shirsh Kumar created ARTEMIS-4947:
-------------------------------------
Summary: Out of Memory on too many connecting/disconnecting clients
Key: ARTEMIS-4947
URL: https://issues.apache.org/jira/browse/ARTEMIS-4947
Project: ActiveMQ Artemis
Issue Type: Bug
Reporter: Shirsh Kumar
Attachments: Histogram_otherObjects.png,
NettyAcceptor_ConnectionsAllowed_2000.png, QueueImpl_Objects.png, broker.xml
I was trying to run the ActiveMQ Artemis broker on Kubernetes using ArtemisCloud's
[activemq-artemis-operator|https://github.com/artemiscloud/activemq-artemis-operator].
*Use case* (short-lived clients): Clients connect to the broker, subscribe to
topics, and disconnect after 1 or 2 minutes.
The cluster uses the [Artemis Kubernetes image 1.0.28|https://github.com/artemiscloud/activemq-artemis-broker-kubernetes-image]
with the underlying [Artemis broker version 2.35.0|https://activemq.apache.org/components/artemis/download/release-notes-2.35.0].
Kubernetes Pods config:
CPU: 1
Memory: 2 Gi
Max ConnectionsAllowed: 2000
To test the stability of the system, I ran a load test on a cluster that starts
with 2 pods and is allowed to scale up to 3 pods.
I assumed that capping ConnectionsAllowed at 2000 would be enough to keep the
system stable.
Persistence had to be disabled because I was getting 5000 ms timeouts while
writing session state to disk (see the attached broker.xml).
*Load test:* 7000 new client connections per minute (client IDs = some prefix
+ epoch).
Each client connects and subscribes, then unsubscribes and disconnects after 1 minute.
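As a rough sanity check on the figures above, the expected number of simultaneously open connections can be estimated with Little's law (arrival rate times average connection lifetime). The sketch below is only illustrative: it assumes every client stays connected for exactly 1 minute and that ConnectionsAllowed applies per pod; the class and method names are hypothetical.

```java
public class ConnectionLoad {
    /** Little's law: average concurrent connections = arrival rate x average lifetime. */
    static double steadyStateConnections(double arrivalsPerMinute, double lifetimeMinutes) {
        return arrivalsPerMinute * lifetimeMinutes;
    }

    /** Total connections the cluster accepts if each pod caps at connectionsAllowedPerPod (assumption). */
    static long clusterCapacity(int pods, int connectionsAllowedPerPod) {
        return (long) pods * connectionsAllowedPerPod;
    }

    public static void main(String[] args) {
        // Figures from the load test: 7000 new connections per minute, 1 minute lifetime.
        double demand = steadyStateConnections(7000, 1.0);
        for (int pods = 2; pods <= 3; pods++) {
            long capacity = clusterCapacity(pods, 2000);
            double excess = Math.max(0, demand - capacity);
            System.out.printf("pods=%d capacity=%d demand=%.0f excess=%.0f%n",
                    pods, capacity, demand, excess);
        }
    }
}
```

Under these assumptions the offered load exceeds the combined connection limit even at 3 pods, so some connection attempts would be rejected at the acceptor throughout the test.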
With this setup, the broker pods run fine for the first 30 minutes to 1 hour
of the test.
After that, the pods run *out of memory* and restart.
I have captured heap dumps at various stages of the test:
1. Initial Heap Dump: 10 clients connect and then disconnect after 1 minute
Dump size: 63 MB
2. Heap Dump at 83 % memory usage after 15-20 minutes
Dump size: 370 MB
3. Heap dump after OOM
Dump size: 2.1 GB
From my observations while analyzing the heap dumps (I do not have deep
knowledge of Artemis), I could see the following:
1. *QueueImpl has a retained heap of 600 MB*, which accounts for 48 percent of
the bytes.
(See the attached QueueImpl_Objects.png)
2. There is a *mismatch* between the number of entries in the
NettyServerConnections object's connectedClients table and the number of
session states stored in MQTT
(See the attached NettyAcceptor_ConnectionsAllowed_2000.png &
Histogram_otherObjects.png)
3. The thread overview at the time of the dump shows one active thread, which
is executing removeSubscriptions, as shown below:
{code:java}
Thread-4 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@3dad535f)
Status --> alive, Runnable
All other threads at this point are --> alive, blocked on monitor enter.
-------------> Thread stack ----------->
org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1 @ 0xb56ff6e0 : Thread-4 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@3dad535f)
at java.util.concurrent.ConcurrentHashMap.forEach(Ljava/util/function/BiConsumer;)V (ConcurrentHashMap.java:1603)
at org.apache.activemq.artemis.core.postoffice.impl.SimpleAddressManager.getDirectBindings(Lorg/apache/activemq/artemis/api/core/SimpleString;)Ljava/util/Collection; (SimpleAddressManager.java:165)
at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.getDirectBindings(Lorg/apache/activemq/artemis/api/core/SimpleString;)Ljava/util/Collection; (PostOfficeImpl.java:1097)
at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.removeAddressInfo(Lorg/apache/activemq/artemis/api/core/SimpleString;Z)Lorg/apache/activemq/artemis/core/server/impl/AddressInfo; (PostOfficeImpl.java:883)
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.removeAddressInfo(Lorg/apache/activemq/artemis/api/core/SimpleString;Lorg/apache/activemq/artemis/core/security/SecurityAuth;Z)V (ActiveMQServerImpl.java:4015)
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.removeAddressInfo(Lorg/apache/activemq/artemis/api/core/SimpleString;Lorg/apache/activemq/artemis/core/security/SecurityAuth;)V (ActiveMQServerImpl.java:3989)
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.destroyQueue(Lorg/apache/activemq/artemis/api/core/SimpleString;Lorg/apache/activemq/artemis/core/security/SecurityAuth;ZZZZ)V (ActiveMQServerImpl.java:2523)
at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.destroyQueue(Lorg/apache/activemq/artemis/api/core/SimpleString;Lorg/apache/activemq/artemis/core/security/SecurityAuth;ZZZ)V (ActiveMQServerImpl.java:2461)
at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.deleteQueue(Lorg/apache/activemq/artemis/api/core/SimpleString;Z)V (ServerSessionImpl.java:1212)
at org.apache.activemq.artemis.core.protocol.mqtt.MQTTSubscriptionManager.removeSubscriptions(Ljava/util/List;Z)[S (MQTTSubscriptionManager.java:271)
at org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler.handleUnsubscribe(Lio/netty/handler/codec/mqtt/MqttUnsubscribeMessage;)V (MQTTProtocolHandler.java:390)
at org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler.act(Lio/netty/handler/codec/mqtt/MqttMessage;)V (MQTTProtocolHandler.java:180)
at org.apache.activemq.artemis.core.protocol.mqtt.MQTTProtocolHandler$$Lambda$1042+0x00007f62588397b0.onMessage(Ljava/lang/Object;)V ()
at org.apache.activemq.artemis.utils.actors.Actor.doTask(Ljava/lang/Object;)V (Actor.java:32)
at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks()V (ProcessorBase.java:68)
at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$548+0x00007f62585c4d80.run()V ()
at java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run()V (ThreadPoolExecutor.java:635)
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run()V (ActiveMQThreadFactory.java:118)
{code}
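The trace shows ConcurrentHashMap.forEach being called from getDirectBindings while a queue is destroyed on unsubscribe. If that call iterates over every binding in the broker (an assumption on my part, not verified against the Artemis source), then each unsubscribe costs time proportional to the number of live subscriptions. The toy model below is not Artemis code; BindingScanModel and all of its members are hypothetical names that only illustrate how such a per-removal scan adds up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class BindingScanModel {
    // Hypothetical stand-in for the broker's binding table: queue name -> address name.
    private final Map<String, String> bindings = new ConcurrentHashMap<>();
    final AtomicLong entriesVisited = new AtomicLong();

    void addBinding(String queue, String address) {
        bindings.put(queue, address);
    }

    // Assumed behaviour: find bindings for one address by scanning every entry.
    List<String> getDirectBindings(String address) {
        List<String> result = new ArrayList<>();
        bindings.forEach((queue, addr) -> {
            entriesVisited.incrementAndGet();
            if (addr.equals(address)) {
                result.add(queue);
            }
        });
        return result;
    }

    // Removing a queue triggers one full scan to check whether its address is still in use.
    void removeQueue(String queue) {
        String address = bindings.remove(queue);
        if (address != null) {
            getDirectBindings(address);
        }
    }

    // Create n subscriptions (one queue per address), remove them all, count visited entries.
    static long visitsToRemoveAll(int n) {
        BindingScanModel model = new BindingScanModel();
        for (int i = 0; i < n; i++) {
            model.addBinding("q" + i, "topic" + i);
        }
        for (int i = 0; i < n; i++) {
            model.removeQueue("q" + i);
        }
        return model.entriesVisited.get();
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 2_000, 4_000}) {
            System.out.println(n + " subscriptions -> " + visitsToRemoveAll(n) + " entries visited");
        }
    }
}
```

In this model, removing n subscriptions visits n(n-1)/2 entries in total, so doubling the number of live subscriptions roughly quadruples the cleanup work. That would be consistent with a single busy thread while the others wait, but the model does not by itself explain the retained QueueImpl memory.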
Please suggest how to protect the broker against such a barrage of incoming connections.