Jean-Pascal Briquet created ARTEMIS-5140:
--------------------------------------------

             Summary: Poisonous message in $.artemis.internal message causes 
high resource usage on target redistribution node in cluster
                 Key: ARTEMIS-5140
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-5140
             Project: ActiveMQ Artemis
          Issue Type: Bug
          Components: Broker, Clustering
            Reporter: Jean-Pascal Briquet
         Attachments: message-redistribution-failing-in-loop.log, 
messages-accumulated-in-notif-queues.png, notif-queue-created-in-loop.log, 
notif-queues-growing.png

*Configuration:*
A cluster of three nodes A,B,C with message redistribution enabled.

*Description:* 
When cluster connectivity starts, each Artemis node creates a 
$.artemis.internal queue for every other node in the cluster.
Messages pending redistribution are moved into these queues by Artemis.

On node C, if a poisonous (non-forwardable) message is added or moved to a 
"$.artemis.internal" queue, it leads to:
 * the cluster connection bridge attempts to process the message
 * the bridge fails at the beforeForward step, as the message lacks the 
properties essential for message redistribution (no queue IDs), resulting in an 
exception
 * the cluster connection and consumers are immediately closed
 * one second later, the cluster connection and consumers are re-created, which 
triggers the creation of a "notif.*" queue on node B

This sequence repeats in a loop and causes continuously high CPU and disk usage 
on node B, as the "activemq.notification" address keeps accumulating messages in 
"notif.*" queues.

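The loop described above can be modeled in a few lines. The following Java 
sketch is an illustration only, not Artemis code: the class, the Msg type, and 
the assumption that every (re)connection creates exactly one "notif.*" queue are 
simplifications taken from this description. It shows why the loop does not end 
on its own: the non-forwardable message stays at the head of the internal queue, 
so every reconnection hits it again.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RedistributionLoopModel {

    /** Hypothetical stand-in for a message; only tracks whether it carries queue IDs. */
    static final class Msg {
        final boolean hasQueueIds;
        Msg(boolean hasQueueIds) { this.hasQueueIds = hasQueueIds; }
    }

    /**
     * Runs at most maxCycles connect/forward cycles and returns how many
     * notif.* queues the model created on the target node.
     */
    static int run(Deque<Msg> internalQueue, int maxCycles) {
        int notifQueues = 0;
        for (int cycle = 0; cycle < maxCycles; cycle++) {
            notifQueues++;                 // model assumption: one notif.* queue per (re)connection
            boolean failed = false;
            while (!internalQueue.isEmpty()) {
                if (!internalQueue.peek().hasQueueIds) {
                    failed = true;         // forwarding fails, message is left in place
                    break;
                }
                internalQueue.poll();      // message forwarded successfully
            }
            if (!failed) {
                break;                     // queue drained, connection stays up
            }
        }
        return notifQueues;
    }

    public static void main(String[] args) {
        Deque<Msg> healthy = new ArrayDeque<>();
        healthy.add(new Msg(true));
        healthy.add(new Msg(true));
        System.out.println(run(healthy, 3600));   // prints 1

        Deque<Msg> poisoned = new ArrayDeque<>();
        poisoned.add(new Msg(true));
        poisoned.add(new Msg(false));             // the poisonous message
        poisoned.add(new Msg(true));
        System.out.println(run(poisoned, 3600));  // prints 3600
    }
}
```

With one reconnection per second, the 3600 cycles in the second call correspond 
to one hour of looping in this model.
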
A protection mechanism could move poisonous messages back to their original 
queue (if it is identifiable from the message properties) or, if that is not 
possible, move the invalid message to a dead-letter queue.
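
The proposed decision could look roughly like the sketch below. It is a 
standalone illustration, not a patch against the broker: the class, the enum, 
and the use of a plain Map for message properties are hypothetical, and the two 
property names are assumptions modeled on Artemis header naming that would need 
to be checked against the broker source.

```java
import java.util.Map;

final class PoisonMessageTriage {

    enum Action { FORWARD, RETURN_TO_ORIGINAL_QUEUE, DEAD_LETTER }

    // Assumed property names, for illustration only.
    static final String ROUTE_TO_IDS_PREFIX = "_AMQ_ROUTE_TO";
    static final String ORIGINAL_QUEUE = "_AMQ_ORIG_QUEUE";

    /** Decides what to do with a message found in an internal queue. */
    static Action triage(Map<String, Object> properties) {
        // A forwardable message carries the target queue IDs.
        for (String key : properties.keySet()) {
            if (key.startsWith(ROUTE_TO_IDS_PREFIX)) {
                return Action.FORWARD;
            }
        }
        // No queue IDs: send it back if the original queue is known.
        Object original = properties.get(ORIGINAL_QUEUE);
        if (original instanceof String && !((String) original).isEmpty()) {
            return Action.RETURN_TO_ORIGINAL_QUEUE;
        }
        // Otherwise park it instead of failing the bridge.
        return Action.DEAD_LETTER;
    }
}
```

Whatever the final shape, the point is that the check runs before forwarding, so 
an unroutable message is set aside rather than raising an exception that closes 
the cluster connection.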

*Note:*
The problem was first seen when an operator moved a message stuck in a 
"duplicated" internal queue into the standard internal queue to start its 
redistribution.

Screenshots and related logs are attached.

 

*Reproduction:*
To reproduce, put or move a message into a $.artemis.internal queue.
This triggers the reconnection loop almost instantly on the node where the 
message was injected.
Resource usage on the node targeted by the $.artemis.internal queue rapidly 
increases as more and more "notif.*" queues are created.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)