[jira] [Commented] (ARTEMIS-5140) Poisonous message in $.artemis.internal queue causes high resource usage on target redistribution node in cluster

Jean-Pascal Briquet (Jira) Thu, 31 Oct 2024 09:18:05 -0700


    [ 
https://issues.apache.org/jira/browse/ARTEMIS-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17894638#comment-17894638
 ]


Jean-Pascal Briquet commented on ARTEMIS-5140:
----------------------------------------------

Yes, you are right, it looks like the same issue, and there is apparently 
already a PR, which is good news.

I'll close this one as a duplicate.

 

 

> Poisonous message in $.artemis.internal queue causes high resource usage on 
> target redistribution node in cluster
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-5140
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-5140
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker, Clustering
>    Affects Versions: 2.36.0
>            Reporter: Jean-Pascal Briquet
>            Priority: Major
>         Attachments: message-redistribution-failing-in-loop.log, 
> messages-accumulated-in-notif-queues.png, notif-queue-created-in-loop.log, 
> notif-queues-growing.png
>
>
> *Configuration:*
> A cluster of three nodes A,B,C with message redistribution enabled.
> *Description:* 
> When cluster connectivity starts, each Artemis node creates a 
> $.artemis.internal queue for each of the other nodes in the cluster.
> Messages pending redistribution are moved into these queues by Artemis.
> On node C, if a poisonous (non-forwardable) message is added or moved to a 
> "$.artemis.internal" queue, it leads to:
>  * the cluster connection bridge attempts to process the message
>  * the bridge fails at the beforeForward step with an exception, because the 
> message lacks the properties required for redistribution (no queue IDs)
>  * the cluster connection and consumers are immediately closed
>  * one second later, the cluster connection and consumers are re-created, 
> which triggers the creation of a "notif.*" queue on node B
> This sequence repeats in a loop and causes continuously high CPU and disk 
> usage on node B, as the "activemq.notification" address keeps accumulating 
> messages in "notif.*" queues.
> As a protection mechanism, poisonous messages could be moved back to their 
> original queue (if it is identifiable from the message properties) or, if 
> that is not possible, to a dead-letter queue.
> *Note:*
> The problem was first seen when an operator moved a message stuck in a 
> "duplicated" internal queue into the standard internal queue to start its 
> redistribution.
> Screenshots and related logs are provided as attachments.
>  
> *Reproduction:*
> To reproduce, simply put or move a message into a $.artemis.internal queue.
> This triggers the reconnection loop almost instantly on the node where the 
> message was injected.
> Resource usage on the node targeted by the $.artemis.internal queue rapidly 
> increases as more and more "notif.*" queues are created.
>  
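
The protection mechanism suggested in the quoted description (return the 
message to its original queue when that queue is recorded in the message 
properties, otherwise dead-letter it) could be sketched roughly as below. This 
is only an illustration of the decision logic, not Artemis code: the 
PoisonGuard class, the property keys "routeToIds" and "originalQueue", and the 
"DLQ" name are all hypothetical and stand in for whatever the broker actually 
uses.

```java
import java.util.Map;

/**
 * Sketch of a check a bridge could run before forwarding a message.
 * Not Artemis code: every name below is an assumption made for illustration.
 */
final class PoisonGuard {

    // Hypothetical property keys, chosen for this sketch only.
    static final String ROUTE_TO_IDS = "routeToIds";
    static final String ORIGINAL_QUEUE = "originalQueue";

    // Hypothetical dead-letter destination name.
    static final String DEAD_LETTER_QUEUE = "DLQ";

    enum Action { FORWARD, RETURN_TO_ORIGIN, DEAD_LETTER }

    /** The chosen action and, when the message is not forwarded, its new destination. */
    static final class Decision {
        final Action action;
        final String destination; // null when the action is FORWARD

        Decision(Action action, String destination) {
            this.action = action;
            this.destination = destination;
        }
    }

    static Decision decide(Map<String, ?> properties) {
        // A message that carries queue IDs is forwardable: let the bridge proceed.
        Object ids = properties.get(ROUTE_TO_IDS);
        if (ids instanceof byte[] && ((byte[]) ids).length > 0) {
            return new Decision(Action.FORWARD, null);
        }
        // No queue IDs: prefer the original queue when the message records it.
        Object origin = properties.get(ORIGINAL_QUEUE);
        if (origin instanceof String && !((String) origin).isEmpty()) {
            return new Decision(Action.RETURN_TO_ORIGIN, (String) origin);
        }
        // Otherwise dead-letter the message instead of throwing an exception.
        return new Decision(Action.DEAD_LETTER, DEAD_LETTER_QUEUE);
    }

    public static void main(String[] args) {
        Decision d = decide(Map.of(ORIGINAL_QUEUE, "orders"));
        System.out.println(d.action + " -> " + d.destination);
    }
}
```

The point of the sketch is that a message without queue IDs is diverted rather 
than raising an exception, so the close-and-reconnect cycle described above 
would not be triggered by it.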



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information, visit: https://activemq.apache.org/contact
