[
https://issues.apache.org/jira/browse/ARTEMIS-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Justin Bertram updated ARTEMIS-5377:
------------------------------------
Description:
I encountered an issue similar to the one described in ARTEMIS-4114.
Configuration:
{code:xml}
<ha-policy>
   <live-only>
      <scale-down>
         <connectors>
            <connector-ref>ART.GW.CLS1-connector</connector-ref>
            <connector-ref>ART.GW.CLS2-connector</connector-ref>
            <connector-ref>ART.GW.CLS3-connector</connector-ref>
         </connectors>
      </scale-down>
   </live-only>
</ha-policy>
...
<bridges>
   <bridge name="ART.EL.CLS">
      <queue-name>ART.EL.CLS</queue-name>
      <forwarding-address>ART.EL.CLS</forwarding-address>
      <reconnect-attempts>-1</reconnect-attempts>
      <static-connectors>
         <connector-ref>ART.EL.CLS1-connector</connector-ref>
         <connector-ref>ART.EL.CLS2-connector</connector-ref>
         <connector-ref>ART.EL.CLS3-connector</connector-ref>
         <connector-ref>ART.EL.CLS4-connector</connector-ref>
         <connector-ref>ART.EL.CLS5connector</connector-ref>
         <connector-ref>ART.EL.CLS6-connector</connector-ref>
      </static-connectors>
   </bridge>
... {code}
When one broker is restarted, another broker connected to it via a bridge
crashes with a deadlock error. Before the crash, the following warnings were
logged on the connected broker:
{noformat}
2025-03-08 02:39:09,089 WARN [org.apache.activemq.artemis.core.server] AMQ222094: Bridge unable to send message Reference[314223917984]:NON-RELIABLE:CoreMessage[...]@2117085983, will try again once bridge reconnects
ActiveMQObjectClosedException[errorType=OBJECT_CLOSED message=AMQ219018: Producer is closed]
    at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.checkClosed(ClientProducerImpl.java:310)
    at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:127)
    at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.deliverStandardMessage(BridgeImpl.java:767)
    at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:628)
    at org.apache.activemq.artemis.core.server.impl.QueueImpl.handle(QueueImpl.java:4055)
    at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:3191)
    at org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:4380)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:57)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:32)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:68)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
2025-03-08 02:39:09,089 WARN [org.apache.activemq.artemis.core.server] AMQ222095: Connection failed with failedOver=false
2025-03-08 02:39:09,089 WARN [org.apache.activemq.artemis.core.server.impl.QueueImpl] null
java.util.NoSuchElementException
    at org.apache.activemq.artemis.utils.collections.PriorityLinkedListImpl$PriorityLinkedListIterator.repeat(PriorityLinkedListImpl.java:225)
    at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:3214)
    at org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:4380)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:57)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:32)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:68)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
{noformat}
I have attached the full log. It shows that Broker 1 shut down completely at
02:41:21 and that Broker 2 then stopped at 02:41:38 with the following error:
{noformat}
2025-03-08 02:41:38,069 INFO [org.apache.activemq.artemis.core.server] AMQ224107: The Critical Analyzer detected slow paths on the broker. It is recommended that you enable trace logs on org.apache.activemq.artemis.utils.critical while you troubleshoot this issue. You should disable the trace logs when you have finished troubleshooting.
2025-03-08 02:41:38,069 ERROR [org.apache.activemq.artemis.core.server] AMQ224079: The process for the virtual machine will be killed, as component QueueImpl[name=ART.EL.CLS, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::name=ART.GW.CLS7], temp=false]@3789b8fe is not responsive
2025-03-08 02:41:40,152 WARN [org.apache.activemq.artemis.core.server] AMQ222199: Thread dump:
*******************************************************************************
Complete Thread dump
"Reference Handler" Id=2 RUNNABLE
    at java.base/java.lang.ref.Reference.waitForReferencePendingList(Native Method)
    at java.base/java.lang.ref.Reference.processPendingReferences(Reference.java:241)
    at java.base/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:213)

"Finalizer" Id=3 WAITING on java.lang.ref.ReferenceQueue$Lock@279208e0
    at java.base/java.lang.Object.wait(Native Method)
    - waiting on java.lang.ref.ReferenceQueue$Lock@279208e0
    at java.base/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
    at java.base/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
    at java.base/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:170)

"Signal Dispatcher" Id=4 RUNNABLE
...{noformat}
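A minimal sketch of {{broker.xml}} settings that could keep the Critical Analyzer from killing the JVM while the trace logs recommended by AMQ224107 are collected. It assumes the halt reported by AMQ224079 comes from the analyzer policy being set to HALT. The timeout and check-period values are illustrative and are not taken from the attached configuration, and the change does not address the unresponsive queue itself.
{code:xml}
<core xmlns="urn:activemq:core">
   <!-- keep the analyzer enabled so slow paths are still reported -->
   <critical-analyzer>true</critical-analyzer>
   <!-- illustrative values in milliseconds (assumption, not from the report) -->
   <critical-analyzer-timeout>120000</critical-analyzer-timeout>
   <critical-analyzer-check-period>60000</critical-analyzer-check-period>
   <!-- LOG only reports the unresponsive component; HALT and SHUTDOWN stop the broker -->
   <critical-analyzer-policy>LOG</critical-analyzer-policy>
   ...
</core>
{code}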
> Deadlock and Crash of Artemis Broker When Restarting a Bridged Peer
> -------------------------------------------------------------------
>
> Key: ARTEMIS-5377
> URL: https://issues.apache.org/jira/browse/ARTEMIS-5377
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker, Clustering
> Affects Versions: 2.31.2
> Reporter: Alexander
> Priority: Major
> Attachments: falled_broker.log, restarder_boker.log
>