[
https://issues.apache.org/jira/browse/ARTEMIS-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871142#comment-17871142
]
Timothy A. Bish commented on ARTEMIS-4970:
------------------------------------------
Some way to reproduce this would be beneficial; however, you can disable core
tunnelling in your XML configuration using something like the following to try
to work around the issue:
{code:xml}
<amqp-connection uri="tcp://localhost:5672" name="test" auto-start="true">
<mirror>
<property key="tunnel-core-messages" value="false"/>
</mirror>
</amqp-connection>
{code}
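For context, the exception in the report fires in Netty's buffer bounds check: the writer index being set (3288) exceeds the buffer's capacity (3083), i.e. the encoded message is larger than the buffer sized for it. A minimal sketch of that check (a simplification modelled on Netty's AbstractByteBuf validation, not the actual implementation), fed the values from the stack trace:

```java
// Minimal sketch of the bounds check that throws in the reported stack trace
// (modelled on Netty's AbstractByteBuf.checkIndexBounds; simplified here).
public class BoundsCheckSketch {

    static void checkIndexBounds(int readerIndex, int writerIndex, int capacity) {
        if (readerIndex < 0 || readerIndex > writerIndex || writerIndex > capacity) {
            throw new IndexOutOfBoundsException(String.format(
                "readerIndex: %d, writerIndex: %d (expected: 0 <= readerIndex <= writerIndex <= capacity(%d))",
                readerIndex, writerIndex, capacity));
        }
    }

    public static void main(String[] args) {
        // The values from the reported failure: writerIndex 3288 against a
        // buffer whose capacity is only 3083.
        try {
            checkIndexBounds(0, 3288, 3083);
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This reproduces the exact message seen in the report, which is consistent with the tunnelled Core message's encoded size having grown past the capacity the writer allocated for it.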
> IndexOutOfBoundsException in AMQP tunnelling of Core Messages and permanent
> stop of message replication via mirroring
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: ARTEMIS-4970
> URL: https://issues.apache.org/jira/browse/ARTEMIS-4970
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: AMQP, Broker
> Affects Versions: 2.35.0
> Reporter: Jean-Pascal Briquet
> Priority: Major
>
> The IndexOutOfBoundsException occurs randomly while messages are being
> replicated via async mirroring.
> Several thousand messages can be replicated successfully before it happens.
> I have no reproduction scenario yet, as the failure is random, but it occurs
> several times per day.
> If needed, specific logging levels can be enabled if that helps with the
> investigation.
>
> *Artemis setup:*
> The Artemis topology is composed of two Artemis clusters (of 3 groups each)
> with ZK quorum (primary/backup).
> Dual async mirroring is enabled on queues on both clusters.
>
> *IndexOutOfBoundsException error details*
> Most messages going through the replication link are of standard size and
> originate from the OpenWire or Core protocol. Large messages, averaging
> 150KB, can be replicated too but are less frequent.
> Please note that each message is altered by an interceptor, which adds the
> property "_BT_MAX_DELIVERY" when the message reaches the broker.
> The message embedded in the stack trace below appears to have been
> redistributed within the cluster before being replicated, as the user is
> ACTIVEMQ.CLUSTER.ADMIN.USER. I have seen it fail in non-redistributed
> scenarios too.
> {code:java}
> 2024-08-02 22:01:46,056 WARN [org.apache.activemq.artemis.core.server]
> AMQ222151: removing consumer which did not handle a message,
> consumer=ServerConsumerImpl [id=0, filter=null, binding=LocalQueueBinding
> [address=$ACTIVEMQ_ARTEMIS_MIRROR_dc2-group-1,
> queue=QueueImpl[name=$ACTIVEMQ_ARTEMIS_MIRROR_dc2-group-1,
> postOffice=PostOfficeImpl
> [server=ActiveMQServerImpl::name=artemis-dc1-primary-1],
> temp=false]@1f5af8e5, filter=null, name=$ACTIVEMQ_ARTEMIS_MIRROR_dc2-group-1,
> clusterName=$ACTIVEMQ_ARTEMIS_MIRROR_dc2-group-154a6ae45-26e5-11ee-837c-506b8d97040b],
> closed=false],
> message=Reference[279848883113]:RELIABLE:CoreMessage[messageID=279848883113,
> durable=true, userID=565b82e5-513c-11ef-aaa1-fe3d2d71403d, priority=4,
> timestamp=Fri Aug 02 22:01:46 EDT 2024, expiration=0, durable=true,
> address=queue.ua.release.shared.events.payment-services.internal, size=3071,
> properties=TypedProperties[traceparent=00-2630f625064718026d4283d98d757d38-052e1857dcc761da-00,
> __AMQ_CID=d519d3e7-50ca-11ef-aaa1-fe3d2d71403d,
> elastic_apm_traceparent=00-2630f625064718026d4283d98d757d38-052e1857dcc761da-00,
> _AMQ_ROUTING_TYPE=1, _BT_MAX_DELIVERY=30,
> JMSCorrelationID=O202408030401340001Q_SEU_SELF-9cd96504-f1df-42d2-8d89-c26e79cb44cd,
> _AMQ_VALIDATED_USER=ACTIVEMQ.CLUSTER.ADMIN.USER]]@1105557896
> java.lang.IndexOutOfBoundsException: readerIndex: 0, writerIndex: 3288
> (expected: 0 <= readerIndex <= writerIndex <= capacity(3083))
> at
> io.netty.buffer.AbstractByteBuf.checkIndexBounds(AbstractByteBuf.java:112)
> ~[netty-buffer-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.buffer.AbstractByteBuf.writerIndex(AbstractByteBuf.java:135)
> ~[netty-buffer-4.1.111.Final.jar:4.1.111.Final]
> at
> org.apache.activemq.artemis.protocol.amqp.proton.AMQPTunneledCoreMessageWriter.writeBytes(AMQPTunneledCoreMessageWriter.java:107)
> [artemis-amqp-protocol-2.35.0.jar:2.35.0]
> at
> org.apache.activemq.artemis.protocol.amqp.proton.MessageWriter.accept(MessageWriter.java:41)
> [artemis-amqp-protocol-2.35.0.jar:2.35.0]
> at
> org.apache.activemq.artemis.protocol.amqp.proton.MessageWriter.accept(MessageWriter.java:28)
> [artemis-amqp-protocol-2.35.0.jar:2.35.0]
> at
> org.apache.activemq.artemis.core.server.impl.MessageReferenceImpl.run(MessageReferenceImpl.java:136)
> [artemis-server-2.35.0.jar:2.35.0]
> at
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:405)
> [netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> [artemis-commons-2.35.0.jar:2.35.0] {code}
> *Mirroring consumer stopped*
> The error triggers a second problem: message replication stops immediately,
> because the mirroring consumer is destroyed and is never recreated
> automatically (null pointer problem?).
> A workaround is to close the AMQP broker connections, which triggers the
> broker connection to restart and automatically recreates the mirroring
> consumer.
> Stack trace of the consumer failing to reconnect after the
> IndexOutOfBoundsException:
> {code:java}
> 2024-08-02 22:01:46,057 WARN
> [io.netty.util.concurrent.AbstractEventExecutor] A task raised an exception.
> Task:
> org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback$$Lambda$1097/0x00007ff21c82e000@261d5316
> java.lang.NullPointerException: Cannot invoke
> "org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerSenderContext.close(org.apache.qpid.proton.amqp.transport.ErrorCondition)"
> because the return value of
> "org.apache.activemq.artemis.core.server.ServerConsumer.getProtocolContext()"
> is null
> at
> org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback.lambda$disconnect$5(AMQPSessionCallback.java:747)
> ~[artemis-amqp-protocol-2.35.0.jar:2.35.0]
> at
> io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
> ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:405)
> [netty-transport-classes-epoll-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> [netty-common-4.1.111.Final.jar:4.1.111.Final]
> at
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> [artemis-commons-2.35.0.jar:2.35.0]{code}
>
>
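The reporter's workaround of closing the AMQP broker connections can be scripted over JMX, since broker connections expose a management control with stop/start operations. A hypothetical sketch follows; the ObjectName layout, broker name, connection name, and JMX endpoint are all assumptions to verify against your own broker's JMX tree, not confirmed values from this issue:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical sketch of the workaround: stop and restart the broker
// connection over JMX so the mirror consumer is recreated on reconnect.
public class RestartMirrorConnection {

    // Assumed ObjectName layout for the broker-connection MBean; check the
    // actual names in your broker's JMX tree before relying on this.
    static ObjectName brokerConnectionName(String brokerName, String connectionName) throws Exception {
        return new ObjectName(String.format(
            "org.apache.activemq.artemis:broker=\"%s\",component=broker-connections,name=\"%s\"",
            brokerName, connectionName));
    }

    public static void main(String[] args) throws Exception {
        // The JMX endpoint is deployment-specific; this URL is an example only.
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName conn = brokerConnectionName("artemis-dc1-primary-1", "test");
            mbs.invoke(conn, "stop", null, null);   // closes the AMQP broker connection
            mbs.invoke(conn, "start", null, null);  // reconnect recreates the mirror consumer
        }
    }
}
```

Wiring this into a monitor that watches for the AMQ222151 warning would automate the manual workaround until the underlying bug is fixed.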
--
This message was sent by Atlassian Jira
(v8.20.10#820010)