[ https://issues.apache.org/jira/browse/ARTEMIS-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879368#comment-17879368 ]
ASF subversion and git services commented on ARTEMIS-5010:
----------------------------------------------------------
Commit 7fb9aa5f97f45e2cb2110bae5bfdd62b51c9980b in activemq-artemis's branch
refs/heads/main from Clebert Suconic
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=7fb9aa5f97 ]
ARTEMIS-5010 Addressing deadlock on AckManager
AckManager.flush would hold a lock on the AckManager instance, which made a
deadlock with MirrorTarget possible:
Thread 1:
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.addRetry(AckManager.java:393)
- waiting to lock <0x00000007990a13e8> (a org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.ack(AckManager.java:418)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.performAck(AMQPMirrorControllerTarget.java:479)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.postAcknowledge(AMQPMirrorControllerTarget.java:461)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.actualDelivery(AMQPMirrorControllerTarget.java:318)
at org.apache.activemq.artemis.protocol.amqp.proton.ProtonAbstractReceiver.onMessageComplete(ProtonAbstractReceiver.java:361)
Thread 2:
at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
- parking to wait for <0x000000079de0af38> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos([email protected]/AbstractQueuedSynchronizer.java:1079)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos([email protected]/AbstractQueuedSynchronizer.java:1369)
at java.util.concurrent.CountDownLatch.await([email protected]/CountDownLatch.java:278)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.flush(AMQPMirrorControllerTarget.java:230)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager$$Lambda$601/0x00000008005c3040.accept(Unknown Source)
at java.lang.Iterable.forEach([email protected]/Iterable.java:75)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.flushMirrorTargets(AckManager.java:184)
- locked <0x00000007990a13e8> (a org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager)
at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.initRetry(AckManager.java:162)
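
The two traces show a lock-ordering problem: one thread holds the AckManager
monitor while waiting on a latch, and the thread that would release that latch
is itself waiting for the same monitor. One common way to avoid this pattern is
to take a snapshot of the targets under the lock and wait on them only after the
lock is released. The sketch below illustrates that idea only. It is not the
code from the commit, and the class names (Manager, Target) and methods are
hypothetical stand-ins rather than Artemis APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class FlushOutsideLockSketch {

    /** Hypothetical stand-in for a mirror target that runs its work on its own executor. */
    static final class Target {
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        private final Manager manager;

        Target(Manager manager) {
            this.manager = manager;
        }

        /** Work queued on the target's executor calls back into the manager. */
        void deliverAck(String ack) {
            executor.execute(() -> manager.addRetry(ack));
        }

        /** Waits until every task queued before this call has run, or the timeout expires. */
        boolean flush(long timeout, TimeUnit unit) throws InterruptedException {
            CountDownLatch latch = new CountDownLatch(1);
            executor.execute(latch::countDown);
            return latch.await(timeout, unit);
        }

        void stop() {
            executor.shutdown();
        }
    }

    /** Hypothetical stand-in for the component that owns the retry list. */
    static final class Manager {
        private final List<String> retries = new ArrayList<>();
        private final List<Target> targets = new ArrayList<>();

        synchronized void addTarget(Target target) {
            targets.add(target);
        }

        synchronized void addRetry(String ack) {
            retries.add(ack);
        }

        synchronized int retryCount() {
            return retries.size();
        }

        /** Copies the target list under the lock, then waits with the lock released. */
        boolean flushTargets(long timeout, TimeUnit unit) throws InterruptedException {
            List<Target> snapshot;
            synchronized (this) {
                snapshot = new ArrayList<>(targets);
            }
            boolean allFlushed = true;
            for (Target target : snapshot) {
                // The monitor is not held here, so addRetry on the target's thread can proceed
                allFlushed &= target.flush(timeout, unit);
            }
            return allFlushed;
        }
    }

    /** Returns the number of retries recorded after a flush, or -1 if the flush timed out. */
    static int demo() {
        Manager manager = new Manager();
        Target target = new Target(manager);
        manager.addTarget(target);
        try {
            target.deliverAck("ack-1");
            target.deliverAck("ack-2");
            target.deliverAck("ack-3");
            boolean flushed = manager.flushTargets(5, TimeUnit.SECONDS);
            return flushed ? manager.retryCount() : -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            target.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println("retries after flush: " + demo());
    }
}
```

If flushTargets were declared synchronized instead, the queued addRetry calls
could not enter the monitor until the wait ended, which is the shape of the
stall in the traces above.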
> AckManager records from mirror are not being replicated
> -------------------------------------------------------
>
> Key: ARTEMIS-5010
> URL: https://issues.apache.org/jira/browse/ARTEMIS-5010
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Affects Versions: 2.37.0
> Reporter: Clebert Suconic
> Priority: Major
> Fix For: 2.38.0
>
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> The overall problem is that JournalHashMap uses a local journal most of
> the time and never the replicated journal.
> In this line:
> https://github.com/apache/activemq-artemis/blob/38693370c962f9a4d7fbc9dbeb7bffba18f9ad41/artemis-protocols/artemis-amqp-protocol/src/main/java/org/apache/activemq/artemis/protocol/amqp/connect/mirror/AckManager.java#L82
> We send the current journal, and if a switch happens inside the
> JournalStorageManager, the switch is never captured and we never
> replicate pending acks.
> When a failover happens, the replica will not have the retries and the system
> will not reattempt them.
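
The difference between capturing a journal reference once and resolving it on
each use can be sketched as follows. This is a simplified illustration of the
problem described above, not the Artemis implementation: Journal, Storage,
CapturingStore and LookupStore are hypothetical names, and the real
JournalStorageManager and JournalHashMap APIs are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class JournalSupplierSketch {

    /** Hypothetical journal that just remembers what was appended to it. */
    static final class Journal {
        final List<String> records = new ArrayList<>();

        void append(String record) {
            records.add(record);
        }
    }

    /** Hypothetical storage manager whose active journal can be switched. */
    static final class Storage {
        private volatile Journal current = new Journal();

        Journal getJournal() {
            return current;
        }

        void switchTo(Journal replacement) {
            current = replacement;
        }
    }

    /** Keeps the journal it was given at construction, so it never sees a later switch. */
    static final class CapturingStore {
        private final Journal journal;

        CapturingStore(Journal journal) {
            this.journal = journal;
        }

        void put(String record) {
            journal.append(record);
        }
    }

    /** Asks for the journal on every write, so it follows a switch. */
    static final class LookupStore {
        private final Supplier<Journal> journalSupplier;

        LookupStore(Supplier<Journal> journalSupplier) {
            this.journalSupplier = journalSupplier;
        }

        void put(String record) {
            journalSupplier.get().append(record);
        }
    }

    /** Returns how many records from each store reached the journal that became active after the switch. */
    static int[] demo() {
        Storage storage = new Storage();
        CapturingStore capturing = new CapturingStore(storage.getJournal());
        LookupStore lookup = new LookupStore(storage::getJournal);

        Journal replicated = new Journal();
        storage.switchTo(replicated);

        capturing.put("captured-ack");
        lookup.put("lookup-ack");

        int fromCapturing = replicated.records.contains("captured-ack") ? 1 : 0;
        int fromLookup = replicated.records.contains("lookup-ack") ? 1 : 0;
        return new int[] {fromCapturing, fromLookup};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("capturing store reached new journal: " + result[0]);
        System.out.println("lookup store reached new journal: " + result[1]);
    }
}
```

In this sketch only the store that resolves the journal per write lands in the
journal that is active after the switch, which mirrors why retries held against
a stale local journal would be missing on the replica.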