[jira] [Commented] (ARTEMIS-5010) AckManager records from mirror are not being replicated

ASF subversion and git services (Jira) Wed, 04 Sep 2024 13:23:29 -0700


    [ https://issues.apache.org/jira/browse/ARTEMIS-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879368#comment-17879368 ]


ASF subversion and git services commented on ARTEMIS-5010:
----------------------------------------------------------

Commit 7fb9aa5f97f45e2cb2110bae5bfdd62b51c9980b in activemq-artemis's branch 
refs/heads/main from Clebert Suconic
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=7fb9aa5f97 ]

ARTEMIS-5010 Addressing deadlock on AckManager

AckManager.flush would hold a lock on the AckManager, which could deadlock 
with MirrorTarget:

Thread 1:

        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.addRetry(AckManager.java:393)
        - waiting to lock <0x00000007990a13e8> (a org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.ack(AckManager.java:418)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.performAck(AMQPMirrorControllerTarget.java:479)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.postAcknowledge(AMQPMirrorControllerTarget.java:461)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.actualDelivery(AMQPMirrorControllerTarget.java:318)
        at org.apache.activemq.artemis.protocol.amqp.proton.ProtonAbstractReceiver.onMessageComplete(ProtonAbstractReceiver.java:361)

Thread 2:

        at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
        - parking to wait for  <0x000000079de0af38> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos([email protected]/AbstractQueuedSynchronizer.java:1079)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos([email protected]/AbstractQueuedSynchronizer.java:1369)
        at java.util.concurrent.CountDownLatch.await([email protected]/CountDownLatch.java:278)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AMQPMirrorControllerTarget.flush(AMQPMirrorControllerTarget.java:230)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager$$Lambda$601/0x00000008005c3040.accept(Unknown Source)
        at java.lang.Iterable.forEach([email protected]/Iterable.java:75)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.flushMirrorTargets(AckManager.java:184)
        - locked <0x00000007990a13e8> (a org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager)
        at org.apache.activemq.artemis.protocol.amqp.connect.mirror.AckManager.initRetry(AckManager.java:162)
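
The two dumps above show one thread waiting for the AckManager monitor while
another thread holds that monitor and parks on a CountDownLatch. The following
is a minimal sketch of that lock shape and of one common way to avoid it
(copy the targets under the monitor, then wait after releasing it). It is not
the code from the commit, and it assumes the latch can only be released once
the first thread gets past the monitor. Manager, Target, flushHoldingLock and
flushOutsideLock are hypothetical names, and the sketch uses a timeout so the
blocked case returns instead of hanging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class FlushOutsideLockSketch {

    /** Hypothetical target: flush() blocks until another thread calls complete(). */
    static final class Target {
        private final CountDownLatch entered = new CountDownLatch(1);
        private final CountDownLatch flushed = new CountDownLatch(1);

        boolean flush(long timeoutMillis) throws InterruptedException {
            entered.countDown(); // signal that a flusher is now waiting
            return flushed.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }

        void awaitFlusher() throws InterruptedException {
            entered.await();
        }

        void complete() {
            flushed.countDown();
        }
    }

    /** Hypothetical manager guarded by its own monitor. */
    static final class Manager {
        private final List<Target> targets = new ArrayList<>();
        private int retries;

        synchronized void register(Target target) {
            targets.add(target);
        }

        synchronized void addRetry() {
            retries++;
        }

        // Problematic shape: blocks on the targets while holding the monitor.
        synchronized boolean flushHoldingLock(long timeoutMillis) throws InterruptedException {
            return flushAll(targets, timeoutMillis);
        }

        // Alternative shape: copy under the monitor, wait after releasing it.
        boolean flushOutsideLock(long timeoutMillis) throws InterruptedException {
            List<Target> snapshot;
            synchronized (this) {
                snapshot = new ArrayList<>(targets);
            }
            return flushAll(snapshot, timeoutMillis);
        }

        private static boolean flushAll(List<Target> list, long timeoutMillis) throws InterruptedException {
            boolean allFlushed = true;
            for (Target target : list) {
                allFlushed &= target.flush(timeoutMillis);
            }
            return allFlushed;
        }
    }

    /** Runs one flush against a delivery thread; returns whether the flush completed in time. */
    static boolean run(boolean holdLock) {
        Manager manager = new Manager();
        Target target = new Target();
        manager.register(target);

        Thread delivery = new Thread(() -> {
            try {
                target.awaitFlusher(); // wait until the flusher is parked
                manager.addRetry();    // needs the manager's monitor
                target.complete();     // only then can the flush finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        delivery.start();
        try {
            boolean flushed = holdLock
                    ? manager.flushHoldingLock(500)
                    : manager.flushOutsideLock(500);
            delivery.join();
            return flushed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("holding lock: flushed=" + run(true));
        System.out.println("outside lock: flushed=" + run(false));
    }
}
```

In this sketch the flush that holds the monitor times out, because the
delivery thread cannot enter addRetry until the flusher gives up. The flush
that waits outside the monitor completes.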


> AckManager records from mirror are not being replicated
> -------------------------------------------------------
>
>                 Key: ARTEMIS-5010
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-5010
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.37.0
>            Reporter: Clebert Suconic
>            Priority: Major
>             Fix For: 2.38.0
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> The overall problem is that JournalHashMap uses a local journal most of 
> the time and never the replicated journal.
> In this line:
> https://github.com/apache/activemq-artemis/blob/38693370c962f9a4d7fbc9dbeb7bffba18f9ad41/artemis-protocols/artemis-amqp-protocol/src/main/java/org/apache/activemq/artemis/protocol/amqp/connect/mirror/AckManager.java#L82
> we pass the current journal. If a switch happens inside the 
> JournalStorageManager, the switch is never captured, and we never 
> replicate pending acks.
> When a failover happens, the replica will not have the retries and the 
> system will not reattempt them.
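
The description quoted above comes down to a reference captured once versus a
reference looked up on every write. The sketch below illustrates that
difference only. It is not the Artemis code and not necessarily how the issue
was fixed. Journal, Storage, EagerMap and LazyMap are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class JournalLookupSketch {

    /** Hypothetical minimal journal: just records what was appended. */
    interface Journal {
        void append(String record);
    }

    static final class RecordingJournal implements Journal {
        final List<String> records = new ArrayList<>();

        @Override
        public void append(String record) {
            records.add(record);
        }
    }

    /** Hypothetical storage manager whose journal can be switched at runtime. */
    static final class Storage {
        private volatile Journal journal;

        Storage(Journal initial) {
            this.journal = initial;
        }

        Journal getJournal() {
            return journal;
        }

        void switchJournal(Journal replacement) {
            this.journal = replacement;
        }
    }

    /** Captures the journal once, so a later switch is never seen. */
    static final class EagerMap {
        private final Journal journal;

        EagerMap(Journal journal) {
            this.journal = journal;
        }

        void put(String key) {
            journal.append(key);
        }
    }

    /** Asks for the journal on every write, so a switch is picked up. */
    static final class LazyMap {
        private final Supplier<Journal> journal;

        LazyMap(Supplier<Journal> journal) {
            this.journal = journal;
        }

        void put(String key) {
            journal.get().append(key);
        }
    }

    /** Returns how many records reached the replicated journal after a switch. */
    static int replicatedRecords(boolean lazy) {
        RecordingJournal local = new RecordingJournal();
        RecordingJournal replicated = new RecordingJournal();
        Storage storage = new Storage(local);

        EagerMap eagerMap = new EagerMap(storage.getJournal());
        LazyMap lazyMap = new LazyMap(storage::getJournal);

        storage.switchJournal(replicated); // the switch happens after construction

        if (lazy) {
            lazyMap.put("ack-1");
        } else {
            eagerMap.put("ack-1");
        }
        return replicated.records.size();
    }

    public static void main(String[] args) {
        System.out.println("eager capture, replicated records: " + replicatedRecords(false));
        System.out.println("lazy lookup, replicated records: " + replicatedRecords(true));
    }
}
```

With the eager capture the record still goes to the original journal, so the
replicated journal stays empty after the switch. With the lookup on each write
the record reaches the replicated journal.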



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information, visit: https://activemq.apache.org/contact
