[ 
https://issues.apache.org/jira/browse/ARTEMIS-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631896#comment-17631896
 ] 

Justin Bertram commented on ARTEMIS-3992:
-----------------------------------------

I'm not sure where/when the issue was fixed as I'm not sure what the root cause 
actually was.

Aside from that, where are we on this issue? Given the lack of activity I 
assume everything is working as expected now. Can you confirm?

> Store corruption and broker instability with rollback of XA transactions
> ------------------------------------------------------------------------
>
>                 Key: ARTEMIS-3992
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3992
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.16.0
>            Reporter: SL
>            Priority: Major
>
> Edit: I had bad information about the timing of the upgrade to 2.24.0. The 
> issue last recurred just before the upgrade, so its final status is pending.
> We are experiencing a major stability issue with Artemis which seems to be 
> triggered by expired XA transactions.
> It starts with a bunch of timeouts like
> {noformat}
> 2022-09-13 00:00:02,970 WARN  [org.apache.activemq.artemis.core.server] 
> AMQ222103: transaction with xid XidImpl (2133539424 (...) timed out
> {noformat}
> Then a lot of recurring exceptions on the persistent store
> {noformat}
> MQ222055: Error on deleting duplicate cache: java.lang.IllegalStateException: 
> Cannot find add info 228196096 on compactor or current records
>         at 
> org.apache.activemq.artemis.core.journal.impl.JournalImpl.checkKnownRecordID(JournalImpl.java:1152)
>  [artemis-journal-2.16.0.jar:2.16.0]
>         at 
> org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendDeleteRecord(JournalImpl.java:989)
>  [artemis-journal-2.16.0.jar:2.16.0]
>         at 
> org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.deleteDuplicateID(AbstractJournalStorageManager.java:482)
>  [artemis-server-2.16.0.jar:2.16.0]
>         at 
> org.apache.activemq.artemis.core.postoffice.impl.DuplicateIDCacheImpl.addToCacheInMemory(DuplicateIDCacheImpl.java:265)
>  [artemis-server-2.16.0.jar:2.16.0]
>         at 
> org.apache.activemq.artemis.core.postoffice.impl.DuplicateIDCacheImpl.access$000(DuplicateIDCacheImpl.java:41)
>  [artemis-server-2.16.0.jar:2.16.0]
>         at 
> org.apache.activemq.artemis.core.postoffice.impl.DuplicateIDCacheImpl$AddDuplicateIDOperation.process(DuplicateIDCacheImpl.java:347)
>  [artemis-server-2.16.0.jar:2.16.0]
>         at 
> org.apache.activemq.artemis.core.postoffice.impl.DuplicateIDCacheImpl$AddDuplicateIDOperation.beforeCommit(DuplicateIDCacheImpl.java:363)
>  [artemis-server-2.16.0.jar:2.16.0]
>         at 
> org.apache.activemq.artemis.core.transaction.impl.TransactionImpl.beforeCommit(TransactionImpl.java:599)
>  [artemis-server-2.16.0
> {noformat}
> From the client side, consumption seems to slow down and at some point stops 
> completely.
> The broker can partially recover with a restart, but it still seems to have 
> issues unless it is given a new, clean, empty persistent store.
> (Note: it might be similar to ARTEMIS-2373.)
> Background:
> - It's a standalone Artemis instance serving as a front end for other 
> brokers (connected by bridges, working fine). It forwards messages submitted 
> by clients to brokers connected to application services and gets back 
> response messages, which are consumed by the clients (basically a kind of 
> reverse proxy).
> - It was recently upgraded to 2.24.0 in the hope that this would fix the 
> issue, but the behavior remains identical.
> - It's a production system. The issue has not yet been reproduced in test 
> environments, but it has recurred several times in this production 
> environment.
> - We do not own the client trying to consume the messages and have little 
> information on the specifics of its internals and XA usage.
> - Clients not using XA have used the services for months, even years, 
> without exhibiting this kind of issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
