[
https://issues.apache.org/jira/browse/ARTEMIS-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marc Leisi reopened ARTEMIS-5806:
---------------------------------
Another comment was added and might be worth reviewing before closing the issue.
> Message loss due to XA session rollback after broker restart
> ------------------------------------------------------------
>
> Key: ARTEMIS-5806
> URL: https://issues.apache.org/jira/browse/ARTEMIS-5806
> Project: Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.40.0, 2.44.0
> Reporter: Marc Leisi
> Priority: Critical
> Attachments: MessageLossAfterRestart.png
>
>
> In our setup, an MDB deployed in an Oracle WebLogic container connects to an
> ActiveMQ Artemis broker using XA transactions. To receive messages, the
> WebLogic MDB framework repeatedly polls by opening an XA transaction
> ({{xaStart}}), performing a {{receive(timeout)}}, and then closing
> the transaction ({{xaEnd}}). If a message was received, the transaction
> is prepared and committed; otherwise it is rolled back.
> During a graceful broker shutdown, all active transactions and sessions on the
> broker are closed. That part works as expected. However, after the restart we
> encounter problematic behavior:
> The MDB begins polling again ({{xaStart}} + {{receive(timeout)}}).
> Before the {{receive()}} times out, the WebLogic JTA framework tries, in
> parallel, to finish the open transaction (started before the shutdown). This
> is done in the same session as the MDB polling. Since that transaction no
> longer exists on the broker, {{xaEnd}} fails with _"Cannot find suspended
> transaction to end"_. WebLogic JTA forces an {{xaRollback}}, which also fails
> with _"Cannot find xid in resource manager"_. On the broker side, the session
> is rolled back (see:
> [ServerSessionImpl.java#L1627|https://github.com/apache/artemis/blob/fa1da6e6301fd89f7ec6dcdb98fd4366597082fa/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/impl/ServerSessionImpl.java#L1627]).
> The session rollback will cancel all open transactions in the session,
> including the ongoing MDB polling transaction.
> The real problem occurs afterwards, when a new message is produced and ready
> to be delivered to the MDB poller ({{receive(timeout)}}). Artemis delivers the
> message, and the MDB poller tries to end the transaction ({{xaEnd}}). Because
> the transaction was already removed during the previous session rollback,
> this results in _"Cannot find suspended transaction to end"_. The MDB
> poller then forces a global rollback: it drops the message and attempts to
> roll back on the Artemis broker, which also fails (_"Cannot find xid in
> resource manager"_). As a result, the message is lost on the Artemis broker:
> it is removed from the queue, and there are no open prepared
> transactions for it anymore.
> Here is a short version of the flow (a simple sequence diagram is attached as
> well):
> {code:java}
> xaStart(xid1) (session1)
> receive(timeout)
> --- broker restart ---
> xaStart(xid2) (session2)
> receive(timeout)
> xaEnd(xid1) (session2)
>   -> Cannot find suspended transaction to end
> xaRollback(xid1) (session2)
>   -> Cannot find xid in resource manager
>   -> removes xid1 and all xids in session (including xid2)
> message received with xid2
> xaEnd(xid2)
>   -> Cannot find suspended transaction to end
> xaRollback(xid2)
>   -> Cannot find xid in resource manager
> message dropped due to exception; message no longer on queue and no transaction left on Artemis
> {code}
> To reproduce this behavior, I adapted the XA receive example in a fork:
> [https://github.com/leisma/activemq-artemis-examples/commit/61deb9832eefeda360ff3207b3ad8e56c4ea2aa6]
> (You need to run a broker separately to execute it)
> I’m not sure whether Artemis implicitly assumes that only one XA transaction
> may exist per session. I could not find clear guidance in the JTA
> specification or other documentation regarding how XA transactions should
> behave in this scenario.
> Is this the expected behavior?
> Or would it be possible for Artemis to check whether a session still contains
> active transactions before performing a rollback, which would prevent the
> message loss we are seeing?
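
The guard asked about in the last question can be illustrated with a small toy model. This is only a sketch under stated assumptions: none of it is Artemis code, and the names ToySession, keepActiveTransactions and pollSurvivesStaleRollback are hypothetical. The model tracks nothing but the xids known to one session and replays the post-restart sequence from the flow above, once with the reported behavior (a rollback of an unknown xid clears the whole session) and once with a guard that leaves other active transactions alone.

```java
import java.util.HashSet;
import java.util.Set;

public class XaSessionRollbackSketch {

    /** Hypothetical stand-in for a broker-side session; NOT the Artemis implementation. */
    static final class ToySession {
        private final Set<String> activeXids = new HashSet<>();
        private final boolean keepActiveTransactions; // assumed guard flag, for illustration only

        ToySession(boolean keepActiveTransactions) {
            this.keepActiveTransactions = keepActiveTransactions;
        }

        void xaStart(String xid) {
            activeXids.add(xid);
        }

        void xaEnd(String xid) {
            if (!activeXids.contains(xid)) {
                throw new IllegalStateException("Cannot find suspended transaction to end: " + xid);
            }
        }

        void xaRollback(String xid) {
            if (activeXids.remove(xid)) {
                return; // known xid: only that transaction is rolled back
            }
            // Unknown xid: the reported behavior is a rollback of the whole session,
            // which also discards unrelated transactions such as xid2.
            if (!keepActiveTransactions) {
                activeXids.clear();
            }
            throw new IllegalStateException("Cannot find xid in resource manager: " + xid);
        }
    }

    /** Replays the sequence after the restart; returns true if xaEnd(xid2) still succeeds. */
    static boolean pollSurvivesStaleRollback(boolean keepActiveTransactions) {
        ToySession session2 = new ToySession(keepActiveTransactions);
        session2.xaStart("xid2");        // MDB poller opens a new transaction
        try {
            session2.xaEnd("xid1");      // JTA tries to finish the pre-restart transaction
        } catch (IllegalStateException expected) {
            // "Cannot find suspended transaction to end"
        }
        try {
            session2.xaRollback("xid1"); // forced rollback of the stale xid
        } catch (IllegalStateException expected) {
            // "Cannot find xid in resource manager"
        }
        try {
            session2.xaEnd("xid2");      // poller ends its transaction after a receive
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("reported behavior, xid2 survives: " + pollSurvivesStaleRollback(false));
        System.out.println("with guard, xid2 survives: " + pollSurvivesStaleRollback(true));
    }
}
```

In this model the first call returns false and the second returns true. Whether such a guard is safe in the real broker depends on why the session-wide rollback exists in the first place, which the toy model does not capture.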