[ 
https://issues.apache.org/jira/browse/ARTEMIS-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044459#comment-18044459
 ] 

Justin Bertram edited comment on ARTEMIS-5806 at 12/11/25 3:25 PM:
-------------------------------------------------------------------

The spec you cited says that concurrent calls are allowed only for 
"transaction commit processing" (i.e. calling {{commit()}}). It also specifies 
that "...one transaction is enlisted with the resource at any given time." The 
problem I see in your code is that you are working with multiple transactions 
on the same resource concurrently. For example, you don't call {{end(xid2)}} 
before attempting to call {{end(xid1)}} and {{rollback(xid1)}}.
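
To make the ordering concrete, here is a minimal sketch of the "one transaction enlisted at a time" rule as a client-side guard. It is a toy model under stated assumptions, not Artemis or WebLogic code: {{SingleEnlistmentGuard}} is a hypothetical name, and plain strings stand in for {{javax.transaction.xa.Xid}} values.

```java
import java.util.Optional;

// Toy client-side guard (hypothetical, not Artemis or WebLogic code).
// Transaction ids are plain strings instead of javax.transaction.xa.Xid.
class SingleEnlistmentGuard {
    // Transaction currently associated with this resource, or null if none.
    private String associated;

    // Mirrors XAResource.start(xid): refuse a second concurrent association.
    synchronized void start(String xid) {
        if (associated != null) {
            throw new IllegalStateException(
                "resource already enlisted with " + associated + ", cannot start " + xid);
        }
        associated = xid;
    }

    // Mirrors XAResource.end(xid): only the associated transaction may be ended.
    synchronized void end(String xid) {
        if (!xid.equals(associated)) {
            throw new IllegalStateException(
                "cannot end " + xid + " while associated with " + associated);
        }
        associated = null;
    }

    // Exposes the current association for inspection.
    synchronized Optional<String> current() {
        return Optional.ofNullable(associated);
    }
}
```

With this guard, the sequence from the reproducer ({{start("xid2")}} followed by {{end("xid1")}}) is rejected on the client instead of reaching the broker, whereas {{end("xid2")}} followed by work on {{xid1}} goes through.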

Why doesn't the WLS JTA Recovery component simply use its own {{XAConnection}}, 
{{XASession}}, and {{XAResource}}? I'm pretty certain this is what WildFly 
does, for example. The spec you cited also says:

bq. To initiate the transaction commit process, the transaction manager is 
allowed to use any of the resource objects connected to the same resource 
manager instance. The resource object used for the two-phase commit protocol 
need not have been involved with the transaction being completed.
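
As a rough sketch of what I mean by a dedicated resource: the recovery component would obtain its own {{XAResource}} (for JMS, via {{XAConnectionFactory.createXAConnection()}}, {{XAConnection.createXASession()}}, and {{XASession.getXAResource()}}) and complete transactions only through it. The code below uses only the standard {{javax.transaction.xa}} API. {{RecoveryPass}} and {{shouldRollback}} are hypothetical names, and a real transaction manager would decide per branch from its own log. Note also that {{recover()}} reports only prepared or heuristically completed branches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Sketch only: RecoveryPass and shouldRollback are hypothetical names,
// not WebLogic, WildFly, or Artemis APIs.
class RecoveryPass {
    // Completes in-doubt branches through a resource reserved for recovery,
    // so these calls never share a session with a polling start()/end() pair.
    static List<Xid> rollbackInDoubt(XAResource recoveryResource,
                                     Predicate<Xid> shouldRollback) {
        List<Xid> rolledBack = new ArrayList<>();
        try {
            // Single-call scan: start and end flags combined.
            Xid[] inDoubt = recoveryResource.recover(
                XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
            if (inDoubt == null) {
                return rolledBack; // defensive: treat null as "nothing to recover"
            }
            for (Xid xid : inDoubt) {
                if (shouldRollback.test(xid)) {
                    recoveryResource.rollback(xid);
                    rolledBack.add(xid);
                }
            }
        } catch (XAException e) {
            throw new IllegalStateException(
                "recovery pass failed, errorCode=" + e.errorCode, e);
        }
        return rolledBack;
    }
}
```

The point of the design is that the MDB poller's {{XAResource}} only ever sees its own {{start}}/{{end}} pair, while completion of older transactions happens elsewhere.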


> Message loss due to XA session rollback after broker restart
> ------------------------------------------------------------
>
>                 Key: ARTEMIS-5806
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-5806
>             Project: Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.40.0, 2.44.0
>            Reporter: Marc Leisi
>            Priority: Critical
>         Attachments: MessageLossAfterRestart.png
>
>
> In our setup, an MDB deployed in an Oracle WebLogic container connects to an 
> ActiveMQ Artemis broker using XA transactions. To receive messages, the 
> WebLogic MDB framework repeatedly polls by opening an XA transaction 
> ({{xaStart}}), performing a {{receive(timeout)}}, and then closing 
> the transaction ({{xaEnd}}). If a message was received, the transaction 
> is prepared and committed; otherwise it is rolled back.
> During a graceful broker shutdown, all active transactions and sessions on the 
> broker are closed. That part works as expected. However, after the restart we 
> encounter problematic behavior:
> The MDB begins polling again ({{xaStart}} + {{receive(timeout)}}). 
> Before the {{receive()}} times out, the WebLogic JTA framework tries, in 
> parallel, to finish the open transaction (started before the shutdown). This 
> is done in the same session as the MDB polling. Since that transaction no 
> longer exists on the broker, {{xaEnd}} fails with _"Cannot find suspended 
> transaction to end"_. WebLogic JTA then forces an {{xaRollback}}, which also 
> fails with _"Cannot find xid in resource manager"_. On the broker side, the 
> session is rolled back (see: 
> [ServerSessionImpl.java#L1627|https://github.com/apache/artemis/blob/fa1da6e6301fd89f7ec6dcdb98fd4366597082fa/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/impl/ServerSessionImpl.java#L1627]).
> The session rollback cancels all open transactions in the session, 
> including the ongoing MDB polling transaction.
> The real problem occurs afterwards, when a new message is produced and ready 
> to be delivered to the MDB poller ({{receive(timeout)}}). Artemis delivers the 
> message, and the MDB poller tries to end ({{xaEnd}}) the transaction. Because 
> the transaction was already removed during the previous session rollback, 
> this results in _"Cannot find suspended transaction to end"_. The MDB 
> poller then forces a global rollback: it drops the message and attempts to 
> roll back on the Artemis broker, which also fails (_"Cannot find xid in 
> resource manager"_). As a result, the message is lost on the Artemis broker: 
> it is removed from the queue, and there are no open prepared 
> transactions for it anymore.
> Here is a short version of the flow (A simple sequence diagram is attached as 
> well):
> {code:java}
> xaStart(xid1) (session1)
> receive(timeout)
> -- broker restart --
> xaStart(xid2) (session2)
> receive(timeout)
> xaEnd(xid1) (session2)
> -- Cannot find suspended transaction to end
> xaRollback(xid1) (session2)
> -- Cannot find xid in resource manager; removes xid1 and all xids in the
>    session (including xid2)
> message -- receive with xid2
> xaEnd(xid2)
> -- Cannot find suspended transaction to end
> xaRollback(xid2)
> -- Cannot find xid in resource manager
> message dropped due to exception; message no longer on queue and no
> transaction left on Artemis
> {code}
> To reproduce this behavior, I adapted the XA receive example in a fork:
> https://github.com/leisma/activemq-artemis-examples/commit/61deb9832eefeda360ff3207b3ad8e56c4ea2aa6
> (You need to run a broker separately to execute it.)
> I’m not sure whether Artemis implicitly assumes that only one XA transaction 
> may exist per session. I could not find clear guidance in the JTA 
> specification or other documentation regarding how XA transactions should 
> behave in this scenario.
> Is this the expected behavior?
> Or would it be possible for Artemis to check whether a session still contains 
> active transactions before performing a rollback, which would prevent the 
> message loss we are seeing?


