Sergey Korotkov created IGNITE-17457:
----------------------------------------
Summary: Cluster locks after the transaction recovery procedure if
the tx primary node fails
Key: IGNITE-17457
URL: https://issues.apache.org/jira/browse/IGNITE-17457
Project: Ignite
Issue Type: Bug
Reporter: Sergey Korotkov
An Ignite cluster may become locked (all client operations block) after the tx
recovery procedure that runs when the tx primary node fails.
The prepared transaction may remain uncommitted on the backup node after tx
recovery, so the partition exchange cannot complete and the cluster stays
locked.
The immediate cause is a race condition in the method:
{code:java}
org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter::markFinalizing(RECOVERY_FINISH){code}
It may be called concurrently for the same transaction both from the recovery
procedure:
{code:java}
IgniteTxManager::commitIfPrepared{code}
and from the tx recovery request handler:
{code:java}
IgniteTxHandler::processCheckPreparedTxRequest{code}
Details and reproducer {color:#ff0000}TBD{color}
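Until the details are filled in, here is a hypothetical illustration of how a race of this kind could leave a prepared transaction uncommitted. It is not Ignite code. The class TxFinalizationSketch, the method claimOnly and the committed flag are invented for the example, and it assumes that markFinalizing acts as a one-shot claim and that only the caller that wins the claim goes on to commit.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model, NOT the Ignite implementation: a one-shot "finalizing" claim on a tx.
class TxFinalizationSketch {
    enum Status { NONE, RECOVERY_FINISH }

    private final AtomicReference<Status> finalizing = new AtomicReference<>(Status.NONE);
    private volatile boolean committed;

    // Returns true only for the first caller that moves the state NONE -> status.
    boolean markFinalizing(Status status) {
        return finalizing.compareAndSet(Status.NONE, status);
    }

    // Models the recovery path (assumption): commit only if this caller wins the claim.
    boolean commitIfPrepared() {
        if (!markFinalizing(Status.RECOVERY_FINISH))
            return false; // Lost the claim and relies on the winner to finish the tx.

        committed = true;
        return true;
    }

    // Models a request handler (assumption) that takes the claim but does not commit.
    boolean claimOnly() {
        return markFinalizing(Status.RECOVERY_FINISH);
    }

    boolean isCommitted() {
        return committed;
    }
}
```

In this model, if claimOnly() runs first, commitIfPrepared() returns false and the transaction is never committed, which matches the symptom described above. Whether the real interleaving looks like this still has to be confirmed by the reproducer.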