[
https://issues.apache.org/jira/browse/IGNITE-20560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirill Sizov reassigned IGNITE-20560:
--------------------------------------
Assignee: Kirill Sizov
> It's possible to execute commands on a finished transaction under certain
> circumstances
> ---------------------------------------------------------------------------------------
>
> Key: IGNITE-20560
> URL: https://issues.apache.org/jira/browse/IGNITE-20560
> Project: Ignite
> Issue Type: Task
> Reporter: Kirill Sizov
> Assignee: Kirill Sizov
> Priority: Major
> Labels: ignite-3
> Time Spent: 10m
> Remaining Estimate: 0h
>
> If a cleanup operation crashes, it does not affect the transaction it was
> called for, since that transaction has already finished.
> However, under certain circumstances such a crash can *break the validation
> that prevents commands from being executed on a finished transaction.*
> The issue is that
> {{PartitionReplicaListener.TxCleanupReadyFutureList.state}} duplicates the
> local txState and is updated in the cleanup command handler.
> *Details*
> {{PartitionReplicaListener.TxCleanupReadyFutureList.state}} is
> * +updated+ in {{PartitionReplicaListener.processTxCleanupAction}} and
> * +read+ in {{PartitionReplicaListener.appendTxCommand}}.
> If the update was never performed because of a crash, the code in
> {{appendTxCommand}}:
> {code:java}
> txCleanupReadyFutures.compute(txId, (id, txOps) -> {
>     if (txOps == null) {
>         txOps = new TxCleanupReadyFutureList();
>     }
>     if (isFinalState(txOps.state)) {
>         fut.completeExceptionally(
>                 new TransactionException(TX_FAILED_READ_WRITE_OPERATION_ERR, "Transaction is already finished."));
>     } else {
>         txOps.futures.computeIfAbsent(cmdType, type -> new ArrayList<>()).add(fut);
>     }
>     return txOps;
> });
> {code}
> will still read {{txOps.state}} as {{PENDING}} and will allow the command to
> execute instead of throwing a {{TransactionException}}.
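The failure mode described above can be reproduced in a small standalone model. The sketch below is not the Ignite code: DuplicatedTxStateSketch, CleanupList, finish and appendCommand are hypothetical stand-ins, and the crash is simulated with a flag. It only illustrates how a duplicated state field that is updated on the cleanup path stays PENDING when that path never runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the duplicated state; these are not the Ignite classes.
class DuplicatedTxStateSketch {
    enum TxState { PENDING, COMMITTED, ABORTED }

    /** Stand-in for TxCleanupReadyFutureList: it keeps its own copy of the state. */
    static final class CleanupList {
        volatile TxState state = TxState.PENDING;
        final List<String> commands = new ArrayList<>();
    }

    /** Stand-in for the local transaction state map (the authoritative copy). */
    final Map<UUID, TxState> txStateMap = new ConcurrentHashMap<>();

    /** Stand-in for txCleanupReadyFutures. */
    final Map<UUID, CleanupList> cleanupLists = new ConcurrentHashMap<>();

    static boolean isFinalState(TxState s) {
        return s == TxState.COMMITTED || s == TxState.ABORTED;
    }

    /** Finishes the transaction; if the cleanup "crashes", the duplicate is never updated. */
    void finish(UUID txId, boolean cleanupCrashes) {
        txStateMap.put(txId, TxState.COMMITTED);
        if (cleanupCrashes) {
            return; // simulated crash before the cleanup handler runs
        }
        cleanupLists.computeIfAbsent(txId, id -> new CleanupList()).state = TxState.COMMITTED;
    }

    /** Same shape as the quoted check: only the duplicated state is consulted. */
    boolean appendCommand(UUID txId, String cmd) {
        boolean[] accepted = {false};
        cleanupLists.compute(txId, (id, ops) -> {
            if (ops == null) {
                ops = new CleanupList();
            }
            if (!isFinalState(ops.state)) {
                ops.commands.add(cmd);
                accepted[0] = true;
            }
            return ops;
        });
        return accepted[0];
    }
}
```

In this model, a command appended after finish(txId, true) is accepted even though txStateMap already reports COMMITTED for the same transaction.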
> *Motivation*
> The identified issue is that the primary replica stores the transaction
> state in two places:
> * in txCleanupReadyFutures, together with the cleanup futures,
> * in TxManager#txStateMap, the local transaction state map where it is cached.
> *Definition of done*
> # We should remove the transaction state from TxCleanupReadyFutureList and
> use the state that is stored in TxManager and available through
> TxManagerImpl#stateMeta(txId).
> # To fix the tests that suppress TxCleanupReplicaRequest
> (ItTxDistributedTestSingleNodeNoCleanupMessage#testTransactionAlreadyRolledback,
> #testTransactionAlreadyCommitted), we have to check the transaction state map
> on the transaction coordinator before enlisting a new operation in the
> transaction. Although these tests are not completely honest, this way we
> avoid executing an operation after a commit timestamp has already been chosen.
> # The tests from point 2 should be rewritten, because simply dropping
> TxCleanupReplicaRequest (the tests customize the handling of
> TxCleanupReplicaRequest so as to skip it) while still completing the commit
> invocation is not a scenario we expect.
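Point 1 of the definition of done can be pictured with a minimal sketch in which the finished-transaction check reads a single state map. This is an illustration under assumptions, not the planned implementation: SingleSourceTxStateSketch, stateOf, markFinished and appendCommand are hypothetical names, stateOf merely stands in for a lookup such as TxManagerImpl#stateMeta(txId), and IllegalStateException stands in for TransactionException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the command list holds no state of its own.
class SingleSourceTxStateSketch {
    enum TxState { PENDING, COMMITTED, ABORTED }

    /** The only place where the transaction state lives (stand-in for the TxManager state map). */
    final Map<UUID, TxState> txStateMap = new ConcurrentHashMap<>();

    /** Per-transaction commands, without a duplicated state field. */
    final Map<UUID, List<String>> pendingCommands = new ConcurrentHashMap<>();

    static boolean isFinalState(TxState s) {
        return s == TxState.COMMITTED || s == TxState.ABORTED;
    }

    /** Hypothetical lookup; an unknown transaction is treated as PENDING in this sketch. */
    TxState stateOf(UUID txId) {
        return txStateMap.getOrDefault(txId, TxState.PENDING);
    }

    /** Records the final state; no second copy has to be kept in sync by the cleanup path. */
    void markFinished(UUID txId, TxState finalState) {
        txStateMap.put(txId, finalState);
    }

    /** Rejects the command if the single state map says the transaction is finished. */
    void appendCommand(UUID txId, String cmd) {
        pendingCommands.compute(txId, (id, cmds) -> {
            if (isFinalState(stateOf(id))) {
                throw new IllegalStateException("Transaction is already finished.");
            }
            if (cmds == null) {
                cmds = new ArrayList<>();
            }
            cmds.add(cmd);
            return cmds;
        });
    }
}
```

Because the check no longer depends on the cleanup handler having run, a crashed cleanup cannot leave it reading a stale PENDING value. The sketch does not address ordering between a concurrent finish and an append, which the real change would still have to consider.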
> *_Please note there are tests muted with this task._*
--
This message was sent by Atlassian Jira
(v8.20.10#820010)