[ https://issues.apache.org/jira/browse/IGNITE-21861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denis Chudov updated IGNITE-21861: ---------------------------------- Description: Exception in log: {code:java} [2024-03-27T01:24:46,636][WARN ][%idt_n_1%partition-operations-4][ReplicaManager] Failed to process replica request [request=ReadWriteScanRetrieveBatchReplicaRequestImpl [batchSize=512, columnsToInclude=null, commitPartitionId=TablePartitionIdMessageImpl [partitionId=17, tableId=90], coordinatorId=125b397c-0404-4dcf-a28b-625fe010ecef, enlistmentConsistencyToken=112165039282455690, exactKey=null, flags=0, full=false, groupId=92_part_7, indexToUse=null, lowerBoundPrefix=null, scanId=20361, timestampLong=112165039967305730, transactionId=018e7d82-647b-0030-63a2-6a1900000001, upperBoundPrefix=null]]. java.util.concurrent.CompletionException: org.apache.ignite.tx.TransactionException: IGN-TX-14 TraceId:6612dad8-4a32-4453-8af0-0139e336aad9 Transaction is already finished. at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1099) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) ~[?:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequest(PartitionReplicaListener.java:660) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequestWithTxRwCounter(PartitionReplicaListener.java:3860) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$processRequest$5(PartitionReplicaListener.java:436) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) [?:?] at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610) [?:?] at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649) [?:?] at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.base/java.lang.Thread.run(Thread.java:834) [?:?] Caused by: org.apache.ignite.tx.TransactionException: Transaction is already finished. at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.appendTxCommand(PartitionReplicaListener.java:1937) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequest(PartitionReplicaListener.java:659) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] ... 10 more{code} It happens in PartitionReplicaListener because the local volatile tx state is null or final when trying to compute a value for txCleanupReadyFutures map: {code:java} txCleanupReadyFutures.compute(txId, (id, txOps) -> { // First check whether the transaction has already been finished. // And complete cleanupReadyFut with exception if it is the case. TxStateMeta txStateMeta = txManager.stateMeta(txId); if (txStateMeta == null || isFinalState(txStateMeta.txState())) { cleanupReadyFut.completeExceptionally(new Exception()); return txOps; } // Otherwise collect cleanupReadyFut in the transaction's futures. if (txOps == null) { txOps = new TxCleanupReadyFutureList(); } txOps.futures.computeIfAbsent(cmdType, type -> new HashMap<>()).put(opId, cleanupReadyFut); return txOps; }); if (cleanupReadyFut.isCompletedExceptionally()) { return failedFuture(new TransactionException(TX_ALREADY_FINISHED_ERR, "Transaction is already finished.")); }{code} First problem is that we don't actually know the real state from this exception. The second one is the exception itself, because it shouldn't happen. We shouldn't meet a null state, because it's updated to pending just before, and it can be vacuumized only after it becomes final. was: Exception in log: {code:java} [2024-03-27T01:24:46,636][WARN ][%idt_n_1%partition-operations-4][ReplicaManager] Failed to process replica request [request=ReadWriteScanRetrieveBatchReplicaRequestImpl [batchSize=512, columnsToInclude=null, commitPartitionId=TablePartitionIdMessageImpl [partitionId=17, tableId=90], coordinatorId=125b397c-0404-4dcf-a28b-625fe010ecef, enlistmentConsistencyToken=112165039282455690, exactKey=null, flags=0, full=false, groupId=92_part_7, indexToUse=null, lowerBoundPrefix=null, scanId=20361, timestampLong=112165039967305730, transactionId=018e7d82-647b-0030-63a2-6a1900000001, upperBoundPrefix=null]].java.util.concurrent.CompletionException: org.apache.ignite.tx.TransactionException: IGN-TX-14 TraceId:6612dad8-4a32-4453-8af0-0139e336aad9 Transaction is already finished. at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1099) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) ~[?:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequest(PartitionReplicaListener.java:660) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequestWithTxRwCounter(PartitionReplicaListener.java:3860) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$processRequest$5(PartitionReplicaListener.java:436) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) [?:?] at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610) [?:?] at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649) [?:?] at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.base/java.lang.Thread.run(Thread.java:834) [?:?]Caused by: org.apache.ignite.tx.TransactionException: Transaction is already finished. at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.appendTxCommand(PartitionReplicaListener.java:1937) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequest(PartitionReplicaListener.java:659) ~[ignite-table-3.0.0-SNAPSHOT.jar:?] ... 10 more {code} > Unexpected "Transaction is already finished" exception > ------------------------------------------------------- > > Key: IGNITE-21861 > URL: https://issues.apache.org/jira/browse/IGNITE-21861 > Project: Ignite > Issue Type: Bug > Reporter: Denis Chudov > Priority: Major > Labels: ignite-3 > > Exception in log: > {code:java} > [2024-03-27T01:24:46,636][WARN > ][%idt_n_1%partition-operations-4][ReplicaManager] Failed to process replica > request [request=ReadWriteScanRetrieveBatchReplicaRequestImpl [batchSize=512, > columnsToInclude=null, commitPartitionId=TablePartitionIdMessageImpl > [partitionId=17, tableId=90], > coordinatorId=125b397c-0404-4dcf-a28b-625fe010ecef, > enlistmentConsistencyToken=112165039282455690, exactKey=null, flags=0, > full=false, groupId=92_part_7, indexToUse=null, lowerBoundPrefix=null, > scanId=20361, timestampLong=112165039967305730, > transactionId=018e7d82-647b-0030-63a2-6a1900000001, upperBoundPrefix=null]]. > java.util.concurrent.CompletionException: > org.apache.ignite.tx.TransactionException: IGN-TX-14 > TraceId:6612dad8-4a32-4453-8af0-0139e336aad9 Transaction is already finished. > at > java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) > ~[?:?] > at > java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1099) > ~[?:?] > at > java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) > ~[?:?] > at > org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequest(PartitionReplicaListener.java:660) > ~[ignite-table-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequestWithTxRwCounter(PartitionReplicaListener.java:3860) > ~[ignite-table-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$processRequest$5(PartitionReplicaListener.java:436) > ~[ignite-table-3.0.0-SNAPSHOT.jar:?] > at > java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) > ~[?:?] > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > [?:?] > at > java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610) > [?:?] > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649) > [?:?] > at > java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) > [?:?] > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [?:?] > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [?:?] > at java.base/java.lang.Thread.run(Thread.java:834) [?:?] > Caused by: org.apache.ignite.tx.TransactionException: Transaction is already > finished. > at > org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.appendTxCommand(PartitionReplicaListener.java:1937) > ~[ignite-table-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequest(PartitionReplicaListener.java:659) > ~[ignite-table-3.0.0-SNAPSHOT.jar:?] > ... 10 more{code} > It happens in PartitionReplicaListener because the local volatile tx state is > null or final when trying to compute a value for txCleanupReadyFutures map: > {code:java} > txCleanupReadyFutures.compute(txId, (id, txOps) -> { > // First check whether the transaction has already been finished. > // And complete cleanupReadyFut with exception if it is the case. > TxStateMeta txStateMeta = txManager.stateMeta(txId); > if (txStateMeta == null || isFinalState(txStateMeta.txState())) { > cleanupReadyFut.completeExceptionally(new Exception()); > return txOps; > } > // Otherwise collect cleanupReadyFut in the transaction's futures. > if (txOps == null) { > txOps = new TxCleanupReadyFutureList(); > } > txOps.futures.computeIfAbsent(cmdType, type -> new HashMap<>()).put(opId, > cleanupReadyFut); > return txOps; > }); > if (cleanupReadyFut.isCompletedExceptionally()) { > return failedFuture(new TransactionException(TX_ALREADY_FINISHED_ERR, > "Transaction is already finished.")); > }{code} > First problem is that we don't actually know the real state from this > exception. > The second one is the exception itself, because it shouldn't happen. We > shouldn't meet a null state, because it's updated to pending just before, and > it can be vacuumized only after it becomes final. > -- This message was sent by Atlassian Jira (v8.20.10#820010)