[
https://issues.apache.org/jira/browse/IGNITE-23926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904363#comment-17904363
]
Kirill Tkalenko edited comment on IGNITE-23926 at 12/10/24 6:47 AM:
--------------------------------------------------------------------
The error occurs because the test does not have time to finish within the
timeout and starts to fail.
Which results in the execution of sql *DROP TABLE Person*, so transactions that
have not yet finished may have time to see a new version of the catalog in
which the table with its indexes has already been deleted.
Here are the log messages that led to this conclusion:
{noformat}
[05:21:18] : [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,676][INFO
][%ijbst_n_0%JRaft-FSMCaller-Disruptor_stripe_4-0][TableUtils] >>>>>
TableUtils#indexIdsAtRwTxBeginTs: [txId=0193af03-bb5f-0020-de3d-986200000001,
txBTs=HybridTimestamp [physical=2024-12-10 05:21:17:663 +0000, logical=32,
composite=113626846177722400], tableId=16, catalogVersion=7, indexSize=1]
[05:21:18] : [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,691][INFO ][Test worker][ItJdbcBatchSelfTest] >>> Stopping
test: ItJdbcBatchSelfTest#testHeterogeneousBatchException, displayName:
testHeterogeneousBatchException(), cost: 2043ms.
[05:21:18] : [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,691][INFO ][Test worker][JdbcStatement] >>>>>
JdbcStatement#execute: [sql=DROP TABLE Person;, args=[]]
[05:21:18]W: [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,799][WARN
][%ijbst_n_0%partition-operations-15][ReplicaManager] Failed to process replica
request [request=WriteIntentSwitchReplicaRequestImpl [commit=false,
commitTimestamp=null, groupId=TablePartitionIdMessageImpl [partitionId=16,
tableId=16], timestamp=HybridTimestamp [physical=2024-12-10 05:21:17:798 +0000,
logical=1, composite=113626846186569729],
txId=0193af03-bbe4-0002-de3d-986200000001]].
[05:21:18]W: [:ignite-jdbc:integrationTest]
java.util.concurrent.CompletionException: java.lang.AssertionError:
txId=0193af03-bbe4-0002-de3d-986200000001, tableId=16, catalogVersion=8
{noformat}
What can be done?
* Improve error message in
*org.apache.ignite.internal.table.distributed.TableUtils#indexIdsAtRwTxBeginTs*
something like: there are no indexes because the table was deleted.
* Somewhere set the *catalogVersion* at the moment of the transaction start/
and then work with it everywhere, but here we need to check whether there will
be problems if we create new indexes in which we could write data.
* Improve error logging:
**
{noformat}
Failed to process replica request [request=ReadWriteMultiRowReplicaRequestImpl
[binaryTuples=ArrayList [java.nio.HeapByteBuffer[pos=0 lim=21 cap=21]],
commitPartitionId=TablePartitionIdMessageImpl [partitionId=4, tableId=16],
coordinatorId=04805429-748a-44ef-af63-b4a7e5fc1917, deleted=null,
enlistmentConsistencyToken=113626726969180168, full=false,
groupId=TablePartitionIdMessageImpl [partitionId=24, tableId=16],
requestType=RW_INSERT_ALL, schemaVersion=1, skipDelayedAck=true, tableId=16,
timestamp=HybridTimestamp [physical=2024-12-10 05:21:17:789 +0000, logical=3,
composite=113626846185979907],
transactionId=0193af03-bbdc-0020-de3d-986200000001]].
{noformat}
there is no need to output a list of tuples here
** in total, the logs are 5 GB and they contain a lot of similar messages with
the same stack trace, which makes the logs very loaded and impossible to read.
was (Author: [email protected]):
The error occurs because the test does not have time to finish within the
timeout and starts to fail.
Which results in the execution of sql *DROP TABLE Person*, so transactions that
have not yet finished may have time to see a new version of the catalog in
which the table with its indexes has already been deleted.
Here are the log messages that led to this conclusion:
{noformat}
[05:21:18] : [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,676][INFO
][%ijbst_n_0%JRaft-FSMCaller-Disruptor_stripe_4-0][TableUtils] >>>>>
TableUtils#indexIdsAtRwTxBeginTs: [txId=0193af03-bb5f-0020-de3d-986200000001,
txBTs=HybridTimestamp [physical=2024-12-10 05:21:17:663 +0000, logical=32,
composite=113626846177722400], tableId=16, catalogVersion=7, indexSize=1]
[05:21:18] : [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,691][INFO ][Test worker][ItJdbcBatchSelfTest] >>> Stopping
test: ItJdbcBatchSelfTest#testHeterogeneousBatchException, displayName:
testHeterogeneousBatchException(), cost: 2043ms.
[05:21:18] : [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,691][INFO ][Test worker][JdbcStatement] >>>>>
JdbcStatement#execute: [sql=DROP TABLE Person;, args=[]]
[05:21:18]W: [:ignite-jdbc:integrationTest]
[2024-12-10T05:21:17,799][WARN
][%ijbst_n_0%partition-operations-15][ReplicaManager] Failed to process replica
request [request=WriteIntentSwitchReplicaRequestImpl [commit=false,
commitTimestamp=null, groupId=TablePartitionIdMessageImpl [partitionId=16,
tableId=16], timestamp=HybridTimestamp [physical=2024-12-10 05:21:17:798 +0000,
logical=1, composite=113626846186569729],
txId=0193af03-bbe4-0002-de3d-986200000001]].
[05:21:18]W: [:ignite-jdbc:integrationTest]
java.util.concurrent.CompletionException: java.lang.AssertionError:
txId=0193af03-bbe4-0002-de3d-986200000001, tableId=16, catalogVersion=8
{noformat}
What can be done?
* Improve error message in
*org.apache.ignite.internal.table.distributed.TableUtils#indexIdsAtRwTxBeginTs*.
* Somewhere set the *catalogVersion* at the moment of the transaction start/
and then work with it everywhere, but here we need to check whether there will
be problems if we create new indexes in which we could write data.
Improve error logging:
> Investigate the flaky ItJdbcBatchSelfTest
> -----------------------------------------
>
> Key: IGNITE-23926
> URL: https://issues.apache.org/jira/browse/IGNITE-23926
> Project: Ignite
> Issue Type: Task
> Reporter: Kirill Tkalenko
> Assignee: Kirill Tkalenko
> Priority: Major
> Labels: ignite-3
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In
> [JDBC|https://ci.ignite.apache.org/viewLog.html?buildId=8680682&tab=buildResultsDiv&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleJdbc]
> there is error
> {code:java}
> java.util.concurrent.CompletionException: java.lang.AssertionError:
> txId=01939e8e-3d9a-001c-de3d-986200000001, tableId=16, catalogVersion=8
> {code}
> for *ItJdbcBatchSelfTest*, we need to find out what the reason is.
> StackTrace:
> {noformat}
> [00:39:02]W: [:ignite-jdbc:integrationTest]
> java.util.concurrent.CompletionException: java.lang.AssertionError:
> txId=01939e8e-3d9a-001c-de3d-986200000001, tableId=16, catalogVersion=8
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:687)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:662)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:2168)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$processWriteIntentSwitchAction$62(PartitionReplicaListener.java:1807)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1187)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2309)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processWriteIntentSwitchAction(PartitionReplicaListener.java:1802)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequest(PartitionReplicaListener.java:763)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processOperationRequestWithTxRwCounter(PartitionReplicaListener.java:3977)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.processRequest(PartitionReplicaListener.java:536)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$invoke$1(PartitionReplicaListener.java:473)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1187)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2309)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.invoke(PartitionReplicaListener.java:473)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.replicator.ReplicaImpl.processRequest(ReplicaImpl.java:155)
> ~[ignite-replicator-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.replicator.ReplicaManager.handleReplicaRequest(ReplicaManager.java:450)
> ~[ignite-replicator-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.replicator.ReplicaManager.onReplicaMessageReceived(ReplicaManager.java:383)
> ~[ignite-replicator-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.DefaultMessagingService.sendToSelf(DefaultMessagingService.java:388)
> ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.DefaultMessagingService.invoke0(DefaultMessagingService.java:316)
> ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.DefaultMessagingService.invoke(DefaultMessagingService.java:245)
> ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.wrapper.JumpToExecutorByConsistentIdAfterSend.invoke(JumpToExecutorByConsistentIdAfterSend.java:97)
> ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.MessagingService.invoke(MessagingService.java:198)
> ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.replicator.ReplicaService.sendToReplicaRaw(ReplicaService.java:138)
> ~[ignite-replicator-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.replicator.ReplicaService.sendToReplica(ReplicaService.java:122)
> ~[ignite-replicator-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.replicator.ReplicaService.invoke(ReplicaService.java:289)
> ~[ignite-replicator-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxMessageSender.switchWriteIntents(TxMessageSender.java:102)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.WriteIntentSwitchProcessor.switchLocalWriteIntents(WriteIntentSwitchProcessor.java:77)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxCleanupRequestHandler.processTxCleanup(TxCleanupRequestHandler.java:130)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxCleanupRequestHandler.lambda$start$0(TxCleanupRequestHandler.java:108)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.TrackableNetworkMessageHandler.onReceived(TrackableNetworkMessageHandler.java:52)
> ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.DefaultMessagingService.sendToSelf(DefaultMessagingService.java:388)
> ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.DefaultMessagingService.invoke0(DefaultMessagingService.java:316)
> ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.DefaultMessagingService.invoke(DefaultMessagingService.java:245)
> ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.wrapper.JumpToExecutorByConsistentIdAfterSend.invoke(JumpToExecutorByConsistentIdAfterSend.java:97)
> ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.network.MessagingService.invoke(MessagingService.java:198)
> ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxMessageSender.cleanup(TxMessageSender.java:131)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxCleanupRequestSender.sendCleanupMessageWithRetries(TxCleanupRequestSender.java:240)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxCleanupRequestSender.cleanupPartitions(TxCleanupRequestSender.java:226)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxCleanupRequestSender.cleanup(TxCleanupRequestSender.java:165)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.tx.impl.TxManagerImpl.cleanup(TxManagerImpl.java:826)
> ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$finishAndCleanup$58(PartitionReplicaListener.java:1692)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$applyCmdWithRetryOnSafeTimeReorderException$114(PartitionReplicaListener.java:2680)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:614)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:653)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.lang.Thread.run(Thread.java:833) [?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] Caused by:
> java.lang.AssertionError: txId=01939e8e-3d9a-001c-de3d-986200000001,
> tableId=16, catalogVersion=8
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.TableUtils.indexIdsAtRwTxBeginTs(TableUtils.java:58)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.indexIdsAtRwTxBeginTs(PartitionReplicaListener.java:4045)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.applyWriteIntentSwitchCommand(PartitionReplicaListener.java:1882)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$processWriteIntentSwitchAction$61(PartitionReplicaListener.java:1809)
> ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] at
> java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:684)
> ~[?:?]
> [00:39:02]W: [:ignite-jdbc:integrationTest] ... 53 more
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)