[ https://issues.apache.org/jira/browse/IGNITE-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vyacheslav Koptilin reassigned IGNITE-21619:
--------------------------------------------

    Assignee: Alexander Lapin

> "Failed to get the primary replica" after massive data insert and node restart
> ------------------------------------------------------------------------------
>
>                 Key: IGNITE-21619
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21619
>             Project: Ignite
>          Issue Type: Bug
>          Components: sql
>    Affects Versions: 3.0.0-beta2
>            Reporter: Andrey Khitrin
>            Assignee: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3, sql
>         Attachments: ignite-config.conf, ignite3db-0.log
>
>
> Steps to reproduce:
> 1. Start a 1-node cluster.
> 2. Create several tables (5, for example) in aipersist zone.
> 3. Fill these tables with some data (1000 rows each, for example).
> 4. Verify that data is accessible via SQL.
> 5. Restart a node.
> 6. Try to fetch the same data again.
> Expected result: we could fetch data.
> Actual result: data is inaccessible.
> Trace on the client side:
> {code}
> java.sql.SQLException: Failed to get the primary replica [tablePartitionId=6_part_1]
>     at org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57)
>     at org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154)
>     at org.apache.ignite.internal.jdbc.JdbcStatement.executeQuery(JdbcStatement.java:111)
>     ...
> {code}
> Trace in node log (attached):
> {code}
> 2024-02-28 12:36:34:807 +0500 [INFO][%ClusterFailoverTest_cluster_0%sql-execution-pool-0][JdbcQueryEventHandlerImpl] Exception while executing query [query=select sum(k1) from failoverTest00]
> org.apache.ignite.sql.SqlException: IGN-CMN-65535 TraceId:8d366905-a4bb-4333-b0b3-c647a1cf943f Failed to get the primary replica [tablePartitionId=6_part_1]
>     at org.apache.ignite.internal.lang.SqlExceptionMapperUtil.mapToPublicSqlException(SqlExceptionMapperUtil.java:61)
>     at org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.wrapIfNecessary(AsyncSqlCursorImpl.java:180)
>     at org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.handleError(AsyncSqlCursorImpl.java:157)
>     at org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.lambda$requestNextAsync$2(AsyncSqlCursorImpl.java:96)
>     at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
>     at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$execute$18(ExecutionServiceImpl.java:864)
>     at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>     at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>     at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
>     at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.lambda$execute$0(QueryTaskExecutorImpl.java:83)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.ignite.lang.IgniteException: IGN-CMN-65535 TraceId:8d366905-a4bb-4333-b0b3-c647a1cf943f Failed to get the primary replica [tablePartitionId=6_part_1]
>     at org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:117)
>     at org.apache.ignite.internal.lang.SqlExceptionMapperUtil.mapToPublicSqlException(SqlExceptionMapperUtil.java:51)
>     ... 15 more
> Caused by: org.apache.ignite.internal.lang.IgniteInternalException: IGN-PLACEMENTDRIVER-1 TraceId:8d366905-a4bb-4333-b0b3-c647a1cf943f Failed to get the primary replica [tablePartitionId=6_part_1]
>     at org.apache.ignite.internal.util.ExceptionUtils.lambda$withCause$1(ExceptionUtils.java:384)
>     at org.apache.ignite.internal.util.ExceptionUtils.withCauseInternal(ExceptionUtils.java:446)
>     at org.apache.ignite.internal.util.ExceptionUtils.withCause(ExceptionUtils.java:384)
>     at org.apache.ignite.internal.sql.engine.SqlQueryProcessor.lambda$primaryReplicas$2(SqlQueryProcessor.java:402)
>     at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
>     at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at java.base/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>     ... 3 more
> Caused by: java.util.concurrent.CompletionException: org.apache.ignite.internal.placementdriver.PrimaryReplicaAwaitTimeoutException: IGN-PLACEMENTDRIVER-1 TraceId:8d366905-a4bb-4333-b0b3-c647a1cf943f The primary replica await timed out [replicationGroupId=6_part_1, referenceTimestamp=HybridTimestamp [physical=2024-02-28 12:36:04:780 +0500, logical=0, composite=112007955400622080], currentLease=Lease [leaseholder=ClusterFailoverTest_cluster_0, leaseholderId=ee143400-ca69-401f-9ff8-6e1cc7e5b394, accepted=false, startTime=HybridTimestamp [physical=2024-02-28 12:36:04:048 +0500, logical=115, composite=112007955352649843], expirationTime=HybridTimestamp [physical=2024-02-28 12:38:04:048 +0500, logical=0, composite=112007963216969728], prolongable=false, replicationGroupId=6_part_1]]
>     at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
>     at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
>     at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
>     at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
>     ... 9 more
> Caused by: org.apache.ignite.internal.placementdriver.PrimaryReplicaAwaitTimeoutException: IGN-PLACEMENTDRIVER-1 TraceId:8d366905-a4bb-4333-b0b3-c647a1cf943f The primary replica await timed out [replicationGroupId=6_part_1, referenceTimestamp=HybridTimestamp [physical=2024-02-28 12:36:04:780 +0500, logical=0, composite=112007955400622080], currentLease=Lease [leaseholder=ClusterFailoverTest_cluster_0, leaseholderId=ee143400-ca69-401f-9ff8-6e1cc7e5b394, accepted=false, startTime=HybridTimestamp [physical=2024-02-28 12:36:04:048 +0500, logical=115, composite=112007955352649843], expirationTime=HybridTimestamp [physical=2024-02-28 12:38:04:048 +0500, logical=0, composite=112007963216969728], prolongable=false, replicationGroupId=6_part_1]]
>     at org.apache.ignite.internal.placementdriver.leases.LeaseTracker.lambda$awaitPrimaryReplica$5(LeaseTracker.java:276)
>     at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
>     ... 10 more
> Caused by: java.util.concurrent.TimeoutException
>     ... 7 more
> {code}
> Issue is *not* reproducible in the following configurations:
> * aipersist with 2 nodes
> * rocksdb with 1 or 2 nodes



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
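
The lease printed in the PrimaryReplicaAwaitTimeoutException message carries three HybridTimestamp composite values. For anyone reading the attached log, here is a minimal sketch that decodes them. It assumes the low 16 bits of a composite hold the logical counter and the remaining bits hold the physical time in epoch milliseconds. That layout is inferred from the numbers in the log (it reproduces the printed physical and logical fields), not taken from the Ignite sources, and the class name LeaseTimestamps is hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

/** Decodes the HybridTimestamp composite values copied from the node log. */
public class LeaseTimestamps {
    // Assumption inferred from the log values: the low 16 bits are the logical counter.
    public static final int LOGICAL_BITS = 16;

    /** Physical part of a composite timestamp, in milliseconds since the Unix epoch. */
    public static long physicalMillis(long composite) {
        return composite >>> LOGICAL_BITS;
    }

    /** Logical counter stored in the low bits of a composite timestamp. */
    public static long logical(long composite) {
        return composite & ((1L << LOGICAL_BITS) - 1);
    }

    public static void main(String[] args) {
        // Values copied verbatim from the exception message in the node log.
        long reference = 112007955400622080L;   // referenceTimestamp
        long start = 112007955352649843L;       // currentLease.startTime
        long expiration = 112007963216969728L;  // currentLease.expirationTime

        System.out.println("lease start (UTC):      " + Instant.ofEpochMilli(physicalMillis(start))
                + ", logical=" + logical(start));
        System.out.println("reference (UTC):        " + Instant.ofEpochMilli(physicalMillis(reference))
                + ", logical=" + logical(reference));
        System.out.println("lease expiration (UTC): " + Instant.ofEpochMilli(physicalMillis(expiration))
                + ", logical=" + logical(expiration));

        // Lease length and the offset of the reference timestamp inside the lease.
        Duration leaseLength = Duration.ofMillis(physicalMillis(expiration) - physicalMillis(start));
        Duration referenceOffset = Duration.ofMillis(physicalMillis(reference) - physicalMillis(start));
        // Composites compare in timestamp order because the physical part occupies the high bits.
        boolean insideLease = start <= reference && reference < expiration;

        System.out.println("lease length:           " + leaseLength);
        System.out.println("reference offset:       " + referenceOffset);
        System.out.println("reference inside lease: " + insideLease);
    }
}
```

Under that assumption the lease spans 120 seconds and the reference timestamp falls 732 ms after the lease start, so the reference lies inside the lease window while the lease itself is reported with accepted=false. The node logged the failure at 12:36:34:807, roughly 30 seconds after the reference timestamp. This is only a reading aid for the log and does not identify the root cause.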