Andrey Khitrin created IGNITE-18001:
---------------------------------------

             Summary: Ignite node may become unavailable after some play with SQL
                 Key: IGNITE-18001
                 URL: https://issues.apache.org/jira/browse/IGNITE-18001
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 3.0.0-beta1
            Reporter: Andrey Khitrin
         Attachments: ignite3db-0.log, ignite3db-1.log


Steps to reproduce:
# Start an AI3 node and init the cluster.
# Connect to the node via the Ignite3 CLI.
# Open the SQL console in the Ignite3 CLI.
# Run various SQL queries: create tables, run SELECT queries, drop tables, and so on. This step may not be required. On the other hand, it may be important that some of the queries fail with errors.
# Wait 30-60 minutes with the SQL console open.
# Try to execute a new query after the pause. It {*}hangs{*}.

Many errors like the following appear in the DB log (some of them may occur even before step 6 above):
{code}
2022-10-27 18:15:57:391 +0400 [WARNING][%defaultNode%Raft-Group-Client-11][RaftGroupServiceImpl] Recoverable error during the request type=ActionRequestImpl occurred (will be retried on the randomly selected node):
java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2022-10-27 18:17:21:376 +0400 [INFO][%defaultNode%checkpoint-thread-1][Checkpoint] Skipping checkpoint (no pages were modified) [checkpointBeforeWriteLockTime=0ms, checkpointWriteLockWait=0ms, checkpointListenersExecuteTime=0ms, checkpointWriteLockHoldTime=0ms, reason='timeout']
{code}

After an attempt to restart the node
(`./bin/ignite3db.sh stop && ./bin/ignite3db.sh start`), another stack trace appears in the log:
{code}
2022-10-27 18:24:06:489 +0400 [INFO][ForkJoinPool.commonPool-worker-9][ClientHandlerModule] Thin client protocol started successfully[port=10800]
2022-10-27 18:24:06:490 +0400 [INFO][ForkJoinPool.commonPool-worker-9][IgniteImpl] Components started, performing recovery
2022-10-27 18:24:06:868 +0400 [INFO][ForkJoinPool.commonPool-worker-9][ConfigurationRegistry] Failed to notify configuration listener
java.util.NoSuchElementException: table.tables.ee9c42e0-9b96-4164-b13b-8bec99d3171a.assignments
    at org.apache.ignite.internal.configuration.util.ConfigurationUtil.findEx(ConfigurationUtil.java:852)
    at org.apache.ignite.internal.configuration.ConfigurationChanger.getLatest(ConfigurationChanger.java:439)
    at org.apache.ignite.internal.configuration.direct.DirectPropertyProxy.value(DirectPropertyProxy.java:65)
    at org.apache.ignite.internal.table.distributed.TableManager.updateAssignmentInternal(TableManager.java:652)
    at org.apache.ignite.internal.table.distributed.TableManager.onUpdateAssignments(TableManager.java:616)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:492)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$2.visitLeafNode(ConfigurationNotifier.java:374)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$2.visitLeafNode(ConfigurationNotifier.java:370)
    at org.apache.ignite.internal.schema.configuration.TableNode.traverseChildren(Unknown Source)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyListeners(ConfigurationNotifier.java:370)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$2.visitNamedListNode(ConfigurationNotifier.java:460)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$2.visitNamedListNode(ConfigurationNotifier.java:370)
    at org.apache.ignite.internal.schema.configuration.TablesNode.traverseChildren(Unknown Source)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyListeners(ConfigurationNotifier.java:370)
    at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyListeners(ConfigurationNotifier.java:88)
    at org.apache.ignite.internal.configuration.ConfigurationRegistry$2.visitInnerNode(ConfigurationRegistry.java:310)
    at org.apache.ignite.internal.configuration.ConfigurationRegistry$2.visitInnerNode(ConfigurationRegistry.java:292)
    at org.apache.ignite.internal.configuration.SuperRoot.traverseChildren(SuperRoot.java:103)
    at org.apache.ignite.internal.configuration.ConfigurationRegistry.notificator(ConfigurationRegistry.java:292)
    at org.apache.ignite.internal.configuration.ConfigurationChanger.notifyCurrentConfigurationListeners(ConfigurationChanger.java:616)
    at org.apache.ignite.internal.configuration.ConfigurationRegistry.notifyCurrentConfigurationListeners(ConfigurationRegistry.java:355)
    at org.apache.ignite.internal.app.IgniteImpl.notifyConfigurationListeners(IgniteImpl.java:714)
    at org.apache.ignite.internal.app.IgniteImpl.lambda$start$6(IgniteImpl.java:538)
    at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)
    at java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:791)
    at java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
2022-10-27 18:24:06:869 +0400 [INFO][ForkJoinPool.commonPool-worker-11][IgniteImpl] Checking revision on recovery [targetRevision=116, appliedRevision=115, acceptableDifference=100]
2022-10-27 18:24:06:871 +0400 [INFO][ForkJoinPool.commonPool-worker-11][Cluster] [default:defaultNode:6c69de62-774d-46db-9278-b7b037011747@127.0.1.1:3344][doShutdown] Shutting down
2022-10-27 18:24:06:878 +0400 [INFO][ForkJoinPool.commonPool-worker-11][Cluster] [default:defaultNode:6c69de62-774d-46db-9278-b7b037011747@127.0.1.1:3344][leaveCluster] Leaving cluster
{code}

The node status is then shown as "unavailable":
{code}
ignite3cli-3.0.0-SNAPSHOT:$ ./bin/ignite3 node status
Node unavailable
Could not connect to node with URL http://localhost:10300
{code}
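
To gauge how often the recoverable Raft errors occur in the attached logs, the entries can be tallied by level and logger. The following is only a minimal sketch: it assumes every entry follows the layout visible in the excerpts of this report (timestamp, zone offset, then `[LEVEL][thread][logger]`), and the names `ENTRY` and `tally` are illustrative, not part of Ignite.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this report:
#   <date> <time:millis> <tz> [LEVEL][thread][logger] message
ENTRY = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3} [+-]\d{4} "
    r"\[(?P<level>[A-Z]+)\]\[(?P<thread>[^\]]*)\]\[(?P<logger>[^\]]*)\]"
)


def tally(lines):
    """Count log entries per (level, logger).

    Lines that do not start with a timestamp (exception names,
    stack-trace frames) are ignored.
    """
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if match:
            counts[(match["level"], match["logger"])] += 1
    return counts


# Sample lines shortened from the first excerpt in this report.
sample = [
    "2022-10-27 18:15:57:391 +0400 [WARNING][%defaultNode%Raft-Group-Client-11]"
    "[RaftGroupServiceImpl] Recoverable error during the request",
    "java.util.concurrent.TimeoutException",
    "    at java.base/java.lang.Thread.run(Thread.java:829)",
    "2022-10-27 18:17:21:376 +0400 [INFO][%defaultNode%checkpoint-thread-1]"
    "[Checkpoint] Skipping checkpoint (no pages were modified)",
]
counts = tally(sample)
for (level, logger), n in sorted(counts.items()):
    print(level, logger, n)
```

For a real log, an open file can be passed instead of the sample list, for example `tally(open("ignite3db-0.log"))`. A large count for `("WARNING", "RaftGroupServiceImpl")` would correspond to the timeouts described above.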