[jira] [Updated] (IGNITE-16423) [%my-first-node%JRaft-Common-Executor-1][SnapshotExecutorImpl] Fail to close writer

Vyacheslav Koptilin (Jira) Tue, 22 Feb 2022 01:13:04 -0800


     [ https://issues.apache.org/jira/browse/IGNITE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Vyacheslav Koptilin updated IGNITE-16423:
-----------------------------------------
    Description: 
{noformat}
2022-01-28 13:17:41:189 +0300 [INFO][%my-first-node%JRaft-Common-Executor-1][LocalSnapshotStorage] Deleting snapshot /home/prom1se/GG/apache/ignite-3/modules/cli/target/ignite-work/data/my-first-node/metastorage_raft_group_127.0.1.1_3344/snapshot/temp.
2022-01-28 13:17:41:192 +0300 [ERROR][%my-first-node%JRaft-Common-Executor-1][SnapshotExecutorImpl] Fail to close writer
java.io.IOException
        at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotStorage.close(LocalSnapshotStorage.java:242)
        at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:93)
        at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:88)
        at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:387)
        at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
        at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
{noformat}

After that, the timeout error repeats every 10 seconds.

The problem reproduces reliably, but the time until it appears varies between 
runs. Two test logs are attached: one from a run with no activity and a 
one-hour wait, the other taken after creating an SQL table.

  was:
{code:java}
2022-01-28 13:17:41:192 +0300 [ERROR][%my-first-node%JRaft-Common-Executor-1][SnapshotExecutorImpl] Fail to close writer
java.io.IOException
    at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotStorage.close(LocalSnapshotStorage.java:242)
    at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:93)
    at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:88)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:387)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2022-01-28 13:17:41:202 +0300 [ERROR][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][StateMachineAdapter] Encountered an error=Status[EIO<1014>: Fail to save snapshot.] on StateMachine org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine, it's highly recommended to implement this method as raft stops working since some error occurs, you should figure out the cause and repair or remove this node.
Error [type=ERROR_TYPE_SNAPSHOT, status=Status[EIO<1014>: Fail to save snapshot.]]
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.reportError(SnapshotExecutorImpl.java:682)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:406)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2022-01-28 13:17:41:203 +0300 [WARNING][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][NodeImpl] Node <metastorage_raft_group/127.0.1.1:3344> got error: Error [type=ERROR_TYPE_SNAPSHOT, status=Status[EIO<1014>: Fail to save snapshot.]].
2022-01-28 13:17:41:203 +0300 [WARNING][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][FSMCallerImpl] FSMCaller already in error status, ignore new error
Error [type=ERROR_TYPE_SNAPSHOT, status=Status[EIO<1014>: Fail to save snapshot.]]
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.reportError(SnapshotExecutorImpl.java:682)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:406)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
    at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2022-01-28 13:17:41:205 +0300 [INFO][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][ReplicatorGroupImpl] Fail to find the next candidate.
2022-01-28 13:17:41:205 +0300 [INFO][%my-first-node%JRaft-FSMCaller-Disruptor-_stripe_10-0][StateMachineAdapter] onLeaderStop: status=Status[EBADNODE<10009>: Raft node(leader or candidate) is in error.].
2022-01-28 13:17:51:262 +0300 [ERROR][Thread-72][MetaStorageServiceImpl] Unexpected exception
class org.apache.ignite.lang.IgniteInternalException: java.util.concurrent.TimeoutException
    at org.apache.ignite.internal.metastorage.client.CursorImpl$InnerIterator.hasNext(CursorImpl.java:121)
    at org.apache.ignite.internal.metastorage.client.MetaStorageServiceImpl$WatchProcessor$Watcher.run(MetaStorageServiceImpl.java:476)
Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
    at org.apache.ignite.internal.metastorage.client.CursorImpl$InnerIterator.hasNext(CursorImpl.java:113)
    ... 1 more
Caused by: java.util.concurrent.TimeoutException
    at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:502)
    at org.apache.ignite.raft.jraft.rpc.impl.RaftGroupServiceImpl$1.lambda$accept$2(RaftGroupServiceImpl.java:555)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
{code}
And after that, the timeout error repeats every 10 seconds.

The problem is reproducible stably, but each time needs a different time.  In 
attachment 2 different test: one without any activity and one houre waite , 
second after sql table creation 


> [%my-first-node%JRaft-Common-Executor-1][SnapshotExecutorImpl] Fail to close 
> writer
> -----------------------------------------------------------------------------------
>
>                 Key: IGNITE-16423
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16423
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Fedor Malchikov 
>            Assignee: Sergey Uttsel
>            Priority: Critical
>              Labels: ignite-3
>         Attachments: my-first-node(sql).log, my-first-node.log
>
>
> {noformat}
> 2022-01-28 13:17:41:189 +0300 [INFO][%my-first-node%JRaft-Common-Executor-1][LocalSnapshotStorage] Deleting snapshot /home/prom1se/GG/apache/ignite-3/modules/cli/target/ignite-work/data/my-first-node/metastorage_raft_group_127.0.1.1_3344/snapshot/temp.
> 2022-01-28 13:17:41:192 +0300 [ERROR][%my-first-node%JRaft-Common-Executor-1][SnapshotExecutorImpl] Fail to close writer
> java.io.IOException
>       at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotStorage.close(LocalSnapshotStorage.java:242)
>       at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:93)
>       at org.apache.ignite.raft.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:88)
>       at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl.onSnapshotSaveDone(SnapshotExecutorImpl.java:387)
>       at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.continueRun(SnapshotExecutorImpl.java:135)
>       at org.apache.ignite.raft.jraft.storage.snapshot.SnapshotExecutorImpl$SaveSnapshotDone.lambda$run$0(SnapshotExecutorImpl.java:131)
>       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at java.base/java.lang.Thread.run(Thread.java:829)
> {noformat}
> After that, the timeout error repeats every 10 seconds.
> The problem reproduces reliably, but the time until it appears varies between 
> runs. Two test logs are attached: one from a run with no activity and a 
> one-hour wait, the other taken after creating an SQL table.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
