[ https://issues.apache.org/jira/browse/IGNITE-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343039#comment-17343039 ]
Kirill Tkalenko commented on IGNITE-14684:
------------------------------------------
[~ibessonov] Please review the code.
> Stopping node at the end of checkpoint can cause "Critical system error"
> ------------------------------------------------------------------------
>
> Key: IGNITE-14684
> URL: https://issues.apache.org/jira/browse/IGNITE-14684
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Maria Makedonskaya
> Assignee: Kirill Tkalenko
> Priority: Major
> Fix For: 2.11
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The checkpoint listener
> org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor#afterCheckpointEnd,
> which is triggered at the end of the checkpoint process, cannot acquire the
> checkpoint read lock while the node is stopping.
> To reproduce, run the test (see the exception in the log):
> org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest#testDestroyTaskLifecycle
> {noformat}
> [2021-05-05 15:41:10,907][ERROR][db-checkpoint-thread-#87%db.LongDestroyDurableBackgroundTaskTest0%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Failed to perform cache update: node is stopping.]]
> class org.apache.ignite.IgniteException: Failed to perform cache update: node is stopping.
> at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:127)
> at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1583)
> at org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.metaStorageOperation(DurableBackgroundTasksProcessor.java:335)
> at org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.afterCheckpointEnd(DurableBackgroundTasksProcessor.java:152)
> at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointEnd(CheckpointWorkflow.java:606)
> at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:479)
> at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:282)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: class org.apache.ignite.internal.NodeStoppingException: Failed to perform cache update: node is stopping.
> ... 9 more
> {noformat}
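
For illustration only, the failure mode in the stack trace can be modeled with a small standalone sketch. It does not use Ignite classes and is not the fix proposed for this issue: CheckpointLockSketch, NodeStoppingSketchException, and both listener methods are hypothetical names. It assumes a read lock that refuses acquisition once a stop flag is set, and shows how an unguarded listener lets the exception escape to the calling (checkpoint) thread, while a guarded listener treats it as a reason to skip the work.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a checkpoint read lock; not Ignite's CheckpointTimeoutLock. */
class CheckpointLockSketch {
    /** Hypothetical exception playing the role of NodeStoppingException. */
    static class NodeStoppingSketchException extends RuntimeException {
        NodeStoppingSketchException(String msg) {
            super(msg);
        }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Set once the (simulated) node begins to stop. */
    private volatile boolean stopping;

    void stop() {
        stopping = true;
    }

    /** Refuses to hand out the read lock once the node is stopping. */
    void checkpointReadLock() {
        if (stopping)
            throw new NodeStoppingSketchException("Failed to perform cache update: node is stopping.");

        lock.readLock().lock();
    }

    void checkpointReadUnlock() {
        lock.readLock().unlock();
    }

    /** Unguarded listener: a lock failure propagates to the caller. */
    void afterCheckpointEndUnguarded(Runnable metaStorageOp) {
        checkpointReadLock();

        try {
            metaStorageOp.run();
        }
        finally {
            checkpointReadUnlock();
        }
    }

    /** Guarded listener: returns false instead of throwing when the node is stopping. */
    boolean afterCheckpointEndGuarded(Runnable metaStorageOp) {
        try {
            checkpointReadLock();
        }
        catch (NodeStoppingSketchException e) {
            return false; // Work is skipped; a real system would need to resume it after restart.
        }

        try {
            metaStorageOp.run();

            return true;
        }
        finally {
            checkpointReadUnlock();
        }
    }

    public static void main(String[] args) {
        CheckpointLockSketch sketch = new CheckpointLockSketch();

        System.out.println("before stop, guarded ran: " + sketch.afterCheckpointEndGuarded(() -> { }));

        sketch.stop();

        System.out.println("after stop, guarded ran: " + sketch.afterCheckpointEndGuarded(() -> { }));

        try {
            sketch.afterCheckpointEndUnguarded(() -> { });
        }
        catch (NodeStoppingSketchException e) {
            System.out.println("unguarded listener threw: " + e.getMessage());
        }
    }
}
```

Whether skipping the listener's work is acceptable depends on the task being persisted; the sketch only shows where the exception originates and why it reaches the checkpoint thread.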