[ https://issues.apache.org/jira/browse/IGNITE-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518097#comment-16518097 ]
ASF GitHub Bot commented on IGNITE-8797:
----------------------------------------
GitHub user alex-plekhanov opened a pull request:
https://github.com/apache/ignite/pull/4229
IGNITE-8797 cleanPersistenceDir() before stopAllGrids()
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/alex-plekhanov/ignite ignite-8797
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/ignite/pull/4229.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4229
----
commit d36cf94ba4bdf3f7f5bc27dce18a6a915fda55ac
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-20T12:35:08Z
IGNITE-8797 cleanPersistenceDir() before stopAllGrids()
----
> Error during writeCheckpointEntry is not passed to failure handler during checkpoint finish
> -------------------------------------------------------------------------------------------
>
> Key: IGNITE-8797
> URL: https://issues.apache.org/jira/browse/IGNITE-8797
> Project: Ignite
> Issue Type: Bug
> Reporter: Alexey Goncharuk
> Assignee: Aleksey Plekhanov
> Priority: Major
> Labels: MakeTeamcityGreenAgain
>
> I observed the following failure in Cache 3 suite:
> {code}
> [13:10:55]W: [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138910%paged.PageEvictionMultinodeMixedRegionsTest2%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> [13:10:55]W: [org.apache.ignite:ignite-core] class org.apache.ignite.internal.processors.cache.persistence.file.PersistentStorageIOException: Failed to write checkpoint entry [ptr=FileWALPointer [idx=0, fileOff=219747, len=1947], cpTs=1528971054548, cpId=d8b42759-ca5e-4613-b091-ed0356b3915d, type=END]
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2757)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.access$8100(GridCacheDatabaseSharedManager.java:178)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointEnd(GridCacheDatabaseSharedManager.java:3716)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3277)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3053)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> [13:10:55]W: [org.apache.ignite:ignite-core] at java.lang.Thread.run(Thread.java:748)
> [13:10:55]W: [org.apache.ignite:ignite-core] Caused by: java.nio.file.NoSuchFileException: /data/teamcity/work/c182b70f2dfa6507/work/db/node03-c5dcc243-fc3c-4b2f-8002-81e88d8cff7d/cp/1528971054548-d8b42759-ca5e-4613-b091-ed0356b3915d-END.bin.tmp
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
> [13:10:55]W: [org.apache.ignite:ignite-core] at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> [13:10:55]W: [org.apache.ignite:ignite-core] at java.nio.file.Files.move(Files.java:1395)
> [13:10:55]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2752)
> [13:10:55]W: [org.apache.ignite:ignite-core] ... 6 more
> [13:10:55]W: [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138914%paged.PageEvictionMultinodeMixedRegionsTest3%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> {code}
> I see two issues here:
> 1) Some concurrent process is removing the work folder, which results in the
> exception above.
> 2) The checkpoint exception is not passed to the failure handler. This is due
> to a catch {{// TODO-ignite-db how to handle exception?}} in
> {{Checkpointer}}, which yields an uncompleted checkpoint future.
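
To illustrate the second issue, here is a minimal, self-contained Java sketch. It is not Ignite code: the class CheckpointFailureSketch, its methods, and the Consumer used as a stand-in failure handler are all hypothetical names chosen for this example. It only shows the general pattern described above, where a swallowed exception leaves a future uncompleted, next to one possible way of propagating the error instead. The missing source file plays the role of the removed work folder from the first issue.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Illustration only: none of these names are Ignite classes or methods. */
final class CheckpointFailureSketch {
    /** Renames the temporary marker to its final name; throws if the source file is gone. */
    static void writeEntry(Path tmp, Path dst) throws IOException {
        Files.move(tmp, dst);
    }

    /** Problem pattern: the exception is swallowed, so the future is never completed. */
    static CompletableFuture<Void> finishSwallowing(Path tmp, Path dst) {
        CompletableFuture<Void> fut = new CompletableFuture<>();

        try {
            writeEntry(tmp, dst);

            fut.complete(null);
        }
        catch (IOException ignored) {
            // Nothing is done here, so anyone waiting on 'fut' would wait forever.
        }

        return fut;
    }

    /** Possible remedy (assumption): fail the future and notify a hypothetical failure handler. */
    static CompletableFuture<Void> finishPropagating(Path tmp, Path dst, Consumer<Throwable> failureHnd) {
        CompletableFuture<Void> fut = new CompletableFuture<>();

        try {
            writeEntry(tmp, dst);

            fut.complete(null);
        }
        catch (IOException e) {
            fut.completeExceptionally(e); // Waiters are released with the error.

            failureHnd.accept(e); // The error becomes visible outside the worker.
        }

        return fut;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cp-sketch");
        Path tmp = dir.resolve("END.bin.tmp"); // Deliberately never created.
        Path dst = dir.resolve("END.bin");

        System.out.println("swallowing, future done: " + finishSwallowing(tmp, dst).isDone());

        CompletableFuture<Void> fut = finishPropagating(tmp, dst,
            e -> System.out.println("handler received: " + e.getClass().getSimpleName()));

        System.out.println("propagating, future failed: " + fut.isCompletedExceptionally());
    }
}
```

With the temporary file missing, the future returned by finishSwallowing() stays incomplete, while the one returned by finishPropagating() completes exceptionally and the handler receives the exception. How the actual fix should invoke Ignite's failure handler is not shown here.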