[ 
https://issues.apache.org/jira/browse/IGNITE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403254#comment-17403254
 ] 

Ignite TC Bot commented on IGNITE-15295:
----------------------------------------

{panel:title=Branch: [pull/9325/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/9325/head] Base: [master] : New Tests (1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#00008b}PDS 2{color} [[tests 1|https://ci.ignite.apache.org/viewLog.html?buildId=6131505]]
* {color:#013220}IgnitePdsTestSuite2: CheckpointMarkerReadingErrorOnStartTest.test - PASSED{color}

{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=6131534&buildTypeId=IgniteTests24Java8_RunAll]

> Server node that has an empty checkpoint file-XXX-START.bin does not start
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-15295
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15295
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Denis Chudov
>            Assignee: Denis Chudov
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When starting a server node that has an empty checkpoint file-XXX-START.bin 
> this node does not start.
> {code:java}
> 2021-06-08 16:00:33.383[ERROR][Thread-19][o.a.i.i.IgniteKernal%DPL_GRID%DplGridNodeName] Exception during start processors, node will be stopped and close connections
> java.nio.BufferUnderflowException: null
>         at java.nio.Buffer.nextGetIndex(Buffer.java:532)
>         at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:417)
>         at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage.readPointer(CheckpointMarkersStorage.java:301)
>         at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage.readCheckpointStatus(CheckpointMarkersStorage.java:218)
>         at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointManager.readCheckpointStatus(CheckpointManager.java:265)
>         at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointStatus(GridCacheDatabaseSharedManager.java:1642)
>         at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:584)
>         at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:2999)
>         at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1205)
>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2105)
>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1768)
>         at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1147)
>         at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:667)
>         at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:593)
>         at org.apache.ignite.Ignition.start(Ignition.java:319)
>         at com.sbt.ignite.factory.IgniteFactory.getOrStartIgnite(IgniteFactory.java:139)
>         at com.sbt.ignite.factory.IgniteFactory.getOrStartIgnite(IgniteFactory.java:91)
>         at com.sbt.ignite.manager.IgniteLifecycleManagerImpl.startIgnite(IgniteLifecycleManagerImpl.java:82)
>         at com.sbt.ignite.manager.IgniteLifecycleManagerImpl.init(IgniteLifecycleManagerImpl.java:73)
>         at com.sbt.dpl.gridgain.container.DPLManagerLifecycleManager.initIgniteServiceHolder(DPLManagerLifecycleManager.java:170)
>         at com.sbt.dpl.gridgain.container.DPLManagerLifecycleManager.dplContextInit(DPLManagerLifecycleManager.java:145)
>         at com.sbt.dpl.gridgain.container.ContainerDPLFactory.<init>(ContainerDPLFactory.java:80)
>         at com.sbt.dpl.gridgain.springsupport.SpringDPLFactory.init(SpringDPLFactory.java:74)
> {code}
> The checkpoint marker is always fully written to a temp file first, and 
> then this file is renamed (see
> {noformat}
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage#writeCheckpointEntry(java.nio.ByteBuffer, org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry, org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntryType, boolean){noformat}
> )
> So the root cause of this error is unclear, unless the file was somehow 
> changed. We need extended information in case such an error happens again; 
> in this case there is nothing to analyze (the LFS was cleared by the 
> customer right after the error occurred).
> At the same time, we cannot guarantee correct operation when checkpoint 
> markers are inconsistent. We cannot simply ignore broken markers, nor can 
> we simply recover from the previous checkpoint.
> But it seems reasonable to catch all reading-related exceptions in 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage#readPointer.
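
The idea of catching reading-related exceptions and reporting extended information can be sketched roughly as follows. This is an illustration only, not the actual Ignite code: the class name CheckpointMarkerReader, the Pointer type, and the 16-byte layout (one long followed by two ints) are assumptions made for the example. The only detail taken from the stack trace above is that the first value read is a long.

```java
import java.io.IOException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for this sketch; not the real CheckpointMarkersStorage. */
final class CheckpointMarkerReader {
    /** Assumed marker layout: long index (8 bytes), int offset (4), int length (4). */
    static final int POINTER_SIZE = 16;

    /** Hypothetical value type holding the decoded pointer. */
    static final class Pointer {
        final long idx;
        final int off;
        final int len;

        Pointer(long idx, int off, int len) {
            this.idx = idx;
            this.off = off;
            this.len = len;
        }
    }

    /**
     * Reads a pointer from the marker file. A truncated or empty file is
     * reported as an IOException that names the file and its actual size,
     * instead of surfacing a bare BufferUnderflowException.
     */
    static Pointer readPointer(Path marker) throws IOException {
        byte[] bytes = Files.readAllBytes(marker);
        ByteBuffer buf = ByteBuffer.wrap(bytes);

        try {
            long idx = buf.getLong();
            int off = buf.getInt();
            int len = buf.getInt();

            return new Pointer(idx, off, len);
        }
        catch (BufferUnderflowException e) {
            throw new IOException("Failed to read checkpoint pointer from marker file [file="
                + marker.toAbsolutePath() + ", size=" + bytes.length
                + ", expectedSize=" + POINTER_SIZE + ']', e);
        }
    }
}
```

With a wrapper like this, an empty marker file would still prevent the node from starting, but the log would show which file was broken and how many bytes it actually contained.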



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
