[
https://issues.apache.org/jira/browse/RATIS-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766787#comment-17766787
]
Sammi Chen commented on RATIS-1891:
-----------------------------------
[~Nicholas Niu] and I also discussed how to recover from the situation once the
raft log gap happens. Once it's happened, it cannot be recovered automatically
now. When the problem in RATIS-1887 happened, [~Nicholas Niu] deleted the raft
log files, and copied the rocksdb from leader to other followers, then start
the SCM service. But later, it's found that some new container records are
missing in SCM.
So [~szetszwo], we are thinking about a CLI tool which accepts a (term, index),
and the path of the raft logs. It manually truncate the raft logs under this
path to the specified (term, index). The tool will be run on each raft peer
server, to make the raft log consistent across all there peers. After that,
start the raft service(Here it could be OM or SCM in Ozone). How do you think
about this approach? Or do you have a better idea about this?
> Gap between logs cause service startup failure
> ----------------------------------------------
>
> Key: RATIS-1891
> URL: https://issues.apache.org/jira/browse/RATIS-1891
> Project: Ratis
> Issue Type: Bug
> Reporter: Sammi Chen
> Priority: Critical
>
> This is the second raft gap problem reported by Guo Hao.
> {code:java}
> 2023-09-08 18:53:47,590 [Listener at test17/9860] ERROR
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SCM start
> failed with exception
> java.util.concurrent.CompletionException: java.lang.IllegalStateException:
> gap between start index 375 and first entry to append 377
> at
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
> at
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
> at
> java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1284)
> at
> java.util.concurrent.CompletableFuture$BiRelay.tryFire(CompletableFuture.java:1270)
> at
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
> at
> org.apache.ratis.util.ConcurrentUtils.accept(ConcurrentUtils.java:191)
> at
> org.apache.ratis.util.ConcurrentUtils.lambda$null$4(ConcurrentUtils.java:180)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: gap between start index 375 and
> first entry to append 377
> at
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
> at
> org.apache.ratis.server.raftlog.segmented.LogSegment.append(LogSegment.java:313)
> at
> org.apache.ratis.server.raftlog.segmented.LogSegment.lambda$loadSegment$2(LogSegment.java:165)
> at
> org.apache.ratis.server.raftlog.segmented.LogSegment.readSegmentFile(LogSegment.java:138)
> at
> org.apache.ratis.server.raftlog.segmented.LogSegment.loadSegment(LogSegment.java:164)
> at
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.loadSegment(SegmentedRaftLogCache.java:381)
> at
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:241)
> at
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:214)
> at
> org.apache.ratis.server.raftlog.RaftLogBase.open(RaftLogBase.java:251)
> at
> org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:239)
> at
> org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:220)
> at
> org.apache.ratis.server.impl.ServerState.lambda$new$5(ServerState.java:161)
> at
> org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62)
> at
> org.apache.ratis.server.impl.ServerState.initialize(ServerState.java:177)
> at
> org.apache.ratis.server.impl.RaftServerImpl.start(RaftServerImpl.java:338)
> at
> org.apache.ratis.util.ConcurrentUtils.accept(ConcurrentUtils.java:188)
> ... 4 more
> {code}
> # The gap server directory
> {code:java}
> $ ll
> /home/work/ozone/scm.ha.ratis-storage-test4/1d823d9f-3e87-4790-85fc-f1a93f7845e5/current/
> total 4120
> -rw-rw-r-- 1 work work 14567 Sep 8 18:30 log_291-374
> -rw-rw-r-- 1 work work 4194304 Sep 8 18:30 log_inprogress_375
> -rw-rw-r-- 1 work work 50 Sep 8 18:30 raft-meta
> -rw-rw-r-- 1 work work 242 Sep 8 17:29 raft-meta.conf
> {code}
>
> The other two
> {code:java}
> $ ll
> total 4168
> -rw-rw-r-- 1 work work 95 Sep 8 12:13 log_0-0
> -rw-rw-r-- 1 work work 39285 Sep 8 17:30 log_1-290
> -rw-rw-r-- 1 work work 14567 Sep 8 17:35 log_291-374
> -rw-rw-r-- 1 work work 271 Sep 8 17:50 log_375-376
> -rw-rw-r-- 1 work work 4194304 Sep 8 19:01 log_inprogress_377
> -rw-rw-r-- 1 work work 86 Sep 8 18:29 raft-meta
> -rw-rw-r-- 1 work work 242 Sep 8 18:29 raft-meta.conf
> {code}
> {code:java}
> $ ll
> total 4168
> -rw-rw-r-- 1 work work 95 Sep 8 13:15 log_0-0
> -rw-rw-r-- 1 work work 39285 Sep 8 17:29 log_1-290
> -rw-rw-r-- 1 work work 14567 Sep 8 17:35 log_291-374
> -rw-rw-r-- 1 work work 271 Sep 8 18:29 log_375-376
> -rw-rw-r-- 1 work work 4194304 Sep 8 19:01 log_inprogress_377
> -rw-rw-r-- 1 work work 86 Sep 8 18:29 raft-meta
> -rw-rw-r-- 1 work work 242 Sep 8 18:29 raft-meta.conf
> {code}
>
> Related Configurations:
> {code:java}
> <property>
> <name>hdds.ratis.raft.server.log.unsafe-flush.enabled</name>
> <value>false</value>
> </property>
> <property>
> <name>hdds.ratis.raft.server.log.async-flush.enabled</name>
> <value>false</value>
> </property>
> {code}
>
> The scene in which the GAP occurs this time is as follows:
> 1. shutdown scm, shutdown more than 60s timeout kill -9
> 2. restart scm, this error occurs
--
This message was sent by Atlassian Jira
(v8.20.10#820010)