[ 
https://issues.apache.org/jira/browse/RATIS-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766787#comment-17766787
 ] 

Sammi Chen edited comment on RATIS-1891 at 9/19/23 1:21 PM:
------------------------------------------------------------

[~Nicholas Niu] and I also discussed how to recover once a raft log gap has 
occurred. Currently, once the gap occurs, it cannot be recovered automatically 
and the service cannot start up successfully. When the problem in RATIS-1887 
happened, [~Nicholas Niu] deleted the raft log files, copied the rocksdb 
from the leader to the other followers, and then started the SCM service. But 
later it was found that some new container records were missing in SCM. 
So [~szetszwo], we are thinking about a CLI tool which accepts a (term, index) 
and the path of the raft log files. It truncates the raft logs under this path 
to the specified (term, index). The tool would be run on each raft peer server 
to make the raft log consistent across all three peers. After that, the raft 
service (OM or SCM in Ozone) can be started. What do you think about this 
approach? Or do you have a better idea? 
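
To make the proposal a bit more concrete, here is a minimal sketch of only the planning step of such a tool. It is an illustration, not an existing Ratis CLI or API: the class name `TruncationPlanSketch` and the `plan` method are hypothetical, and the segment naming (`log_<start>-<end>`, `log_inprogress_<start>`) is assumed from the directory listings in the issue description. It looks at file names only, so it ignores the term and does not touch the binary log content; a real tool would have to read the entries to honor the (term, index) pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: decides per segment file what a truncation would do. */
final class TruncationPlanSketch {
  // Naming assumed from the listings in this issue, not from a Ratis API.
  private static final Pattern CLOSED = Pattern.compile("log_(\\d+)-(\\d+)");
  private static final Pattern OPEN = Pattern.compile("log_inprogress_(\\d+)");

  /** Returns one action per segment file so that no index above lastIndexToKeep remains. */
  static List<String> plan(List<String> fileNames, long lastIndexToKeep) {
    List<String> actions = new ArrayList<>();
    for (String name : fileNames) {
      Matcher closed = CLOSED.matcher(name);
      Matcher open = OPEN.matcher(name);
      long start;
      long end;
      if (closed.matches()) {
        start = Long.parseLong(closed.group(1));
        end = Long.parseLong(closed.group(2));
      } else if (open.matches()) {
        start = Long.parseLong(open.group(1));
        end = Long.MAX_VALUE; // open segment: the last index is unknown from the name
      } else {
        continue; // not a log segment, e.g. raft-meta or raft-meta.conf
      }
      if (start > lastIndexToKeep) {
        actions.add("DELETE " + name);
      } else if (end > lastIndexToKeep) {
        actions.add("TRUNCATE " + name + " after index " + lastIndexToKeep);
      } else {
        actions.add("KEEP " + name);
      }
    }
    return actions;
  }

  public static void main(String[] args) {
    // File names copied from one of the healthy peers in the issue description.
    List<String> files = Arrays.asList("log_0-0", "log_1-290", "log_291-374",
        "log_375-376", "log_inprogress_377", "raft-meta", "raft-meta.conf");
    plan(files, 374L).forEach(System.out::println);
  }
}
```

Running the same plan with the same index on every peer is what would make the logs consistent across the three peers. Whether deleting or truncating the segments is safe still depends on what the state machine has already applied.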


> Gap between logs cause service startup failure
> ----------------------------------------------
>
>                 Key: RATIS-1891
>                 URL: https://issues.apache.org/jira/browse/RATIS-1891
>             Project: Ratis
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Priority: Critical
>
> This is the second raft gap problem reported by Guo Hao.
> {code:java}
> 2023-09-08 18:53:47,590 [Listener at test17/9860] ERROR org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SCM start failed with exception
> java.util.concurrent.CompletionException: java.lang.IllegalStateException: gap between start index 375 and first entry to append 377
>         at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>         at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
>         at java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1284)
>         at java.util.concurrent.CompletableFuture$BiRelay.tryFire(CompletableFuture.java:1270)
>         at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>         at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>         at org.apache.ratis.util.ConcurrentUtils.accept(ConcurrentUtils.java:191)
>         at org.apache.ratis.util.ConcurrentUtils.lambda$null$4(ConcurrentUtils.java:180)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: gap between start index 375 and first entry to append 377
>         at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
>         at org.apache.ratis.server.raftlog.segmented.LogSegment.append(LogSegment.java:313)
>         at org.apache.ratis.server.raftlog.segmented.LogSegment.lambda$loadSegment$2(LogSegment.java:165)
>         at org.apache.ratis.server.raftlog.segmented.LogSegment.readSegmentFile(LogSegment.java:138)
>         at org.apache.ratis.server.raftlog.segmented.LogSegment.loadSegment(LogSegment.java:164)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.loadSegment(SegmentedRaftLogCache.java:381)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:241)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:214)
>         at org.apache.ratis.server.raftlog.RaftLogBase.open(RaftLogBase.java:251)
>         at org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:239)
>         at org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:220)
>         at org.apache.ratis.server.impl.ServerState.lambda$new$5(ServerState.java:161)
>         at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62)
>         at org.apache.ratis.server.impl.ServerState.initialize(ServerState.java:177)
>         at org.apache.ratis.server.impl.RaftServerImpl.start(RaftServerImpl.java:338)
>         at org.apache.ratis.util.ConcurrentUtils.accept(ConcurrentUtils.java:188)
>         ... 4 more 
> {code}
> # The gap server directory 
> {code:java}
> $ ll /home/work/ozone/scm.ha.ratis-storage-test4/1d823d9f-3e87-4790-85fc-f1a93f7845e5/current/
> total 4120
> -rw-rw-r-- 1 work work   14567 Sep  8 18:30 log_291-374
> -rw-rw-r-- 1 work work 4194304 Sep  8 18:30 log_inprogress_375
> -rw-rw-r-- 1 work work      50 Sep  8 18:30 raft-meta
> -rw-rw-r-- 1 work work     242 Sep  8 17:29 raft-meta.conf 
> {code}
>  
> The other two servers' directories:
> {code:java}
> $ ll
> total 4168
> -rw-rw-r-- 1 work work      95 Sep  8 12:13 log_0-0
> -rw-rw-r-- 1 work work   39285 Sep  8 17:30 log_1-290
> -rw-rw-r-- 1 work work   14567 Sep  8 17:35 log_291-374
> -rw-rw-r-- 1 work work     271 Sep  8 17:50 log_375-376
> -rw-rw-r-- 1 work work 4194304 Sep  8 19:01 log_inprogress_377
> -rw-rw-r-- 1 work work      86 Sep  8 18:29 raft-meta
> -rw-rw-r-- 1 work work     242 Sep  8 18:29 raft-meta.conf 
> {code}
> {code:java}
> $ ll
> total 4168
> -rw-rw-r-- 1 work work      95 Sep  8 13:15 log_0-0
> -rw-rw-r-- 1 work work   39285 Sep  8 17:29 log_1-290
> -rw-rw-r-- 1 work work   14567 Sep  8 17:35 log_291-374
> -rw-rw-r-- 1 work work     271 Sep  8 18:29 log_375-376
> -rw-rw-r-- 1 work work 4194304 Sep  8 19:01 log_inprogress_377
> -rw-rw-r-- 1 work work      86 Sep  8 18:29 raft-meta
> -rw-rw-r-- 1 work work     242 Sep  8 18:29 raft-meta.conf 
> {code}
>  
> Related Configurations:
> {code:java}
> <property>
> <name>hdds.ratis.raft.server.log.unsafe-flush.enabled</name>
> <value>false</value>
> </property>
> <property>
> <name>hdds.ratis.raft.server.log.async-flush.enabled</name>
> <value>false</value>
> </property> 
> {code}
>  
> The scenario in which the gap occurred this time is as follows:
> 1. Shut down SCM; the shutdown exceeded the 60s timeout, so the process was killed with kill -9.
> 2. Restart SCM; this error occurs.
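
For readers unfamiliar with the failure, the invariant behind the exception can be sketched as follows. This is not the Ratis source: `SegmentGapCheckSketch` and `checkContiguous` are hypothetical names, and only the first-entry message text is copied from the stack trace above. The sketch assumes that a segment opened at a start index must see exactly that index as its first entry and consecutive indices afterwards, which is what the reported message suggests for `log_inprogress_375`.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the contiguity check implied by the error message. */
final class SegmentGapCheckSketch {
  /**
   * Throws IllegalStateException unless the entry indices begin at startIndex
   * and increase by exactly one.
   */
  static void checkContiguous(long startIndex, List<Long> entryIndices) {
    long expected = startIndex;
    for (long index : entryIndices) {
      if (index != expected) {
        if (expected == startIndex) {
          // Message text as reported in the stack trace of this issue.
          throw new IllegalStateException("gap between start index " + startIndex
              + " and first entry to append " + index);
        }
        // Message invented for this sketch; it is not taken from Ratis.
        throw new IllegalStateException("gap between entries " + (expected - 1)
            + " and " + index);
      }
      expected++;
    }
  }

  public static void main(String[] args) {
    // Healthy peers: segment log_375-376 holds entries 375 and 376.
    checkContiguous(375L, Arrays.asList(375L, 376L));
    try {
      // Failing peer: the segment named log_inprogress_375 starts with entry 377.
      checkContiguous(375L, Arrays.asList(377L));
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Note that on the failing server the segment file names themselves are contiguous (`log_291-374`, then `log_inprogress_375`), so a check on file names alone would not catch this case; the entries inside the file have to be read.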



