Wei-Chiu Chuang created RATIS-1100:
--------------------------------------

             Summary: Make raft log gap error easier to troubleshoot
                 Key: RATIS-1100
                 URL: https://issues.apache.org/jira/browse/RATIS-1100
             Project: Ratis
          Issue Type: Improvement
    Affects Versions: 1.0.0
            Reporter: Wei-Chiu Chuang


Upon restart, Ozone Manager failed to start and emitted the following error:

 
{code:java}
2020-10-19 12:04:10,639 INFO org.apache.ratis.server.raftlog.segmented.LogSegment: Successfully read 7553 entries from segment file /var/lib/hadoop-ozone/fake_om/ratis/1b9ac7ae-cd52-3ab1-8089-942f8267f22a/current/log_25657965-25665517
2020-10-19 12:04:10,639 ERROR org.apache.hadoop.ozone.om.OzoneManagerStarter: OM start failed with exception
java.io.IOException: java.lang.IllegalStateException
 at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
 at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
 at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70)
 at org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:289)
 at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:301)
 at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.start(OzoneManagerRatisServer.java:367)
 at org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:1138)
 at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:125)
 at org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:79)
 at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:67)
 at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:38)
 at picocli.CommandLine.executeUserObject(CommandLine.java:1933)
 at picocli.CommandLine.access$1100(CommandLine.java:145)
 at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
 at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
 at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
 at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2152)
 at picocli.CommandLine.parseWithHandlers(CommandLine.java:2530)
 at picocli.CommandLine.parseWithHandler(CommandLine.java:2465)
 at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:96)
 at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:87)
 at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:51)
Caused by: java.lang.IllegalStateException
 at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:36)
 at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.validateAdding(SegmentedRaftLogCache.java:400)
 at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.addSegment(SegmentedRaftLogCache.java:405)
 at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.loadSegment(SegmentedRaftLogCache.java:367)
 at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:249)
 at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:217)
 at org.apache.ratis.server.raftlog.RaftLog.open(RaftLog.java:276)
 at org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:191)
 at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:121)
 at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:123)
 at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:213){code}
 

After reading the code and checking the Ratis log directory, I realized there 
is a gap in the Ratis log segment files (7659964 vs 25657965). 

 

Filing this Jira to make the error message easier to understand, without 
having to read the code.
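
As a rough illustration of the kind of message that would help, here is a 
minimal sketch of a contiguity check that names the indices involved. The class 
and method names are hypothetical and this is not the actual Ratis code. It also 
assumes that 7659964 is the last index of the preceding segment, which I have 
not confirmed beyond the file names.

```java
/** Hypothetical helper for illustration only; not part of Ratis. */
final class SegmentGapCheck {
  private SegmentGapCheck() {}

  /**
   * Verifies that the segment being added starts right after the previous one.
   * Throws an IllegalStateException whose message states both indices, so the
   * gap is visible without reading the code.
   */
  static void validateAdding(long previousEndIndex, long nextStartIndex,
      String nextSegmentName) {
    final long expectedStart = previousEndIndex + 1;
    if (nextStartIndex != expectedStart) {
      throw new IllegalStateException(String.format(
          "Gap in raft log segments: %s starts at index %d, but the previous "
              + "segment ends at index %d (expected start index %d)",
          nextSegmentName, nextStartIndex, previousEndIndex, expectedStart));
    }
  }

  public static void main(String[] args) {
    try {
      // Indices taken from this report; the role of 7659964 is an assumption.
      validateAdding(7659964L, 25657965L, "log_25657965-25665517");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a message like this, the bare IllegalStateException in the stack trace 
above would instead point directly at the missing range of log indices.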

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)