[ 
https://issues.apache.org/jira/browse/RATIS-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764790#comment-17764790
 ] 

Tsz-wo Sze commented on RATIS-1887:
-----------------------------------

bq. ...  Found a gap between logs: the last log segment log-784848_809981 ended 
at 809981 but the next log segment log-822560_856038 started at 822560

Is it the case that the segment log-784848_809981 mentioned above was actually 
the file log-784848_822559?  It seems that the file was truncated but not moved.
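
To make the gap concrete, here is a minimal sketch that scans closed segment file names and reports where consecutive index ranges are not contiguous. It is not the Ratis implementation (the real check is SegmentedRaftLogCache.validateAdding in the stack trace); the class name SegmentGapCheck, the method findGaps, and the assumption that closed segments are named log-<start>_<end> are illustrative only, taken from the names quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SegmentGapCheck {
  // Assumption: closed segments are named log-<startIndex>_<endIndex>.
  private static final Pattern CLOSED = Pattern.compile("log-(\\d+)_(\\d+)");

  /** Returns one message for each pair of consecutive segments that is not contiguous. */
  public static List<String> findGaps(List<String> fileNames) {
    List<long[]> ranges = new ArrayList<>();
    for (String name : fileNames) {
      Matcher m = CLOSED.matcher(name);
      if (m.matches()) {
        ranges.add(new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))});
      }
    }
    // Order segments by start index before comparing neighbours.
    ranges.sort(Comparator.comparingLong(r -> r[0]));

    List<String> gaps = new ArrayList<>();
    for (int i = 1; i < ranges.size(); i++) {
      long prevEnd = ranges.get(i - 1)[1];
      long nextStart = ranges.get(i)[0];
      if (nextStart != prevEnd + 1) {
        gaps.add("previous segment ended at " + prevEnd
            + " but next segment started at " + nextStart);
      }
    }
    return gaps;
  }

  public static void main(String[] args) {
    // The two ranges from the error message: reports a gap.
    System.out.println(findGaps(Arrays.asList("log-784848_809981", "log-822560_856038")));
    // The range implied by the suspected original file name: contiguous.
    System.out.println(findGaps(Arrays.asList("log-784848_822559", "log-822560_856038")));
  }
}
```

With the end index 822559 the two segments line up, which is why the original file name is the interesting question here.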

{code}
        LOG.info("{}: Truncated log file {} to length {} and moved it to {}", name,
            fileToTruncate, segments.getToTruncate().getTargetLength(), dstFile);
{code}
Did you see this log message?  If not, the server seems to have been killed right 
after 
[truncate|https://github.com/apache/ratis/blob/b8ce6d1f6ea37ed3ff9f6e888d2357fe48490567/ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLogWorker.java#L664]
 but before 
[move|https://github.com/apache/ratis/blob/b8ce6d1f6ea37ed3ff9f6e888d2357fe48490567/ratis-server/src/main/java/org/apache/ratis/server/raftlog/segmented/SegmentedRaftLogWorker.java#L670].
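
The suspected crash window can be illustrated with a small, self-contained sketch. This is not the SegmentedRaftLogWorker code; the class TruncateThenMove, its methods, and the file names log-0_99 and log-0_49 are hypothetical. It only shows that truncating a file and renaming it are two separate file system operations, so a process killed between them leaves a file whose name advertises the old end index while its content is already shorter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class TruncateThenMove {
  /** Step 1: shrink the file in place. Its name is unchanged. */
  public static void truncate(Path file, long targetLength) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
      ch.truncate(targetLength);
      ch.force(true); // flush the new length to disk
    }
  }

  /** Step 2: rename the file so that its name matches the new content. */
  public static void move(Path src, Path dst) throws IOException {
    Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
  }

  /** Runs step 1 only, as if the process were killed before step 2. */
  public static String simulateCrashAfterTruncate() {
    try {
      Path dir = Files.createTempDirectory("segments");
      Path src = dir.resolve("log-0_99"); // hypothetical segment holding 100 bytes
      Files.write(src, new byte[100]);
      truncate(src, 50);
      // A crash here means move(src, dir.resolve("log-0_49")) never runs.
      return src.getFileName() + " length=" + Files.size(src);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(simulateCrashAfterTruncate());
  }
}
```

On restart, a loader that trusts the file content would see a segment ending earlier than its file name suggests, which would be consistent with the gap reported in this issue.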



> Gap between segment log
> -----------------------
>
>                 Key: RATIS-1887
>                 URL: https://issues.apache.org/jira/browse/RATIS-1887
>             Project: Ratis
>          Issue Type: Bug
>            Reporter: GuoHao
>            Priority: Critical
>         Attachments: image-2023-09-08-10-18-36-198.png
>
>
>  
> My version of Ratis already includes the fix from RATIS-1763 
> (https://issues.apache.org/jira/browse/RATIS-1763), and I am using a new 
> Raft server.
>  
> Description:
> 1. I am using Ratis version 2.5.1.
> 2. The application is Ozone 1.3.0 with SCM HA.
>  
> SCM error log:
> {code:java}
> Caused by: java.lang.IllegalStateException: Found a gap between logs: the last log segment log-784848_809981 ended at 809981 but the next log segment log-822560_856038 started at 822560
>         at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:72)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.validateAdding(SegmentedRaftLogCache.java:424)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.addSegment(SegmentedRaftLogCache.java:431)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogCache.loadSegment(SegmentedRaftLogCache.java:384)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:241)
>         at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:214)
>         at org.apache.ratis.server.raftlog.RaftLogBase.open(RaftLogBase.java:251)
>         at org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:239)
>         at org.apache.ratis.server.impl.ServerState.initRaftLog(ServerState.java:220)
>         at org.apache.ratis.server.impl.ServerState.lambda$new$5(ServerState.java:161)
>         at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62)
>         at org.apache.ratis.server.impl.ServerState.initialize(ServerState.java:177)
>         at org.apache.ratis.server.impl.RaftServerImpl.start(RaftServerImpl.java:338)
>         at org.apache.ratis.util.ConcurrentUtils.accept(ConcurrentUtils.java:188)
> {code}
> segment log:
>  
> !image-2023-09-08-10-18-36-198.png!
>  
> This segment log file was modified later than the file with the larger 
> index.
> The file size of this file seems to be larger than the other files, but it's 
> not as big as the other files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
