[ 
https://issues.apache.org/jira/browse/RATIS-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938791#comment-17938791
 ] 

Xinhao GU commented on RATIS-2271:
----------------------------------

Hi [~szetszwo], I would like to work on a fix for this issue.

> Leadership Loss Causes ClosedByInterruptException and NullPointerException in 
> LogAppender Thread
> ------------------------------------------------------------------------------------------------
>
>                 Key: RATIS-2271
>                 URL: https://issues.apache.org/jira/browse/RATIS-2271
>             Project: Ratis
>          Issue Type: Improvement
>          Components: gRPC, Leader
>            Reporter: Xinhao GU
>            Assignee: Sumit Agrawal
>            Priority: Major
>         Attachments: image-2025-03-25-14-40-32-711.png, 
> image-2025-03-25-14-49-11-998.png, image-2025-03-25-15-05-41-424.png, 
> image-2025-03-25-15-06-43-276.png, image-2025-03-25-15-15-50-750.png
>
>
> *After a leader loses leadership due to heartbeat timeout with a majority of 
> followers, it forcibly interrupts the {{GrpcLogAppender}} thread.*
> This abrupt termination leads to two critical exceptions during log file 
> reads:
> 1. {{ClosedByInterruptException}} when initializing 
> {{SegmentedRaftLogInputStream}}.
> {code:java}
> 2025-01-18 00:29:31,472 [13@group-00020000000F->15-GrpcLogAppender-LogAppenderDaemon] ERROR o.a.r.s.r.s.SegmentedRaftLogInputStream:107 - caught exception initializing log_455-480
> java.nio.channels.ClosedByInterruptException: null
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>     at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:164)
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogReader$LimitedInputStream.read(SegmentedRaftLogReader.java:96)
>     at java.io.DataInputStream.read(DataInputStream.java:149)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogReader.verifyHeader(SegmentedRaftLogReader.java:172)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogInputStream.init(SegmentedRaftLogInputStream.java:86)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogInputStream.nextEntry(SegmentedRaftLogInputStream.java:105)
>     at org.apache.ratis.server.raftlog.segmented.LogSegment.readSegmentFile(LogSegment.java:132)
>     at org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:238)
>     at org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:348)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:296)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.getEntryWithData(SegmentedRaftLog.java:301)
>     at org.apache.ratis.server.leader.LogAppenderBase.newAppendEntriesRequest(LogAppenderBase.java:240)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:387)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.run(GrpcLogAppender.java:262)
>     at org.apache.ratis.server.leader.LogAppenderDaemon.run(LogAppenderDaemon.java:80)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
> 2. A cascading {{NullPointerException}} in {{LogSegment.loadCache()}} due to 
> incomplete log loading
> {code:java}
> 2025-01-18 00:29:32,055 [13@group-00020000000F->15-GrpcLogAppender-LogAppenderDaemon] WARN  o.a.r.s.l.LogAppenderDaemon:89 - 13@group-00020000000F->15-GrpcLogAppender-LogAppenderDaemon failed
> org.apache.ratis.server.raftlog.RaftLogIOException: java.lang.NullPointerException
>     at org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:350)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.get(SegmentedRaftLog.java:296)
>     at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.getEntryWithData(SegmentedRaftLog.java:301)
>     at org.apache.ratis.server.leader.LogAppenderBase.newAppendEntriesRequest(LogAppenderBase.java:240)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:387)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.run(GrpcLogAppender.java:262)
>     at org.apache.ratis.server.leader.LogAppenderDaemon.run(LogAppenderDaemon.java:80)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException: null
>     at java.util.Objects.requireNonNull(Objects.java:203)
>     at org.apache.ratis.server.raftlog.segmented.LogSegment$LogEntryLoader.load(LogSegment.java:247)
>     at org.apache.ratis.server.raftlog.segmented.LogSegment.loadCache(LogSegment.java:348)
>     ... 7 common frames omitted
> {code}
>  
> *The relevant code is:* 
> !image-2025-03-25-14-40-32-711.png!
> !image-2025-03-25-15-15-50-750.png|width=1001,height=633!
>  
> *Expected behavior:*
>  * Graceful termination of {{GrpcLogAppender}} thread without interrupting 
> in-progress I/O operations.
>  * Proper resource cleanup (e.g., file handles) before thread termination.
>  
>  
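
For readers who want to see the mechanism outside Ratis, here is a minimal sketch using only plain JDK NIO. It is not the Ratis fix and does not use any Ratis API: the class {{InterruptVsFlagDemo}} and both method names are hypothetical, chosen for illustration. The first method shows that reading from a {{FileChannel}} on a thread whose interrupt flag is set closes the channel and raises {{ClosedByInterruptException}}, as in the first stack trace quoted above. The second shows one possible alternative, assuming a cooperative stop flag that the worker checks between reads, so that a read already in progress is allowed to complete.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical demo class; none of these names exist in Ratis. */
public class InterruptVsFlagDemo {

  /**
   * Reads from a file channel while the current thread's interrupt flag is set.
   * Returns true if the read failed with ClosedByInterruptException and the
   * channel was closed as a side effect.
   */
  public static boolean readWhileInterrupted(Path file) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      Thread.currentThread().interrupt(); // simulate the forced interrupt
      try {
        ch.read(ByteBuffer.allocate(16));
        return false;
      } catch (ClosedByInterruptException e) {
        return !ch.isOpen(); // the interrupt closed the channel
      } finally {
        Thread.interrupted(); // clear the flag so the caller is unaffected
      }
    }
  }

  /**
   * Cooperative alternative: the flag is checked only between reads, so a
   * read that has started always finishes and the channel is closed by
   * try-with-resources. Returns the number of bytes read.
   */
  public static int readUntilStopped(Path file, AtomicBoolean running) throws IOException {
    int total = 0;
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      ByteBuffer buf = ByteBuffer.allocate(4);
      while (running.get()) {
        buf.clear();
        int n = ch.read(buf);
        if (n < 0) {
          break; // end of file
        }
        total += n;
      }
    }
    return total;
  }

  public static void main(String[] args) throws Exception {
    Path tmp = Files.createTempFile("segment", ".log");
    try {
      Files.write(tmp, new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
      System.out.println("closed by interrupt: " + readWhileInterrupted(tmp));
      System.out.println("bytes read cooperatively: "
          + readUntilStopped(tmp, new AtomicBoolean(true)));
    } finally {
      Files.delete(tmp);
    }
  }
}
```

A flag on its own cannot wake a thread that is parked in a blocking call, so a real change would probably need to combine it with whatever wake-up mechanism the appender already uses. The sketch only illustrates why avoiding the interrupt during file reads keeps the channel usable.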



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
