Here is the JIRA issue: https://issues.apache.org/jira/browse/RATIS-1564
Please take a look :) Thanks
> On Apr 11, 2022, at 16:36, Tsz Wo Sze <[email protected]> wrote:
>
> Hi William,
>
> Thanks a lot for reporting the bug. Could you file a JIRA?
>
> Tsz-Wo
>
>
> On Mon, Apr 11, 2022 at 4:24 PM 宋子阳 <[email protected]> wrote:
>
>> Hi folks,
>>
>> I’ve discovered a bug in the installSnapshot RPC handler that causes the
>> follower to reply success when the installation actually failed.
>>
>> org.apache.ratis.server.storage.SnapshotManager.java
>>
>> public void installSnapshot(StateMachine stateMachine,
>>     InstallSnapshotRequestProto request) throws IOException {
>>   ...
>>   if (snapshotChunkRequest.getDone()) {
>>     LOG.info("Install snapshot is done, renaming tnp dir:{} to:{}",
>>         tmpDir, dir.getStateMachineDir());
>>     dir.getStateMachineDir().delete(); // Here delete() may fail
>>     tmpDir.renameTo(dir.getStateMachineDir());
>>   }
>> }
>>
>>
>> After the follower receives the entire snapshot data, it first stores
>> the file in a tmp dir, then renames the tmp dir to StateMachineDir.
>> However, when StateMachineDir is not empty, delete() fails, and
>> renameTo() fails too. Neither return value is checked, so in this
>> scenario the latest snapshot file remains in the tmp dir and the
>> state machine cannot fetch this snapshot.
>>
>> StateMachineDir can be non-empty because previously installed snapshots
>> are stored there and may not have been cleaned up due to the retention
>> policy. The next time the leader wants to install a snapshot, this
>> situation will occur.
>>
>> Thanks!
>>
>> William Song
>> Apache IoTDB
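
For illustration, here is a minimal sketch of how the directory swap described above could surface failures instead of silently ignoring them. It is not the fix applied in RATIS-1564 and does not use any Ratis API. The class name SnapshotDirSwap, the helper method names, and the snapshot file names are all hypothetical. It also assumes that discarding the old contents of the state machine directory is acceptable, which may conflict with a retention policy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Ratis.
public class SnapshotDirSwap {

  /** Recursively deletes dir if it exists; throws IOException if any entry cannot be removed. */
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    List<Path> entries;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order so that children are deleted before their parent directory.
      entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : entries) {
      Files.delete(p); // unlike File.delete(), this throws on failure
    }
  }

  /** Replaces targetDir with tmpDir, propagating any failure to the caller. */
  static void replaceDirectory(Path tmpDir, Path targetDir) throws IOException {
    deleteRecursively(targetDir);
    Files.move(tmpDir, targetDir); // unlike File.renameTo(), this throws on failure
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("snapshot-demo");
    Path sm = Files.createDirectory(root.resolve("sm"));
    Path tmp = Files.createDirectory(root.resolve("tmp"));
    // A non-empty target directory, as in the scenario reported above.
    Files.write(sm.resolve("snapshot.1_10"), "old".getBytes());
    Files.write(tmp.resolve("snapshot.2_20"), "new".getBytes());

    replaceDirectory(tmp, sm);

    System.out.println(Files.exists(sm.resolve("snapshot.2_20"))); // true
    System.out.println(Files.exists(sm.resolve("snapshot.1_10"))); // false
    System.out.println(Files.exists(tmp));                         // false
    deleteRecursively(root);
  }
}
```

Because both steps throw IOException on failure, a caller such as an installSnapshot handler could report the failure to the leader rather than replying success.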
