Here is the JIRA issue: https://issues.apache.org/jira/browse/RATIS-1564
Please take a look. :)
Thanks

> On Apr 11, 2022, at 16:36, Tsz Wo Sze <[email protected]> wrote:
> 
> Hi William,
> 
> Thanks a lot for reporting the bug.  Could you file a JIRA?
> 
> Tsz-Wo
> 
> 
> On Mon, Apr 11, 2022 at 4:24 PM 宋子阳 <[email protected]> wrote:
> 
>> Hi folks,
>> 
>> I’ve discovered a bug in the installSnapshot RPC handler that causes the
>> follower to reply with success when the installation actually failed.
>> 
>> org.apache.ratis.server.storage.SnapshotManager.java
>> 
>> public void installSnapshot(StateMachine stateMachine,
>>     InstallSnapshotRequestProto request) throws IOException {
>>   ...
>>   if (snapshotChunkRequest.getDone()) {
>>     LOG.info("Install snapshot is done, renaming tnp dir:{} to:{}",
>>         tmpDir, dir.getStateMachineDir());
>>     dir.getStateMachineDir().delete(); // Here delete() may fail
>>     tmpDir.renameTo(dir.getStateMachineDir());
>>   }
>> }
>> 
>> 
>> After the follower receives the entire snapshot, it first stores the files
>> in a tmp dir and then renames that dir to StateMachineDir. However, when
>> StateMachineDir is not empty, delete() fails, and renameTo() fails too.
>> In this scenario, the latest snapshot files remain in the tmp dir and the
>> state machine cannot load this snapshot.
>> 
>> StateMachineDir can be non-empty because previously installed snapshots
>> are stored there and may not have been cleaned up due to the retention
>> policy. The problem then appears the next time the leader installs a
>> snapshot.
>> 
>> Thanks!
>> 
>> William Song
>> Apache IoTDB
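
The failure mode described in the quoted report can be reproduced outside Ratis with a small, self-contained sketch. The class and method names below (SnapshotRenameSketch, uncheckedSwap, checkedSwap) are hypothetical and are not part of Ratis. The checked variant is only one possible way to surface the error: it deletes the old directory contents wholesale, which ignores any retention policy, and it is not claimed to be the fix adopted in RATIS-1564. File.renameTo() is documented as platform-dependent, so the exact behavior may vary by operating system.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration; not Ratis code.
class SnapshotRenameSketch {

  /** Mirrors the quoted logic: the result of delete() is ignored. */
  static boolean uncheckedSwap(File tmpDir, File stateMachineDir) {
    // File.delete() returns false (no exception) for a non-empty directory.
    stateMachineDir.delete();
    // With the target still present and non-empty, the rename is expected to fail.
    return tmpDir.renameTo(stateMachineDir);
  }

  /** Assumed alternative: clear the target recursively and throw on any failure. */
  static void checkedSwap(Path tmpDir, Path stateMachineDir) throws IOException {
    if (Files.exists(stateMachineDir)) {
      List<Path> toDelete;
      try (Stream<Path> paths = Files.walk(stateMachineDir)) {
        // Reverse order puts children before their parent directories.
        toDelete = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      }
      for (Path p : toDelete) {
        Files.delete(p); // throws IOException instead of returning false
      }
    }
    // Throws if the move fails, so the caller cannot report success by mistake.
    Files.move(tmpDir, stateMachineDir);
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("snapshot-sketch");
    Path stateMachineDir = Files.createDirectory(root.resolve("sm"));
    Path tmpDir = Files.createDirectory(root.resolve("tmp"));
    Files.write(stateMachineDir.resolve("snapshot.old"), new byte[] {1});
    Files.write(tmpDir.resolve("snapshot.new"), new byte[] {2});

    boolean renamed = uncheckedSwap(tmpDir.toFile(), stateMachineDir.toFile());
    System.out.println("unchecked rename succeeded: " + renamed);

    checkedSwap(tmpDir, stateMachineDir);
    System.out.println("new snapshot in place: "
        + Files.exists(stateMachineDir.resolve("snapshot.new")));
  }
}
```

The design point is that the java.nio.file.Files methods report failures as exceptions, whereas java.io.File.delete() and renameTo() only return a boolean that is easy to drop, which is how a failed install can end up being reported as a success.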
