Song Ziyang created RATIS-1587:
----------------------------------
Summary: InstallSnapshot fails when snapshot has multiple chunks
Key: RATIS-1587
URL: https://issues.apache.org/jira/browse/RATIS-1587
Project: Ratis
Issue Type: Bug
Components: server
Affects Versions: 2.3.0
Reporter: Song Ziyang
Attachments: image-2022-05-31-08-56-35-373.png,
image-2022-05-31-08-58-44-717.png, image-2022-05-31-09-00-41-286.png
*Bug Description*
Currently, when a follower installs a snapshot from the leader, the leader
divides the snapshot files into a sequence of fixed-size chunks (16MB) and sends
each chunk in a separate RPC request.
!image-2022-05-31-08-56-35-373.png!
Only the last RPC request in the sequence is tagged with 'Done'.
!image-2022-05-31-08-58-44-717.png!
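A minimal sketch of the chunking described above, for illustration only. It is
not the Ratis code: the class ChunkPlanSketch, its Chunk type and the plan()
method are hypothetical names, and the 16MB chunk size is taken from this
report. It shows that a file larger than one chunk, or a snapshot with several
files, produces several requests, of which only the final one is marked done.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkPlanSketch {
    // 16MB, the chunk size mentioned in this report.
    static final long CHUNK_SIZE = 16L * 1024 * 1024;

    // Hypothetical description of one chunk request; not a Ratis type.
    static final class Chunk {
        final int fileIndex;
        final long offset;
        final long length;
        boolean done; // true only for the last request of the whole sequence

        Chunk(int fileIndex, long offset, long length) {
            this.fileIndex = fileIndex;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return "file=" + fileIndex + " offset=" + offset
                + " length=" + length + " done=" + done;
        }
    }

    // Splits every file into chunks of at most chunkSize bytes, in order.
    static List<Chunk> plan(long[] fileSizes, long chunkSize) {
        List<Chunk> chunks = new ArrayList<>();
        for (int f = 0; f < fileSizes.length; f++) {
            for (long offset = 0; offset < fileSizes[f]; offset += chunkSize) {
                long length = Math.min(chunkSize, fileSizes[f] - offset);
                chunks.add(new Chunk(f, offset, length));
            }
        }
        if (!chunks.isEmpty()) {
            // Only the final request in the sequence carries the done flag.
            chunks.get(chunks.size() - 1).done = true;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // One 40MB file: more than one chunk, so more than one request.
        for (Chunk c : plan(new long[] {40L * 1024 * 1024}, CHUNK_SIZE)) {
            System.out.println(c);
        }
    }
}
```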
However, when the follower handles this sequence of RPCs, it creates *a random
temp dir for each RPC request, stores the chunk in it, and moves only the last
chunk from the tmp dir to the sm dir.*
*!image-2022-05-31-09-00-41-286.png!*
Thus, *when the snapshot contains multiple files or a single file larger than
16MB,* InstallSnapshot fails because only the last chunk is stored in the /sm
dir, while the other chunks are left behind in separate tmp dirs.
*How To Fix*
Instead of using a random UUID to name the tmp dir for every request, we could
name the tmp dir after the request-id. The request-id is produced by the leader
and is shared by all requests in the sequence, so every chunk would be stored in
the same tmp dir.
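The difference between the two naming schemes can be sketched as follows. This
is a simplified simulation under stated assumptions, not the Ratis
implementation: TmpDirSketch, handleChunk() and installedChunks() are
hypothetical names, the tmp/ and sm/ layout is reduced to two sibling
directories, and "request-1" stands in for the leader's request-id. The temp
directory it creates is not cleaned up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TmpDirSketch {
    // Hypothetical follower-side handler: stores one chunk under
    // tmp/<tmpDirName> and, on the final request, moves the files found in
    // that directory into sm/.
    static void handleChunk(Path root, String tmpDirName, int index, boolean done)
            throws IOException {
        Path tmpDir = Files.createDirectories(root.resolve("tmp").resolve(tmpDirName));
        Files.write(tmpDir.resolve("chunk-" + index), new byte[] {(byte) index});
        if (!done) {
            return;
        }
        Path smDir = Files.createDirectories(root.resolve("sm"));
        List<Path> stored;
        try (Stream<Path> files = Files.list(tmpDir)) {
            stored = files.collect(Collectors.toList());
        }
        for (Path p : stored) {
            Files.move(p, smDir.resolve(p.getFileName().toString()));
        }
    }

    // Sends three chunks and returns how many of them end up in sm/.
    static int installedChunks(boolean nameByRequestId) {
        try {
            Path root = Files.createTempDirectory("snapshot-sketch");
            String requestId = "request-1"; // hypothetical id shared by the sequence
            int total = 3;
            for (int i = 0; i < total; i++) {
                String dir = nameByRequestId ? requestId : UUID.randomUUID().toString();
                handleChunk(root, dir, i, i == total - 1);
            }
            try (Stream<Path> files = Files.list(root.resolve("sm"))) {
                return (int) files.count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("random dir per request: " + installedChunks(false));
        System.out.println("dir named by request id: " + installedChunks(true));
    }
}
```

With a random directory per request, only the chunk from the final request is
found when the move happens. With a directory named after the shared id, all
three chunks are found and moved together.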
--
This message was sent by Atlassian Jira
(v8.20.7#820007)