[
https://issues.apache.org/jira/browse/RATIS-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Song Ziyang updated RATIS-1587:
-------------------------------
Description:
*Bug Description*
Currently, when a follower installs a snapshot from the leader, the leader
divides the snapshot files into a sequence of fixed-size chunks (16MB) and
sends each chunk through rpc.
!image-2022-05-31-08-56-35-373.png!
Only the last rpc request in the sequence is tagged with 'Done'.
!image-2022-05-31-08-58-44-717.png!
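The chunking described above can be sketched roughly as follows. This is only an illustration for a single file, not the Ratis implementation: the class name `ChunkPlanSketch`, the `Chunk` type, and the `plan` method are hypothetical, and only the 16MB chunk size and the 'Done' tag on the last request come from this report.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkPlanSketch {
  // 16MB chunk size, as stated in the bug description.
  public static final long CHUNK_SIZE = 16L * 1024 * 1024;

  /** Hypothetical description of one chunk request: where it starts, how long it is, whether it is the last. */
  public static final class Chunk {
    public final long offset;
    public final long length;
    public final boolean done;

    Chunk(long offset, long length, boolean done) {
      this.offset = offset;
      this.length = length;
      this.done = done;
    }
  }

  /** Splits a file of the given size into fixed-size chunks; only the final chunk has done == true. */
  public static List<Chunk> plan(long fileSize, long chunkSize) {
    List<Chunk> chunks = new ArrayList<>();
    long offset = 0;
    do {
      long length = Math.min(chunkSize, fileSize - offset);
      boolean last = offset + length >= fileSize;
      chunks.add(new Chunk(offset, length, last));
      offset += length;
    } while (offset < fileSize);
    return chunks;
  }
}
```

Under these assumptions, a 40MB file yields three requests (16MB, 16MB, 8MB), and only the third carries the done flag.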
However, when the follower handles this sequence of rpcs, it creates *a random
temp dir for each rpc request, stores the chunk in it, and only moves the last
chunk from the tmp dir to the sm dir.*
*!image-2022-05-31-09-00-41-286.png!*
Thus, when a snapshot contains multiple files or a single file larger than
16MB, InstallSnapshot fails because only the last chunk is stored in the /sm
dir and the others remain scattered across many tmp dirs.
*How To Fix*
Instead of using a random uuid to name the tmp dir every time, it is possible
to use the *request-uuid* to name the tmp dir. This request-id is produced by
the leader and is shared among the sequence of requests, so every chunk of one
snapshot lands in the same tmp dir.
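
A minimal sketch of the proposed fix, assuming the follower receives the shared request id with every chunk. The class `SnapshotTmpDirSketch` and the methods `tmpDirFor` and `receiveChunk` are hypothetical names for illustration and do not correspond to actual Ratis code; only the idea of naming the tmp dir after the request id comes from this report.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.stream.Stream;

public class SnapshotTmpDirSketch {
  /** Hypothetical helper: the tmp dir is derived from the request id, not from a fresh random uuid. */
  public static Path tmpDirFor(Path tmpRoot, String requestId) throws IOException {
    return Files.createDirectories(tmpRoot.resolve(requestId));
  }

  /**
   * Hypothetical chunk handler: appends the chunk to its file in the shared tmp dir.
   * When the request is tagged done, every file collected so far is moved to the sm dir.
   */
  public static void receiveChunk(Path tmpRoot, Path smDir, String requestId,
      String fileName, byte[] data, boolean done) throws IOException {
    Path tmpDir = tmpDirFor(tmpRoot, requestId);
    Files.write(tmpDir.resolve(fileName), data,
        StandardOpenOption.CREATE, StandardOpenOption.APPEND);

    if (done) {
      Files.createDirectories(smDir);
      try (Stream<Path> files = Files.list(tmpDir)) {
        for (Path file : (Iterable<Path>) files::iterator) {
          Files.move(file, smDir.resolve(file.getFileName()),
              StandardCopyOption.REPLACE_EXISTING);
        }
      }
      // The tmp dir is empty after the moves, so it can be removed.
      Files.delete(tmpDir);
    }
  }
}
```

Because all requests in one sequence resolve to the same directory, the earlier chunks are still present when the request tagged 'Done' arrives, and they are moved together with the last chunk.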
was:
*Bug Description*
Currently, when follower install snapshot from leader, leader will divide
snapshot files into a sequence of fixed size chunks (16MB) and send each
through rpc.
!image-2022-05-31-08-56-35-373.png!
only the last rpc request in the sequence is tagged with 'Done'.
!image-2022-05-31-08-58-44-717.png!
However, when follower handles these sequence of rpcs, it will create *a random
temp dir for each rpc request, store the chunk in, and only move the last chunk
from tmp dir to sm dir.*
*!image-2022-05-31-09-00-41-286.png!*
Thus, ** when snapshot contains multiple files or a single file larger than
16MB, ** InstallSnapshot will fail because only last chunk is stored in the /sm
dir and others are remained in many tmp dirs.
*How To Fix*
Instead of use random uuid to name tmp dir every time, it is possible to use
the request-id to name the tmp dir. request-id is produced by leader and is
shared among the sequence of requests.
> InstallSnapshot fails when snapshot has multiple chunks
> -------------------------------------------------------
>
> Key: RATIS-1587
> URL: https://issues.apache.org/jira/browse/RATIS-1587
> Project: Ratis
> Issue Type: Bug
> Components: server
> Affects Versions: 2.3.0
> Reporter: Song Ziyang
> Priority: Major
> Attachments: image-2022-05-31-08-56-35-373.png,
> image-2022-05-31-08-58-44-717.png, image-2022-05-31-09-00-41-286.png
>
>
> *Bug Description*
> Currently, when a follower installs a snapshot from the leader, the leader
> divides the snapshot files into a sequence of fixed-size chunks (16MB) and
> sends each chunk through rpc.
> !image-2022-05-31-08-56-35-373.png!
> Only the last rpc request in the sequence is tagged with 'Done'.
> !image-2022-05-31-08-58-44-717.png!
> However, when the follower handles this sequence of rpcs, it creates *a
> random temp dir for each rpc request, stores the chunk in it, and only moves
> the last chunk from the tmp dir to the sm dir.*
> *!image-2022-05-31-09-00-41-286.png!*
> Thus, when a snapshot contains multiple files or a single file larger than
> 16MB, InstallSnapshot fails because only the last chunk is stored in the /sm
> dir and the others remain scattered across many tmp dirs.
>
> *How To Fix*
> Instead of using a random uuid to name the tmp dir every time, it is possible
> to use the *request-uuid* to name the tmp dir. This request-id is produced by
> the leader and is shared among the sequence of requests, so every chunk of one
> snapshot lands in the same tmp dir.
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)