[ https://issues.apache.org/jira/browse/RATIS-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Song Ziyang updated RATIS-1587:
-------------------------------
    Description: 
*Bug Description*

Currently, when a follower installs a snapshot from the leader, the leader 
divides the snapshot files into a sequence of fixed-size chunks (16MB) and 
sends each chunk in its own RPC request.

!image-2022-05-31-08-56-35-373.png!

Only the last RPC request in the sequence is tagged with 'Done'.

!image-2022-05-31-08-58-44-717.png!
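
To make the chunking concrete, here is a minimal sketch of how one file could be split into 16MB chunks with only the last one flagged as done. It is not the Ratis implementation: the names ChunkPlanner, Chunk and plan are hypothetical, and it covers a single file only, whereas in a multi-file snapshot the 'Done' tag belongs to the last request of the whole sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these names do not come from Ratis.
class ChunkPlanner {
    // Chunk size stated in the description (16MB).
    static final long CHUNK_SIZE = 16L * 1024 * 1024;

    // One planned RPC: a byte range of the file plus the done flag.
    static final class Chunk {
        final long offset;
        final long length;
        final boolean done;

        Chunk(long offset, long length, boolean done) {
            this.offset = offset;
            this.length = length;
            this.done = done;
        }
    }

    // Splits a file of fileSize bytes into chunks of at most chunkSize bytes.
    // Only the final chunk is flagged done. An empty file yields one empty chunk.
    static List<Chunk> plan(long fileSize, long chunkSize) {
        if (fileSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        long offset = 0;
        do {
            long length = Math.min(chunkSize, fileSize - offset);
            boolean done = offset + length >= fileSize;
            chunks.add(new Chunk(offset, length, done));
            offset += length;
        } while (offset < fileSize);
        return chunks;
    }

    public static void main(String[] args) {
        // A 40MB file is planned as three chunks: 16MB, 16MB and 8MB.
        for (Chunk c : plan(40L * 1024 * 1024, CHUNK_SIZE)) {
            System.out.println("offset=" + c.offset + " length=" + c.length + " done=" + c.done);
        }
    }
}
```

With this plan, any file larger than 16MB produces at least one request that is not tagged 'Done'.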

However, when the follower handles this sequence of RPCs, it creates *a random 
temp dir for each RPC request, stores the chunk in it, and moves only the last 
chunk from the tmp dir to the sm dir.*

*!image-2022-05-31-09-00-41-286.png!*

Thus, when the snapshot contains multiple files or a single file larger than 
16MB, InstallSnapshot fails because only the last chunk is stored in the /sm 
dir while the others are left behind in many tmp dirs.
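
The failure mode can be reproduced with a small in-memory simulation. This is a sketch under assumptions, not the follower code from Ratis: directories are modelled as map entries, chunks as their indexes, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical simulation of the behaviour described above; not Ratis code.
class RandomTmpDirSimulation {
    // Handles chunkCount requests and returns the chunk indexes that end up in the sm dir.
    static List<Integer> install(int chunkCount) {
        Map<String, List<Integer>> tmpDirs = new HashMap<>(); // tmp dir name -> chunks stored in it
        List<Integer> smDir = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            boolean done = (i == chunkCount - 1);        // only the last request is tagged 'Done'
            String tmpDir = UUID.randomUUID().toString(); // a fresh random dir for every request
            tmpDirs.computeIfAbsent(tmpDir, k -> new ArrayList<>()).add(i);
            if (done) {
                // Only the tmp dir of the last request is moved to the sm dir.
                smDir.addAll(tmpDirs.remove(tmpDir));
            }
        }
        return smDir;
    }

    public static void main(String[] args) {
        System.out.println(install(3)); // prints [2]
    }
}
```

For a three-chunk snapshot only chunk 2 reaches the sm dir. Chunks 0 and 1 stay in tmp dirs that nothing refers to any more.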

 

*How To Fix*

Instead of using a random UUID to name the tmp dir every time, it is possible 
to use the *request-uuid* to name the tmp dir. This request-id is produced by 
the leader and is shared by every request in the sequence.
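
A minimal sketch of the proposed fix, assuming the follower can read the request-id from each request. The class RequestScopedTmpDir, the handleChunk signature and the tmp/sm layout under a root dir are hypothetical and only illustrate the idea. Real code would also need to validate the request-id before using it as a path component.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration of the proposed fix; not the actual Ratis code.
class RequestScopedTmpDir {
    // Stores one chunk under root/tmp/<requestId>/ and, on the last request,
    // moves every file collected for that requestId into root/sm/.
    static void handleChunk(Path root, String requestId, String fileName,
                            byte[] data, boolean done) throws IOException {
        // Same requestId gives the same tmp dir for the whole sequence.
        Path tmpDir = root.resolve("tmp").resolve(requestId);
        Files.createDirectories(tmpDir);
        // Append so that several chunks of one file are reassembled in order.
        Files.write(tmpDir.resolve(fileName), data,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        if (!done) {
            return;
        }
        Path smDir = root.resolve("sm");
        Files.createDirectories(smDir);
        List<Path> files;
        try (Stream<Path> listing = Files.list(tmpDir)) {
            files = listing.collect(Collectors.toList());
        }
        for (Path file : files) {
            Files.move(file, smDir.resolve(file.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        Files.delete(tmpDir); // the tmp dir is empty once all files are moved
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("snapshot-demo");
        String requestId = "request-1"; // stands in for the id produced by the leader
        handleChunk(root, requestId, "a.bin", new byte[] {1, 2}, false);
        handleChunk(root, requestId, "a.bin", new byte[] {3}, false);
        handleChunk(root, requestId, "b.bin", new byte[] {4}, true);
        System.out.println(Files.size(root.resolve("sm").resolve("a.bin")));
    }
}
```

Because all requests of one sequence write into the same tmp dir, the final request can move both the multi-chunk file and the additional file into the sm dir.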

 


> InstallSnapshot fails when snapshot has multiple chunks
> -------------------------------------------------------
>
>                 Key: RATIS-1587
>                 URL: https://issues.apache.org/jira/browse/RATIS-1587
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 2.3.0
>            Reporter: Song Ziyang
>            Priority: Major
>         Attachments: image-2022-05-31-08-56-35-373.png, 
> image-2022-05-31-08-58-44-717.png, image-2022-05-31-09-00-41-286.png
>



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
