[jira] [Comment Edited] (IGNITE-23550) Test and optimize metastorage snapshot transfer and recovery speed for new nodes

Kirill Tkalenko (Jira) Sat, 07 Dec 2024 02:31:04 -0800


    [ 
https://issues.apache.org/jira/browse/IGNITE-23550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903806#comment-17903806
 ]


Kirill Tkalenko edited comment on IGNITE-23550 at 12/7/24 10:30 AM:
--------------------------------------------------------------------

Based on the results of running the tests from [PR 
4845|https://github.com/apache/ignite-3/pull/4845] locally and on TC, I drew 
several conclusions.

JFR analysis shows that we spend a lot of time saving the checksum on each 
write to the metastorage, because it is written to the WAL in synchronous 
mode.
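
To illustrate why a forced flush on every write is expensive, here is a minimal sketch. It is not Ignite code: the class name, record layout and temporary file are assumptions made for illustration, and only standard {{java.nio}} calls are used. It appends small records to a file and optionally calls {{FileChannel#force}} after each one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustration only: a hypothetical micro-experiment, not Ignite code. */
class SyncWriteCostSketch {
    /**
     * Appends {@code count} 8-byte records to a temporary file, optionally forcing
     * every record to disk. Returns {elapsedNanos, bytesWritten}.
     */
    static long[] run(int count, boolean syncEachWrite) {
        Path file = null;
        try {
            file = Files.createTempFile("checksum-sketch", ".bin");
            ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
            long start = System.nanoTime();
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                for (int i = 0; i < count; i++) {
                    buf.clear();
                    buf.putLong(i);
                    buf.flip();
                    while (buf.hasRemaining()) {
                        ch.write(buf);
                    }
                    if (syncEachWrite) {
                        ch.force(false); // Ask the OS to flush file content to the device.
                    }
                }
            }
            long elapsed = System.nanoTime() - start;
            return new long[] {elapsed, Files.size(file)};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temporary file.
                }
            }
        }
    }

    public static void main(String[] args) {
        long[] synced = run(1_000, true);
        long[] buffered = run(1_000, false);
        System.out.printf("sync per write: %d us, no sync: %d us%n",
                synced[0] / 1_000, buffered[0] / 1_000);
    }
}
```

The absolute numbers depend heavily on the disk and file system, which may be one reason the TC and local results differ so much.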

Results of executing *MetaStorageManager#put* 100k times with/without the sync 
mode of the checksum:
||Disable sync for checksum||TC/Local||Time||
|True|TC|4s 12ms 204us 192866ns, totalMs=4012, totalNs=4012204866|
|False|TC|22s 12ms 732us 720751ns, totalMs=22012, totalNs=22012732751|
|True|Local|2s 218ms 965us 747459ns, totalMs=2218, totalNs=2218965459|
|False|Local|5m 10s 157ms 117us, totalMs=310157, totalNs=310157117417|
From the table we can conclude that disabling the WAL sync for the checksum 
speeds up writes several times: about 5.5x on TC and about 140x locally. How 
can we optimize it? I think we can hook into the mechanism that sends raft 
commands and write the required checksum as an additional field of the 
command, then work with it from there, since raft commands are already 
written to the WAL in sync mode.
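
A rough sketch of that idea, under stated assumptions: every type below ({{CountingSyncLog}}, the payload encoding, CRC32 as the checksum) is hypothetical and stands in for the real raft and metastorage classes, which this comment does not show. It only counts simulated synchronous flushes to show that carrying the checksum inside the command needs one sync per put instead of two.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustration only: all types here are hypothetical, not Ignite or raft classes. */
class ChecksumInCommandSketch {
    /** Stand-in for a log that counts how many synchronous flushes it performs. */
    static final class CountingSyncLog {
        int syncCount;
        long bytesWritten;

        void appendSync(byte[] entry) {
            bytesWritten += entry.length;
            syncCount++; // One simulated fsync per appended entry.
        }
    }

    /** CRC32 is only a placeholder; the real checksum algorithm is not stated here. */
    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    static byte[] payload(String key, String value) {
        return (key + "=" + value).getBytes(StandardCharsets.UTF_8);
    }

    /** Command and checksum are synced separately: two flushes per put. */
    static int syncsWithSeparateChecksum(int puts) {
        CountingSyncLog raftLog = new CountingSyncLog();
        CountingSyncLog checksumWal = new CountingSyncLog();
        for (int i = 0; i < puts; i++) {
            byte[] cmd = payload("key" + i, "value" + i);
            raftLog.appendSync(cmd);
            byte[] sum = ByteBuffer.allocate(Long.BYTES).putLong(checksum(cmd)).array();
            checksumWal.appendSync(sum);
        }
        return raftLog.syncCount + checksumWal.syncCount;
    }

    /** Checksum travels as an extra field of the command: one flush per put. */
    static int syncsWithEmbeddedChecksum(int puts) {
        CountingSyncLog raftLog = new CountingSyncLog();
        for (int i = 0; i < puts; i++) {
            byte[] cmd = payload("key" + i, "value" + i);
            byte[] withChecksum = ByteBuffer.allocate(cmd.length + Long.BYTES)
                    .put(cmd)
                    .putLong(checksum(cmd))
                    .array();
            raftLog.appendSync(withChecksum);
        }
        return raftLog.syncCount;
    }

    public static void main(String[] args) {
        System.out.println("separate: " + syncsWithSeparateChecksum(100_000) + " syncs");
        System.out.println("embedded: " + syncsWithEmbeddedChecksum(100_000) + " syncs");
    }
}
```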

Results of a node restart with/without the sync mode of the checksum, with 
100k puts in the raft log:
||Disable sync for checksum||TC/Local||Time||
|True|TC|4s 744ms 454us, totalMs=4744, totalNs=4744454870|
|False|TC|24s 195ms 34us, totalMs=24195, totalNs=24195034186|
|True|Local|2s 877ms 511us, totalMs=2877, totalNs=2877511875|
|False|Local|6m 41s 771ms 593us, totalMs=401771, totalNs=401771593084|
This table also shows a performance gain, about 5x on TC and about 140x 
locally, if we fix the checksum sync mode.
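
For reference, the speedups implied by the two tables above can be computed from the reported {{totalMs}} values. The snippet is only a convenience calculation; the class and method names are made up for illustration.

```java
import java.util.Locale;

/** Illustration only: computes ratios from the totalMs values in the tables. */
class SpeedupFromTables {
    /** How many times faster the run without checksum sync was. */
    static double speedup(long withSyncMs, long withoutSyncMs) {
        return (double) withSyncMs / withoutSyncMs;
    }

    static String format(double speedup) {
        return String.format(Locale.ROOT, "%.1fx", speedup);
    }

    public static void main(String[] args) {
        // totalMs values copied from the tables: (sync enabled, sync disabled).
        System.out.println("put x100k, TC:    " + format(speedup(22_012, 4_012)));   // 5.5x
        System.out.println("put x100k, Local: " + format(speedup(310_157, 2_218)));  // 139.8x
        System.out.println("restart, TC:      " + format(speedup(24_195, 4_744)));   // 5.1x
        System.out.println("restart, Local:   " + format(speedup(401_771, 2_877)));  // 139.6x
    }
}
```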


> Test and optimize metastorage snapshot transfer and recovery speed for new 
> nodes
> --------------------------------------------------------------------------------
>
>                 Key: IGNITE-23550
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23550
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Assignee: Kirill Tkalenko
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Test and optimize metastorage snapshot transfer and recovery speed for new 
> nodes.
> Let's assume that we have a 100Mb+ meta-storage snapshot and 100k+ entries in 
> the raft log, replicated as log entries.
> How long would it take for a new node to join the cluster under these 
> conditions? Will something break? What can we do to make it work?
> The goal: the joining process should work for long-running clusters. It 
> should also be fairly fast: less than 10 seconds, depending of course on the 
> network capabilities. No timeout errors should occur if it takes more than 
> 10 seconds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
