[
https://issues.apache.org/jira/browse/IGNITE-18495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654140#comment-17654140
]
Roman Puchkovskiy commented on IGNITE-18495:
--------------------------------------------
It looks like we can just port the corresponding fix from JRaft:
[https://github.com/sofastack/sofa-jraft/commit/a1fa8c30a895fc45f39c9661eb383a890ba4b6d8]
> Fix RAFT snapshot installation hang due to response swap on retry
> -----------------------------------------------------------------
>
> Key: IGNITE-18495
> URL: https://issues.apache.org/jira/browse/IGNITE-18495
> Project: Ignite
> Issue Type: Bug
> Reporter: Roman Puchkovskiy
> Assignee: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The scenario is as follows:
> # An InstallSnapshot request is sent, and its processing hangs (it is only
> cancelled in step 3).
> # After a timeout, a second InstallSnapshot request is sent with the same
> index+term as the first; in JRaft, this triggers special handling (the
> previous request's processing is NOT cancelled).
> # After a timeout, a third InstallSnapshot request is sent with a DIFFERENT
> index, so it cancels the first snapshot processing, effectively unblocking
> the first thread.
> In the original JRaft implementation, after being unblocked, the first thread
> fails to clean up, so subsequent retries always see a phantom of an
> unfinished snapshot and the snapshotting process jams. Also, node
> stop might get stuck because one 'download' task will remain unfinished forever.
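
The cleanup problem described above can be illustrated with a minimal Java
sketch. This is not the JRaft or Ignite code: SnapshotSlot, tryStart,
awaitAndCleanUp and isBusy are hypothetical names, and a CompletableFuture
stands in for the snapshot download. It only shows the general idea that the
"in flight" marker should be released in a finally block, so that a cancelled
download does not leave a phantom behind.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of the "one snapshot download at a time" bookkeeping. */
final class SnapshotSlot {
    /** The download currently registered as in flight, or null if none. */
    private final AtomicReference<CompletableFuture<Void>> inFlight = new AtomicReference<>();

    /** Registers a download; returns false if another one is still registered. */
    boolean tryStart(CompletableFuture<Void> download) {
        return inFlight.compareAndSet(null, download);
    }

    /** Waits for the download and always releases the slot afterwards. */
    void awaitAndCleanUp(CompletableFuture<Void> download) {
        try {
            download.join();
        } catch (CancellationException | CompletionException e) {
            // The download was cancelled (for example by a newer request) or failed.
            // Nothing to do here: the cleanup below must run in either case.
        } finally {
            // Release the slot only if it still belongs to this download.
            inFlight.compareAndSet(download, null);
        }
    }

    /** Returns true while some download is still registered. */
    boolean isBusy() {
        return inFlight.get() != null;
    }
}
```

If the compareAndSet in the finally block were skipped on cancellation,
isBusy() would keep returning true and every later tryStart call would be
rejected, which corresponds to the "phantom of an unfinished snapshot" from
the description.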