[
https://issues.apache.org/jira/browse/IGNITE-24772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denis Chudov updated IGNITE-24772:
----------------------------------
Summary: Data loss in in-memory group after several node restarts without
losing majority of Ignite nodes in any moment of time (was: Data loss in
in-memory group after several node restarts without losing majority in any
moment of time)
> Data loss in in-memory group after several node restarts without losing
> majority of Ignite nodes in any moment of time
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-24772
> URL: https://issues.apache.org/jira/browse/IGNITE-24772
> Project: Ignite
> Issue Type: Bug
> Reporter: Denis Chudov
> Priority: Major
> Labels: ignite-3
>
> *Scenario:*
> Nodes: A,B,C.
> A is a leader.
> A client writes some data. The data is replicated to A and B and committed on
> the leader (A), so the write succeeds from the client's point of view.
> A fails, then returns to the cluster. A retains no data because the group is
> in-memory.
> The cluster tries to re-include A as a clean node: it tries to exclude A from
> the configuration and then include it again. The configuration change is not
> applied for some time because there is no leader, and there may be temporary
> network issues preventing new data from being written.
> Then the user, who assumes the majority is preserved, restarts node B. B also
> loses its data.
> Assume the data was never replicated to C.
> As a result, the data is lost.
>
> *Ignite specifics:*
> During a restart, a node is removed from the configuration before it starts
> and is then included again. In fact, it is started only when it is included
> back. So the scenario is slightly different:
> When A is started, it is removed from the configuration.
> Node B is stopped. Now the majority is lost and a full group restart is
> required.
> The user needs a group restart even though {*}the majority of Ignite nodes
> stayed online{*}. This leads to data loss.
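
The scenario above can be traced with a small toy model. This is an
illustrative sketch only, not Ignite or Raft code: the names Node, replicate,
holders, has_quorum and run_scenario are hypothetical, a restart is modeled as
instantaneous, and leader election and configuration changes are not
simulated. It shows that the number of online nodes never drops below a
majority while the number of nodes holding the committed entry goes 2, 1, 0.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    online: bool = True
    log: list = field(default_factory=list)  # in-memory only

    def restart(self):
        # An in-memory node comes back with an empty log.
        self.log = []
        self.online = True


def majority(n):
    """Smallest number of members that forms a majority of n."""
    return n // 2 + 1


def replicate(entry, nodes, targets):
    """Store entry on the named online targets; True if a majority acked."""
    acks = 0
    for node in nodes:
        if node.name in targets and node.online:
            node.log.append(entry)
            acks += 1
    return acks >= majority(len(nodes))


def holders(entry, nodes):
    """Names of the nodes that still hold the entry."""
    return [n.name for n in nodes if entry in n.log]


def has_quorum(config, online):
    """True if the online members of the configuration form a majority of it."""
    return len(config & online) >= majority(len(config))


def run_scenario():
    """Return (committed, trace); trace items are (step, holders, online)."""
    a, b, c = Node("A"), Node("B"), Node("C")
    nodes = [a, b, c]
    entry = "x=1"

    def snapshot(step):
        return (step, holders(entry, nodes), sum(n.online for n in nodes))

    # The write reaches A and B only; C never receives it.
    committed = replicate(entry, nodes, {"A", "B"})
    trace = [snapshot("write")]

    a.restart()  # A comes back clean
    trace.append(snapshot("restart A"))

    b.restart()  # B comes back clean; no copy of the entry is left
    trace.append(snapshot("restart B"))
    return committed, trace


if __name__ == "__main__":
    committed, trace = run_scenario()
    print("committed:", committed)
    for step in trace:
        print(step)
    # Ignite-specific variant: A was removed from the configuration when it
    # started, so the configuration is {B, C}; B is then stopped.
    print("quorum:", has_quorum({"B", "C"}, {"A", "C"}))
```

In the Ignite-specific variant, has_quorum({"B", "C"}, {"A", "C"}) is False
even though two of the three physical nodes are online, because the majority
is counted over the current configuration rather than over the physical nodes.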