[ 
https://issues.apache.org/jira/browse/IGNITE-21588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-21588:
-----------------------------------
    Description: 
When handling commands like {{JoinReadyCommand}} and {{NodesLeaveCommand}}, we do the following (a rough sketch of this flow follows the list):
 * Read local state with {{{}readLogicalTopology(){}}}.
 * Modify state according to the command.
 * {*}Increase version{*}.
 * Write new state with {{{}saveSnapshotToStorage(snapshot){}}}.
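
A minimal illustration of this flow with simplified stand-in types - this is not the actual CMG state machine code, only a sketch of the pattern described above:

{code:java}
import java.util.HashSet;
import java.util.Set;

// Simplified stand-ins for the logical topology state; the real classes differ.
class LogicalTopologyFlowSketch {
    record Topology(long version, Set<String> nodes) { }

    private Topology topology = new Topology(0, Set.of());

    // Handling of a hypothetical "node left" command.
    void onNodeLeave(String nodeName) {
        // 1. Read the local state (readLogicalTopology() in the real code).
        Topology current = topology;

        // 2. Modify the state according to the command.
        Set<String> nodes = new HashSet<>(current.nodes());
        nodes.remove(nodeName);

        // 3. Increase the version - this value is local and not replicated.
        long newVersion = current.version() + 1;

        // 4. Write the new state (saveSnapshotToStorage(snapshot) in the real code).
        topology = new Topology(newVersion, Set.copyOf(nodes));
    }
}
{code}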

The problem lies in how the state is read and written - it's local, and the version value is not replicated.

What happens when we restart the node (a back-of-the-envelope trace of the resulting version drift follows the list):
 * It starts without a local storage snapshot, with appliedIndex == 0, which is a {*}state in the past{*}.
 * We apply commands that were already applied before the restart.
 * We apply these commands to the locally saved topology snapshot.
 * This logical topology snapshot represents a *state in the future* when compared to appliedIndex == 0.
 * As a result, when we re-apply some commands, we *increase the version* one more time, thus breaking data consistency between nodes.
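
A back-of-the-envelope trace of the version drift (the numbers are made up for illustration; this is not Ignite code):

{code:java}
public class ReplayVersionDrift {
    public static void main(String[] args) {
        long versionBeforeRestart = 5;      // version every node agreed on before the restart
        long appliedIndexAfterRestart = 0;  // no storage snapshot, so the log is replayed from scratch

        // The local topology snapshot still carries version 5, so each replayed
        // command increments it once more instead of reproducing the old value.
        long version = versionBeforeRestart;
        for (long index = appliedIndexAfterRestart + 1; index <= 5; index++) {
            version++; // re-applied command bumps the version again
        }

        // Prints 10 on the restarted node, while the rest of the cluster is still at 5.
        System.out.println("version after replay = " + version);
    }
}
{code}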

This would have been fine if we only used this version locally. But distribution zones rely on the consistency of the version across all nodes in the cluster. This might break DZ data nodes handling if any of the cluster nodes restarts.

How to fix:
 * Either drop the storage if there's no storage snapshot - this will restore consistency (a sketch follows the list),
 * or never start the CMG group from a snapshot, but rather start it from the latest storage data.
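
A minimal sketch of the first option, assuming a hypothetical storage API (the real ClusterStateStorage interface may look different):

{code:java}
// Hypothetical storage API, for illustration only.
interface TopologyStorage {
    boolean hasStorageSnapshot(); // was a Raft storage snapshot persisted locally?
    void clear();                 // wipe the locally saved topology state
}

class CmgRecoverySketch {
    static void recoverTopologyState(TopologyStorage storage) {
        if (!storage.hasStorageSnapshot()) {
            // No snapshot means the log will be replayed from appliedIndex == 0, so the
            // locally saved topology must not be "in the future": drop it and let the
            // replay rebuild it from scratch, restoring version consistency.
            storage.clear();
        }
    }
}
{code}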

  was:
When handling commands like {{JoinReadyCommand}} and {{NodesLeaveCommand}} we 
do the following:
 * Read local state with {{{}readLogicalTopology(){}}}.
 * Modify state according to the command.
 * {*}Increase version{*}.
 * Write new state with {{{}saveSnapshotToStorage(snapshot){}}}.

The problem lies in reading and writing of the state - it's local, and version 
value is not replicated.

What happens when we restart the node:
 * It starts with local storage snapshot, which is a {*}state in the past{*}, 
generally speaking.
 * We apply commands that were not applied in the snapshot.
 * We apply these commands to locally saved topology snapshot.
 * This logical topology snapshot has a *state in the future* when compared to 
storage snapshot.
 * As a result, when we re-apply some commands, we *increase the version* one 
more time, thus breaking data consistency between nodes.

This would have been fine if we only used this version locally. But 
distribution zones rely on the consistency of the version between all nodes in 
cluster. This might break DZ data nodes handling if any of the cluster nodes 
restarts.


> CMG commands idempotency is broken
> ----------------------------------
>
>                 Key: IGNITE-21588
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21588
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Bessonov
>            Priority: Major
>              Labels: ignite-3
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
