[ 
https://issues.apache.org/jira/browse/IGNITE-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-17738:
--------------------------------------
    Summary: Cluster must be able to fix the inconsistency on restart by itself 
 (was: Historical rebalance must be able to fix the inconsistency on cluster 
restart by itself)

> Cluster must be able to fix the inconsistency on restart by itself
> ------------------------------------------------------------------
>
>                 Key: IGNITE-17738
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17738
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Anton Vinogradov
>            Priority: Major
>              Labels: iep-31, ise
>
> After a cluster restart (caused by a power-off, OOM or some other problem) the 
> PDS may be inconsistent: primary partitions may contain operations that are 
> missing on backups.
> Currently, "historical rebalance" is able to sync the data to the highest LWM 
> for every partition. 
> Most likely, a primary will be chosen as a rebalance source, but the data 
> after the LWM will not be rebalanced. So, all updates between LWM and HWM 
> will not be synchronized.
> A possible solution for the case when the cluster failed and restarted (same 
> baseline) is to fix counters to help "historical rebalance" perform the sync.
> Counters should be set to
>  - HWM at primary and LWM at backups for caches with 2+ backups,
>  - LWM at primary and HWM at backups for caches with a single backup.
> Possible solutions:
>  * This can be implemented as an extension of the `-consistency finalize` 
> command, for example `-consistency finalize-on-restart`, or
>  * Counters can be finalized automatically when the cluster composition is equal 
> to the baseline specified before the crash (preferred).
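
The counter rule in the description above can be sketched in Java as follows. This is only an illustration under stated assumptions, not Ignite code: the class `CounterFinalizationSketch`, the holder `Targets` and the method `targets` are hypothetical names, and the sketch assumes one LWM/HWM pair per partition and a known number of configured backups. It applies the rule literally as written in the description.

```java
/**
 * Hypothetical sketch of the counter rule from the issue description.
 * None of these names exist in Apache Ignite.
 */
final class CounterFinalizationSketch {
    /** Target update counters for one partition after a restart. */
    static final class Targets {
        final long primary;
        final long backup;

        Targets(long primary, long backup) {
            this.primary = primary;
            this.backup = backup;
        }
    }

    /**
     * Applies the rule as stated in the description:
     * 2+ backups: HWM at primary, LWM at backups;
     * single backup: LWM at primary, HWM at backups.
     *
     * @param lwm low watermark of the partition (assumed input)
     * @param hwm high watermark of the partition (assumed input)
     * @param backups number of backups configured for the cache
     */
    static Targets targets(long lwm, long hwm, int backups) {
        if (lwm > hwm)
            throw new IllegalArgumentException("LWM must not exceed HWM");
        if (backups < 1)
            throw new IllegalArgumentException("At least one backup is required");

        return backups >= 2 ? new Targets(hwm, lwm) : new Targets(lwm, hwm);
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        Targets t = targets(10L, 15L, 2);
        System.out.println("primary=" + t.primary + ", backup=" + t.backup);
    }
}
```

The sketch covers only the choice of target values; how the counters would actually be applied to partitions, and how the restart-with-same-baseline condition would be detected, is left open by the description.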



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
