[ 
https://issues.apache.org/jira/browse/IGNITE-27097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-27097:
---------------------------------------
    Description: 
When starting to accept a Raft snapshot to a replica, we set lastAppliedIndex 
on all storages of the replica to -1 (aka REBALANCE_IN_PROGRESS).

When recovering the corresponding table in TableManager, we check whether any 
of its storages has lastAppliedIndex set to REBALANCE_IN_PROGRESS. If so, we 
conclude that a rebalance was started but did not complete, so we clear the 
storages.

In per-zone mode (aka colocation mode), however, we don't check for 
REBALANCE_IN_PROGRESS in the tx state storage of the starting zone partition 
replica. This should be fixed.
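
A minimal sketch of the intended check, in which the tx state storage is 
inspected alongside the MV storage. The ReplicaStorage interface, the 
InMemoryStorage class and the use of 0 as the "cleared" index are hypothetical 
and for illustration only; they are not the actual Ignite storage API.

```java
import java.util.List;

final class RebalanceRecoverySketch {
    /** Marker from the ticket: lastAppliedIndex == -1 means a rebalance started but did not finish. */
    static final long REBALANCE_IN_PROGRESS = -1L;

    /** Hypothetical minimal view of a replica storage (MV or tx state); not the real Ignite interface. */
    interface ReplicaStorage {
        long lastAppliedIndex();

        void clear();
    }

    /** Hypothetical in-memory stand-in used only to exercise the check. */
    static final class InMemoryStorage implements ReplicaStorage {
        private long lastAppliedIndex;

        InMemoryStorage(long lastAppliedIndex) {
            this.lastAppliedIndex = lastAppliedIndex;
        }

        @Override
        public long lastAppliedIndex() {
            return lastAppliedIndex;
        }

        @Override
        public void clear() {
            // Assumption: a cleared storage reports index 0.
            lastAppliedIndex = 0;
        }
    }

    /** Returns true if ANY storage of the replica still carries the marker. */
    static boolean rebalanceWasInterrupted(List<? extends ReplicaStorage> storages) {
        return storages.stream().anyMatch(s -> s.lastAppliedIndex() == REBALANCE_IN_PROGRESS);
    }

    /** Clears all storages if an interrupted rebalance is detected; returns whether cleanup ran. */
    static boolean recover(List<? extends ReplicaStorage> storages) {
        if (!rebalanceWasInterrupted(storages)) {
            return false;
        }
        storages.forEach(ReplicaStorage::clear);
        return true;
    }
}
```

Passing both the MV storage and the tx state storage of the zone partition 
replica to recover(...) means the marker is honored no matter which of the two 
carries it.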

There is another potential problem. Suppose we find that the MV storage's 
index is REBALANCE_IN_PROGRESS while the tx state storage's is not. We start 
cleaning both storages. The MV storage gets cleaned and the result is 
persisted, but the tx state storage does not; then the node loses power. The 
tx state storage was never cleaned, yet the marker that told us the replica 
storages needed cleaning is gone.
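
The scenario can be simulated with a small sketch. The Storage class is a 
hypothetical stand-in whose index is treated as persisted immediately, and 0 is 
assumed to be the "cleared" index. The second method shows one possible 
ordering (mark every storage before cleaning any) purely as an assumption for 
illustration; the ticket does not prescribe a fix.

```java
final class PartialCleanupSketch {
    static final long REBALANCE_IN_PROGRESS = -1L;

    /** Hypothetical storage; every change is assumed to be persisted immediately. */
    static final class Storage {
        private long lastAppliedIndex;

        Storage(long lastAppliedIndex) {
            this.lastAppliedIndex = lastAppliedIndex;
        }

        long lastAppliedIndex() {
            return lastAppliedIndex;
        }

        void markRebalanceInProgress() {
            lastAppliedIndex = REBALANCE_IN_PROGRESS;
        }

        void clear() {
            // Assumption: a cleared storage reports index 0.
            lastAppliedIndex = 0;
        }
    }

    /** The recovery check: does any storage still carry the marker? */
    static boolean needsCleanup(Storage mv, Storage txState) {
        return mv.lastAppliedIndex() == REBALANCE_IN_PROGRESS
                || txState.lastAppliedIndex() == REBALANCE_IN_PROGRESS;
    }

    /** Scenario from the ticket: returns what the check reports after the restart. */
    static boolean restartAfterNaiveCleanup() {
        Storage mv = new Storage(REBALANCE_IN_PROGRESS);
        Storage txState = new Storage(17L);

        mv.clear();
        // Simulated power loss: txState.clear() never runs.

        return needsCleanup(mv, txState);
    }

    /** Assumed alternative ordering: mark all storages first, then clean. */
    static boolean restartAfterMarkFirstCleanup() {
        Storage mv = new Storage(REBALANCE_IN_PROGRESS);
        Storage txState = new Storage(17L);

        mv.markRebalanceInProgress();
        txState.markRebalanceInProgress();

        mv.clear();
        // Simulated power loss: txState.clear() never runs.

        return needsCleanup(mv, txState);
    }
}
```

In the first method the check reports no pending cleanup after the restart 
even though the tx state storage still holds stale data; in the second, the 
marker on the tx state storage survives the crash, so the cleanup would be 
retried.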

Please note that non-colocation mode is currently being removed, so the 
corresponding code may be gone from TableManager by the time this ticket is 
picked up.

> Fully support REBALANCE_IN_PROGRESS case with colocation
> --------------------------------------------------------
>
>                 Key: IGNITE-27097
>                 URL: https://issues.apache.org/jira/browse/IGNITE-27097
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
