[ 
https://issues.apache.org/jira/browse/IGNITE-23910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-23910:
---------------------------------------
    Description: 
The following is possible:
 # Metastorage majority goes down (maybe other nodes that don't vote in 
Metastorage as well)
 # One of those nodes (named A) had a command X in its Metastorage Raft log 
that not a single remaining node has in its log. Before crashing, A had applied 
this command in the Metastorage (this was not flushed to disk), then some 
side-effect was produced via Metastorage watch, and the side effect WAS flushed 
to disk
 # User repairs Metastorage on the remaining nodes (they don't contain X in 
their logs and did not apply it)
 # A is brought back online. X was the only diverging command, and the fact of 
its existence was not saved durably to the Metastorage underlying storage (it 
was saved to the Raft log on A, but we are not looking at logs during reentry 
procedure). As such, Metastorage checksum on A matches the checksum for the new 
Metastorage leader (say, B)
 # As a result, we allow node A join the new cluster, even though it has some 
side-effect of X persisted

Such a side-effect could be a tuple in new schema if X was an ALTER TABLE 
command; A would contain this tuple version, but other nodes would have no idea 
about this new version. This is an inconsistency we want to avoid.

To be continued...

  was:TBD


> Ability to note side effects produced by diverged Metastorage revisions which 
> were lost
> ---------------------------------------------------------------------------------------
>
>                 Key: IGNITE-23910
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23910
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>
> The following is possible:
>  # Metastorage majority goes down (maybe other nodes that don't vote in 
> Metastorage as well)
>  # One of those nodes (named A) had a command X in its Metastorage Raft log 
> that not a single remaining node has in its log. Before crashing, A had 
> applied this command in the Metastorage (this was not flushed to disk), then 
> some side-effect was produced via Metastorage watch, and the side effect WAS 
> flushed to disk
>  # User repairs Metastorage on the remaining nodes (they don't contain X in 
> their logs and did not apply it)
>  # A is brought back online. X was the only diverging command, and the fact 
> of its existence was not saved durably to the Metastorage underlying storage 
> (it was saved to the Raft log on A, but we are not looking at logs during 
> reentry procedure). As such, Metastorage checksum on A matches the checksum 
> for the new Metastorage leader (say, B)
>  # As a result, we allow node A join the new cluster, even though it has some 
> side-effect of X persisted
> Such a side-effect could be a tuple in new schema if X was an ALTER TABLE 
> command; A would contain this tuple version, but other nodes would have no 
> idea about this new version. This is an inconsistency we want to avoid.
> To be continued...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to