[ 
https://issues.apache.org/jira/browse/IGNITE-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Kovalenko updated IGNITE-7467:
------------------------------------
    Fix Version/s: 2.5

> Verify partition update counters and sizes on partition map exchange
> --------------------------------------------------------------------
>
>                 Key: IGNITE-7467
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7467
>             Project: Ignite
>          Issue Type: Improvement
>          Components: persistence
>    Affects Versions: 2.1
>            Reporter: Alexey Goncharuk
>            Assignee: Alexey Goncharuk
>            Priority: Major
>             Fix For: 2.5
>
>         Attachments: partitions_update_counters_and_sizes_affected_tests
>
>
> In Ignite we rely heavily on the invariant that, under no load, owning 
> partitions have equal sizes and, more importantly, equal partition update 
> counters. This invariant becomes even more important when persistence is 
> enabled.
> However, a bug in the code could violate this invariant, which in the long 
> run may lead to undetected data loss. We need to make a best effort to 
> detect such a situation as soon as possible.
> Currently, we already send partition update counters during partition map 
> exchange. We can also send partition sizes and verify that corresponding 
> partitions in OWNING state have equal partition update counters and sizes.
> If a divergence is detected, we can:
> 1) Always print out an error message to the log
> 2) Move the corresponding caches to the read-only state to prevent further 
> corruption or operating on invalid data
> Also, we can introduce a ./control.sh command which will trigger an empty 
> exchange to verify the partition states.
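
The check described above can be sketched in plain Java. This is only an illustration of the comparison, not Ignite code: PartitionStateCheck, PartitionInfo and findDivergences are hypothetical names, and the sketch assumes the exchange coordinator has already collected one report per node and partition. The first OWNING copy seen for a partition serves as the reference for the others.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch, not part of the Ignite API. */
final class PartitionStateCheck {
    /** One node's report about one partition (assumed shape). */
    static final class PartitionInfo {
        final String nodeId;
        final int partId;
        final boolean owning;
        final long updateCounter;
        final long size;

        PartitionInfo(String nodeId, int partId, boolean owning, long updateCounter, long size) {
            this.nodeId = nodeId;
            this.partId = partId;
            this.owning = owning;
            this.updateCounter = updateCounter;
            this.size = size;
        }
    }

    /** Returns one message per counter or size mismatch among OWNING copies. */
    static List<String> findDivergences(List<PartitionInfo> reports) {
        // First OWNING report seen for each partition is the reference copy.
        Map<Integer, PartitionInfo> reference = new TreeMap<>();
        List<String> errors = new ArrayList<>();

        for (PartitionInfo r : reports) {
            if (!r.owning)
                continue; // Only OWNING copies are expected to be equal.

            PartitionInfo ref = reference.putIfAbsent(r.partId, r);

            if (ref == null)
                continue; // This report just became the reference.

            if (ref.updateCounter != r.updateCounter)
                errors.add(String.format("Partition %d: update counter differs [%s=%d, %s=%d]",
                    r.partId, ref.nodeId, ref.updateCounter, r.nodeId, r.updateCounter));

            if (ref.size != r.size)
                errors.add(String.format("Partition %d: size differs [%s=%d, %s=%d]",
                    r.partId, ref.nodeId, ref.size, r.nodeId, r.size));
        }

        return errors;
    }
}
```

A non-empty result would correspond to step 1 (log every message) and could serve as the trigger for step 2 (switch the affected caches to read-only).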



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
