Alexey Kukushkin created IGNITE-17657:
-----------------------------------------

             Summary: Partition data loss after rolling restart
                 Key: IGNITE-17657
                 URL: https://issues.apache.org/jira/browse/IGNITE-17657
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.13
            Reporter: Alexey Kukushkin


*Setup*

An active Apache Ignite 2.13 cluster of 3 or more nodes with a cache that has 1 backup, persistence enabled, and the default partition loss policy.

*Actions*
# An application continuously writes data to the cache.
# The nodes are restarted sequentially, one after another, while the data is being written to the cache. The next node is restarted only after data rebalancing is complete. The {{KeysToRebalanceLeft}} metric is used to monitor rebalancing (see the [documentation|https://ignite.apache.org/docs/latest/monitoring-metrics/metrics#monitoring-rebalancing] for more details).
# After all the nodes have been restarted, the application reads some of the data.

*Expected*

No data is lost, since there is 1 backup and each node is restarted only after rebalancing is complete.

*Actual*

Intermittently (in more than 50% of our runs), a "partition data has been lost" exception is thrown on the attempt to read the data.

*Notes*

Tried to create a JUnit reproducer (all nodes within the same JVM) for the above scenario, without success so far.
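
For anyone trying to reproduce this, the restart procedure from the *Actions* list can be sketched roughly as below. This is only an illustration of the gating logic, not the script we actually ran. The class and method names ({{RollingRestart}}, {{awaitRebalanced}}, {{restartSequentially}}) are hypothetical, and both the restart action and the metric reader are passed in as callbacks because how a node is restarted and how {{KeysToRebalanceLeft}} is read depend on the deployment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Sketch of a rolling restart gated on a rebalance metric. All names are hypothetical. */
final class RollingRestart {
    private RollingRestart() {}

    /**
     * Polls the metric until it reads zero or the poll budget is used up.
     * keysLeft is assumed to return the current KeysToRebalanceLeft value.
     * Returns true if a zero reading was observed, false on timeout or interrupt.
     */
    static boolean awaitRebalanced(LongSupplier keysLeft, int maxPolls, long pollMillis) {
        for (int i = 0; i < maxPolls; i++) {
            if (keysLeft.getAsLong() == 0L) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }

    /**
     * Restarts the nodes one at a time. The next node is restarted only after
     * the metric has dropped to zero; on timeout the procedure stops.
     * Returns the nodes that were restarted, in order.
     */
    static List<String> restartSequentially(List<String> nodes,
                                            Consumer<String> restart,
                                            LongSupplier keysLeft,
                                            int maxPolls,
                                            long pollMillis) {
        List<String> restarted = new ArrayList<>();
        for (String node : nodes) {
            restart.accept(node);
            restarted.add(node);
            if (!awaitRebalanced(keysLeft, maxPolls, pollMillis)) {
                break; // do not touch the next node while rebalancing is unfinished
            }
        }
        return restarted;
    }
}
```

The sketch stops rather than continues on a timeout, so that a second node is never taken down while rebalancing is still reported as in progress. Note that it treats a single zero reading as "rebalancing complete", which is an assumption of this sketch.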