Ivan, thanks for the pointer to the discussion. It doesn't actually address my point about why 'resetLostPartitions()' is needed. It does point to a ticket that would fix the logic in that method when baseline topology (BLT) is used. My concern is that Ignite relies on the user to call this method at all.
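To make the concern concrete, here is a minimal sketch of the check a user (or, ideally, Ignite itself) would have to run after nodes rejoin. It deliberately does not use the real Ignite classes: `CacheView` is a hypothetical stand-in that only mirrors the two calls named in this thread, `hasOwner` is an assumed helper for "this partition is available again", and `FakeCache` is an in-memory fake written purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

public class LostPartitionRecoverySketch {
    /** Hypothetical stand-in for the cache calls discussed in this thread. */
    public interface CacheView {
        Collection<Integer> lostPartitions();   // partitions still flagged as lost
        boolean hasOwner(int partition);        // assumption: partition data is back online
        void resetLostPartitions();             // clears the lost flags
    }

    /**
     * Resets the lost flags only when every flagged partition is available again.
     * Returns true if a reset was performed.
     */
    public static boolean resetIfRecovered(CacheView cache) {
        Collection<Integer> lost = cache.lostPartitions();
        if (lost.isEmpty())
            return false; // nothing to recover
        for (int p : lost) {
            if (!cache.hasOwner(p))
                return false; // at least one partition is still unavailable
        }
        cache.resetLostPartitions();
        return true;
    }

    /** In-memory fake that mimics the behaviour described in the original message. */
    public static final class FakeCache implements CacheView {
        private final Set<Integer> lost = new TreeSet<>();
        private final Set<Integer> owned = new TreeSet<>();

        public void markLost(int p) { owned.remove(p); lost.add(p); }

        /** The owner comes back, but the lost flag stays set until an explicit reset. */
        public void ownerRejoined(int p) { owned.add(p); }

        @Override public Collection<Integer> lostPartitions() { return new ArrayList<>(lost); }
        @Override public boolean hasOwner(int p) { return owned.contains(p); }
        @Override public void resetLostPartitions() { lost.clear(); }
    }

    public static void main(String[] args) {
        FakeCache cache = new FakeCache();
        cache.markLost(3);
        cache.markLost(7);
        System.out.println("reset while nodes are down: " + resetIfRecovered(cache));
        cache.ownerRejoined(3);
        cache.ownerRejoined(7);
        System.out.println("reset after nodes rejoin: " + resetIfRecovered(cache));
        System.out.println("still lost: " + cache.lostPartitions());
    }
}
```

The point of the sketch is that `resetIfRecovered` needs nothing the cluster does not already know, which is why a configuration option for self-recovery seems reasonable to me.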
Original message:

I was going over failure recovery scenarios, trying to understand the logic behind the lost partitions functionality. With native persistence, Ignite fully manages data persistence and availability. If enough nodes in the cluster become unavailable that partitions are marked lost, Ignite keeps track of those partitions. When the nodes rejoin the cluster, the partitions are automatically discovered and loaded from disk. This is evident from the fact that the data becomes available again and can be retrieved through the normal get/query APIs.

However, the lostPartitions() lists still contain some of the partitions that were previously lost (this seems like a bug), and Ignite expects the user to manually mark the partitions available by calling the Ignite.resetLostPartitions() API.

I found some discussion about issues with topology version handling in resetLostPartitions() in this ticket: IGNITE-7832 <https://issues.apache.org/jira/browse/IGNITE-7832>, but it does not address the question of why user involvement is required at all. It seems there should, at least, be a configuration option that allows a cache to self-recover once all partitions become available.

This email was originally sent to the user group: lost-partition-recovery-with-native-persistence <http://apache-ignite-users.70518.x6.nabble.com/lost-partition-recovery-with-native-persistence-td24520.html>