lost partition recovery with native persistence

novicr Fri, 05 Oct 2018 06:18:57 -0700

I was going over failure recovery scenarios, trying to understand the logic
behind the lost partitions functionality.  In the case of native persistence,
Ignite fully manages data persistence and availability.  If enough nodes in
the cluster become unavailable that partitions are marked lost, Ignite
keeps track of those partitions.  When the nodes rejoin the cluster, the
partitions are automatically discovered and loaded from disk.  This is shown
by the fact that the data actually becomes available and can be retrieved
using the normal get/query APIs.  However, the lostPartitions() lists still
contain some partitions that were previously lost (this seems like a bug),
and Ignite expects the user to manually mark the partitions available by
calling the Ignite.resetLostPartitions() API.
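
To make the manual step concrete, here is a minimal sketch of the
check-then-reset sequence described above.  The CacheAdmin interface and the
resetIfLost helper are hypothetical names introduced only for illustration;
they stand in for the two Ignite calls mentioned above so the logic can be
read (and exercised) without a running cluster.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class LostPartitionCheck {
    /**
     * Hypothetical seam over the two calls discussed in this thread.
     * An Ignite-backed adapter would presumably delegate to
     * ignite.cache(name).lostPartitions() and
     * ignite.resetLostPartitions(names).
     */
    public interface CacheAdmin {
        Collection<Integer> lostPartitions(String cacheName);
        void resetLostPartitions(Collection<String> cacheNames);
    }

    /**
     * Resets the lost state of every listed cache that still reports
     * lost partitions, and returns the names that were reset.
     * Assumption: the caller has already verified that the nodes owning
     * those partitions have rejoined the cluster.
     */
    public static List<String> resetIfLost(CacheAdmin admin,
                                           Collection<String> cacheNames) {
        List<String> toReset = new ArrayList<>();
        for (String name : cacheNames) {
            // A non-empty collection means the cache is still flagged as lost.
            if (!admin.lostPartitions(name).isEmpty()) {
                toReset.add(name);
            }
        }
        if (!toReset.isEmpty()) {
            // One call covers all affected caches.
            admin.resetLostPartitions(toReset);
        }
        return toReset;
    }
}
```

The helper deliberately leaves the decision of when to call it to the user,
which is exactly the involvement this message is questioning.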


I found some discussion about issues with topology version handling in
resetLostPartitions() in this ticket: IGNITE-7832
<https://issues.apache.org/jira/browse/IGNITE-7832>, but it does not
address the question of why user involvement is required at all.

It seems there should at least be a configuration option to allow a cache to
self-recover once all partitions become available.

This email was originally sent to the user group: 
lost-partition-recovery-with-native-persistence
<http://apache-ignite-users.70518.x6.nabble.com/lost-partition-recovery-with-native-persistence-td24520.html>
  



Thanks,
Roman



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
