[ 
https://issues.apache.org/jira/browse/IGNITE-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Chugunov updated IGNITE-16668:
-------------------------------------
    Description: 
If a node storing a partition of an in-memory table fails and leaves the 
cluster, all data it held is lost. From the partition's point of view, it 
looks as if the node has left forever.

Although the Raft protocol tolerates the loss of some of the nodes that make 
up a Raft group (partition), for in-memory caches we cannot restore the 
replica factor because of the in-memory nature of the table.

This means that we need to detect the failure of each node owning a partition 
and recalculate assignments for the table without preserving the replica factor.

  was:
When a node serving as part of the Raft group for a particular partition fails, 
the data it stores is lost.

So keeping the failed node in the Raft group configuration is wrong; the node 
should be removed from the group.

But reconfiguring the group is tricky and may interfere with the rebalancing 
protocol in the persistent case. So we need to propose and discuss a design 
for group reconfiguration separately.


> Raft group reconfiguration on node failure
> ------------------------------------------
>
>                 Key: IGNITE-16668
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16668
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Sergey Chugunov
>            Priority: Major
>              Labels: ignite-3
>
> If a node storing a partition of an in-memory table fails and leaves the 
> cluster, all data it held is lost. From the partition's point of view, it 
> looks as if the node has left forever.
> Although the Raft protocol tolerates the loss of some of the nodes that make 
> up a Raft group (partition), for in-memory caches we cannot restore the 
> replica factor because of the in-memory nature of the table.
> This means that we need to detect the failure of each node owning a partition 
> and recalculate assignments for the table without preserving the replica factor.
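
To illustrate the recalculation described above, here is a minimal sketch, 
assuming assignments can be modeled as a map from partition id to the set of 
node names hosting that partition. The class InMemoryAssignments and the 
method withoutFailedNode are hypothetical names chosen for this example; they 
are not part of the Ignite API, and failure detection itself is out of scope.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not an Ignite class. */
public class InMemoryAssignments {
    /**
     * Returns a copy of the assignments with the failed node removed from
     * every partition it owned. No replacement node is added, so the replica
     * count of each affected partition shrinks by one.
     */
    public static Map<Integer, Set<String>> withoutFailedNode(
            Map<Integer, Set<String>> assignments, String failedNode) {
        Map<Integer, Set<String>> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Set<String>> entry : assignments.entrySet()) {
            // Copy the peer set so the caller's assignments stay untouched.
            Set<String> peers = new LinkedHashSet<>(entry.getValue());
            peers.remove(failedNode);
            result.put(entry.getKey(), peers);
        }
        return result;
    }
}
```

For example, with partition 0 assigned to nodes A, B and C, calling 
withoutFailedNode(assignments, "B") leaves partition 0 assigned to A and C 
only. A partition whose peer set becomes empty has lost all of its data, and 
the caller would have to handle that case separately.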



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
