[ 
https://issues.apache.org/jira/browse/GEODE-9822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9822:
--------------------------------
    Description: 
In a two-locator cluster with default member weights and default setting (true) 
of enable-network-partition-detection, if a long-lived network partition 
separates the two members, a split-brain will arise: there will be two 
coordinators at the same time.

The reason for this can be found in the GMSJoinLeave.isNetworkPartition() 
method. That method's name is misleading. A name like isMajorityLost() would 
probably be more apt. It needs to return true iff the weight of "crashed" 
members (in the prospective view) is greater-than-or-equal-to 50% of the total 
weight (of all members in the current view).

What the method actually does is return true iff the weight of "crashed" 
members is greater-than 51% of the total weight. As a result, if we have two 
members of equal weight, and the coordinator sees that the non-coordinator is 
"crashed", the coordinator will keep running. If a network partition is 
happening, and the non-coordinator is still running, then it will become a 
coordinator and start producing views. Now we'll have two coordinators 
producing views concurrently.

For this discussion "crashed" members are members for which the coordinator has 
received a RemoveMemberRequest message. These are members that the failure 
detector has deemed failed. Keep in mind the failure detector is imperfect 
(it's not always right), and that's kind of the whole point of this ticket: 
we've lost contact with the non-coordinator member, but that doesn't mean it 
can't still be running (on the other side of a partition).

  was:
In a two-locator cluster with default member weights and default setting (true) 
of enable-network-partition-detection, if a long-lived network partition 
separates the two members, a split-brain will arise: there will be two 
coordinators at the same time.

The reason for this can be found in the GMSJoinLeave.isNetworkPartition() 
method. That method's name is misleading. A name like majorityLost() would 
probably be more apt. It needs to return true iff the weight of "crashed" 
members (in the prospective view) is greater-than-or-equal-to 50% of the total 
weight (of all members in the current view).

What the method actually does is return true iff the weight of "crashed" 
members is greater-than 51% of the total weight. As a result, if we have two 
members of equal weight, and the coordinator sees that the non-coordinator is 
"crashed", the coordinator will keep running. If a network partition is 
happening, and the non-coordinator is still running, then it will become a 
coordinator and start producing views. Now we'll have two coordinators 
producing views concurrently.

For this discussion "crashed" members are members for which the coordinator has 
received a RemoveMemberRequest message. These are members that the failure 
detector has deemed failed. Keep in mind the failure detector is imperfect 
(it's not always right), and that's kind of the whole point of this ticket: 
we've lost contact with the non-coordinator member, but that doesn't mean it 
can't still be running (on the other side of the partition).


> Split-brain Possible During Network Partition in Two-Locator Cluster
> --------------------------------------------------------------------
>
>                 Key: GEODE-9822
>                 URL: https://issues.apache.org/jira/browse/GEODE-9822
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bill Burcham
>            Priority: Major
>              Labels: pull-request-available
>
> In a two-locator cluster with default member weights and default setting 
> (true) of enable-network-partition-detection, if a long-lived network 
> partition separates the two members, a split-brain will arise: there will be 
> two coordinators at the same time.
> The reason for this can be found in the GMSJoinLeave.isNetworkPartition() 
> method. That method's name is misleading. A name like isMajorityLost() would 
> probably be more apt. It needs to return true iff the weight of "crashed" 
> members (in the prospective view) is greater-than-or-equal-to 50% of the 
> total weight (of all members in the current view).
> What the method actually does is return true iff the weight of "crashed" 
> members is greater-than 51% of the total weight. As a result, if we have two 
> members of equal weight, and the coordinator sees that the non-coordinator is 
> "crashed", the coordinator will keep running. If a network partition is 
> happening, and the non-coordinator is still running, then it will become a 
> coordinator and start producing views. Now we'll have two coordinators 
> producing views concurrently.
> For this discussion "crashed" members are members for which the coordinator 
> has received a RemoveMemberRequest message. These are members that the 
> failure detector has deemed failed. Keep in mind the failure detector is 
> imperfect (it's not always right), and that's kind of the whole point of this 
> ticket: we've lost contact with the non-coordinator member, but that doesn't 
> mean it can't still be running (on the other side of a partition).
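
The threshold arithmetic described above can be illustrated with a minimal sketch. This is not the GMSJoinLeave source: the class name, the method names, and the weight value of 10 are hypothetical, chosen only to contrast the check the ticket says exists today (crashed weight greater than 51% of the total) with the check it says is needed (crashed weight greater than or equal to 50% of the total).

```java
// Illustrative sketch only; this is NOT the Geode GMSJoinLeave implementation.
// All names and the weight value are hypothetical.
final class QuorumCheckSketch {

    // Behavior the ticket describes today: true only when the crashed
    // weight is strictly greater than 51% of the total weight.
    static boolean observedCheck(int crashedWeight, int totalWeight) {
        return crashedWeight * 100L > totalWeight * 51L;
    }

    // Behavior the ticket asks for: true when the crashed weight is
    // greater than or equal to 50% of the total weight.
    static boolean requiredCheck(int crashedWeight, int totalWeight) {
        return crashedWeight * 2L >= totalWeight;
    }

    public static void main(String[] args) {
        int memberWeight = 10;              // hypothetical equal weight per member
        int totalWeight = 2 * memberWeight; // two members in the current view

        // One of the two equally weighted members is marked "crashed".
        System.out.println("observed: " + observedCheck(memberWeight, totalWeight));
        System.out.println("required: " + requiredCheck(memberWeight, totalWeight));
    }
}
```

With two equally weighted members and one of them marked "crashed", the crashed weight is exactly 50% of the total. Under these assumptions the first check returns false, so each side of the partition would keep running as a coordinator, while the second check returns true.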



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
