[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2022-01-09 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Description: 
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split-brain condition, each of the parallel views is independent. It's as 
if you have more than one system running concurrently. It's possible, for 
example, for some clients to connect to members of one view and other clients 
to connect to members of another view. Updates to members in one view are not 
seen by members of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.
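
The {{locators}} value is a comma-separated list of {{host[port]}} entries. As 
a minimal illustration (not Geode's actual parsing code; the hostnames below 
are made up), the format can be read like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of parsing the locators property format, e.g.
// "loc1.example.com[10334],loc2.example.com[10334]". Geode's own
// parser lives in the membership code; this is illustrative only.
public class LocatorsProperty {
  private static final Pattern ENTRY = Pattern.compile("\\s*([^\\[,]+)\\[(\\d+)\\]");

  public static Map<String, Integer> parse(String locators) {
    Map<String, Integer> hostToPort = new LinkedHashMap<>();
    Matcher m = ENTRY.matcher(locators);
    while (m.find()) {
      // group(1) is the hostname, group(2) the locator port
      hostToPort.put(m.group(1).trim(), Integer.parseInt(m.group(2)));
    }
    return hostToPort;
  }
}
```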

When a locator (member) starts, it goes through this sequence to find the 
coordinator:
 # it first tries to find the coordinator through one of the (other) configured 
locators
 # if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view persistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.
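
The two-step discovery sequence above, with its fall-through to "become 
coordinator", can be sketched as follows (a simplified model, not the actual 
membership implementation; the names and signatures here are hypothetical):

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of the coordinator-discovery sequence described above.
public class CoordinatorDiscovery {
  /**
   * Tries the configured locators first, then the members persisted in the
   * view-persistence (.dat) file. An empty result means the caller found no
   * coordinator to join -- and today that is when it may become one itself.
   */
  public static Optional<String> findCoordinator(
      List<String> configuredLocators,
      List<String> persistedMembers,
      Predicate<String> reachable) {
    // Step 1: ask the (other) configured locators.
    for (String locator : configuredLocators) {
      if (reachable.test(locator)) {
        return Optional.of(locator);
      }
    }
    // Step 2: fall back to non-locator members from the .dat file.
    for (String member : persistedMembers) {
      if (reachable.test(member)) {
        return Optional.of(member);
      }
    }
    // Step 3: nothing reachable -- the risky fall-through this ticket targets.
    return Optional.empty();
  }
}
```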

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:
 * if the administrator/operator wants to *start up a brand new cluster* with 
multiple locators and…
 ** maybe Geode is running in a managed environment like Kubernetes and the 
locators' hostnames are not (yet) resolvable in DNS
 ** maybe there is a network partition between the starting locators so they 
can't communicate
 ** maybe the existing locators or coordinator are running very slowly or the 
network is degraded; this is effectively the same as the network partition just 
mentioned
 * if a cluster is already running and the administrator/operator wants to 
*scale it up* by starting/adding a new locator, Geode is susceptible to the 
same issues just mentioned
 * if a cluster is already running and the administrator/operator needs to 
*restart* a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
 ** if the "view persistence" {{.dat}} file is missing or deleted
 ** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.
h2. When This Ticket is Complete

There are a number of possible solutions to these problems. Here is one 
possible solution…

Geode will offer a locator startup mode (via TBD {{LocatorLauncher}} startup 
parameter) that prevents that locator from becoming a coordinator. In that 
mode, it will be possible for an administrator/operator to avoid many of the 
problematic scenarios mentioned above, while retaining the ability (via some 
_other_ mode) to start a first locator which is allowed to become a coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

After this mode split is implemented, it is envisioned that to start a brand 
new cluster, an administrator/operator will start the first locator in "seed" 
mode. After that the operator will start all subsequent locators in "join only" 
mode. If network partitions occur during startup, those newly started 
("join-only") nodes will exit with a failure status—under no circumstances will 
they ever become coordinators.

To add a locator to a running cluster, an operator starts it in "join only" 
mode. The new member will similarly either join with an existing coordinator or 
exit with a failure status, thereby avoiding split brains.
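
The proposed behavior of the two modes can be summarized in a small decision 
table (hypothetical sketch; the actual {{LocatorLauncher}} parameter and names 
are still TBD per this ticket):

```java
// Hypothetical model of the seed / join-only decision described above.
// Mode names and outcomes are illustrative, not a committed API.
public class StartupOutcome {
  public enum Mode { SEED, JOIN_ONLY }

  /**
   * @param coordinatorFound whether the discovery sequence reached an
   *                         existing coordinator to join with
   * @return what the starting locator does next
   */
  public static String resolve(Mode mode, boolean coordinatorFound) {
    if (coordinatorFound) {
      return "JOIN"; // both modes simply join the existing coordinator
    }
    // No coordinator reachable: only a seed-mode locator may create one.
    // A join-only locator exits with a failure status instead, so it can
    // never start a parallel (split-brain) view.
    return mode == Mode.SEED ? "BECOME_COORDINATOR" : "EXIT_FAILURE";
  }
}
```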

When an operator restarts a locator, e.g. during a rolling upgrade, they will 
restart it in "join only" mode. If a network partition is encountered, or the 
{{.dat}} file is missing or stale, the new locator will exit with a failure 
status and split brain will be avoided.

[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-06 Thread Dan Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith updated GEODE-9680:
-
Labels:   (was: needsTriage)

> Newly Started/Restarted Locators are Susceptible to Split-Brains
> 
>
> Key: GEODE-9680
> URL: https://issues.apache.org/jira/browse/GEODE-9680
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Priority: Major
>
> The issues described here are present in all versions of Geode (this is not 
> new to 1.15.0)…
> Geode is built on the assumption that views progress linearly in a sequence. 
> If that sequence ever forks into two or more parallel lines then we have a 
> "split brain".
> In a split brain condition, each of the parallel views are independent. It's 
> as if you have more than one system running concurrently. It's possible e.g. 
> for some clients to connect to members of one view and other clients to 
> connect to members of another view. Updates to members in one view are not 
> seen by members of a parallel view.
> Geode views are produced by a coordinator. As long as only a single 
> coordinator is running, there is no possibility of a split brain. Split brain 
> arises when more than one coordinator is producing views at the same time.
> Each Geode member (peer) is started with the {{locators}} configuration 
> parameter. That parameter specifies locator(s) to use to find the (already 
> running!) coordinator (member) to join with.
> When a locator (member) starts, it goes through this sequence to find the 
> coordinator:
>  # it first tries to find the coordinator through one of the (other) 
> configured locators
>  # if it can't contact any of those, it tries contacting non-locator (cache 
> server) members it has retrieved from the "view presistence" ({{.dat}}) file
> If it hasn't found a coordinator to join with, then the locator may _become_ 
> a coordinator.
> Sometimes this is ok. If no other coordinator is currently running then this 
> behavior is fine. An example is when an [administrator is starting up a brand 
> new 
> cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
>  In that case we want the very first locator we start to become the 
> coordinator.
> But there are a number of situations where there may already be another 
> coordinator running but it cannot be reached:
>  * if the administrator/operator is starting up a brand new cluster with 
> multiple locators and…
>  ** maybe Geode is running in a managed environment like Kubernetes and the 
> locators hostnames are not (yet) resolvable in DNS
>  ** maybe there is a network partition between the starting locators so they 
> can't communicate
>  ** maybe the existing locators or coordinator are running very slowly or the 
> network is degraded. This is effectively the same as the network partition 
> just mentioned
>  * if a cluster is already running and the administrator/operator wants to 
> scale it up by starting/adding a new locator Geode is susceptible to that 
> same network partition issue
>  * if a cluster is already running and the administrator/operator needs to 
> restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
> {{locators}} configuration parameter are reachable (maybe they are not 
> running, or maybe there is a network partition) and…
>  ** if the "view persistence" {{.dat}} file is missing or deleted
>  ** or if the current set of running Geode members has evolved so far that 
> the coordinates (host+port) in the {{.dat}} file are completely out of date
> In each of those cases, the newly starting locator will become a coordinator 
> and will start producing views. Now we'll have the old coordinator producing 
> views at the same time as the new one.
> *When this ticket is complete*, Geode will offer a locator startup mode (via 
> TBD {{LocatorLauncher}} startup parameter) that prevents that locator from 
> becoming a coordinator. With that mode, it will be possible for an 
> administrator/operator to avoid many of the problematic scenarios mentioned 
> above, while retaining the ability to start a first locator which is allowed 
> to become a coordinator.
> For purposes of discussion we'll call the startup mode that allows the 
> locator to become a coordinator "seed" mode, and we'll call the new startup 
> mode that prevents the locator from becoming a coordinator before first 
> joining, "join-only" mode.
> To start a brand new cluster, an administrator/operator starts the first 
> locator in "seed" mode. After that the operator starts all subsequent 
> locators in "join only" mode. If network partitions occur during startup, 
> those newly started nodes will exit with a failure status, but will not 

[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Description: 
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split brain condition, each of the parallel views are independent. It's as 
if you have more than one system running concurrently. It's possible e.g. for 
some clients to connect to members of one view and other clients to connect to 
members of another view. Updates to members in one view are not seen by members 
of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.

When a locator (member) starts, it goes through this sequence to find the 
coordinator:
 # it first tries to find the coordinator through one of the (other) configured 
locators
 # if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view presistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:
 * if the administrator/operator is starting up a brand new cluster with 
multiple locators and…
 ** maybe Geode is running in a managed environment like Kubernetes and the 
locators hostnames are not (yet) resolvable in DNS
 ** maybe there is a network partition between the starting locators so they 
can't communicate
 ** maybe the existing locators or coordinator are running very slowly or the 
network is degraded. This is effectively the same as the network partition just 
mentioned
 * if a cluster is already running and the administrator/operator wants to 
scale it up by starting/adding a new locator Geode is susceptible to that same 
network partition issue
 * if a cluster is already running and the administrator/operator needs to 
restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
 ** if the "view persistence" {{.dat}} file is missing or deleted
 ** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.

*When this ticket is complete*, Geode will offer a locator startup mode (via 
TBD {{LocatorLauncher}} startup parameter) that prevents that locator from 
becoming a coordinator. With that mode, it will be possible for an 
administrator/operator to avoid many of the problematic scenarios mentioned 
above, while retaining the ability to start a first locator which is allowed to 
become a coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

To start a brand new cluster, an administrator/operator starts the first 
locator in "seed" mode. After that the operator starts all subsequent locators 
in "join only" mode. If network partitions occur during startup, those newly 
started nodes will exit with a failure status, but will not become coordinators.

To add a locator to a running cluster, an operator starts it in "join only" 
mode. The new member will similarly either join with an existing coordinator or 
exit with a failure status, thereby avoiding split brains.

When an operator restarts a locator, e.g. during a rolling upgrade, they will 
restarted in "join only" mode. If a network partition is encountered, or the 
{{.dat}} file is missing or stale, the new locator will exit with a failure 
status and split brain will be avoided.
h2. 
FAQ

Q: What should happen if a locator is 

[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Description: 
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split brain condition, each of the parallel views are independent. It's as 
if you have more than one system running concurrently. It's possible e.g. for 
some clients to connect to members of one view and other clients to connect to 
members of another view. Updates to members in one view are not seen by members 
of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.

When a locator (member) starts, it goes through this sequence to find the 
coordinator:
 # it first tries to find the coordinator through one of the (other) configured 
locators
 # if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view presistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:
 * if the administrator/operator is starting up a brand new cluster with 
multiple locators and…
 ** maybe Geode is running in a managed environment like Kubernetes and the 
locators hostnames are not (yet) resolvable in DNS
 ** maybe there is a network partition between the starting locators so they 
can't communicate
 ** maybe the existing locators or coordinator are running very slowly or the 
network is degraded. This is effectively the same as the network partition just 
mentioned
 * if a cluster is already running and the administrator/operator wants to 
scale it up by starting/adding a new locator Geode is susceptible to that same 
network partition issue
 * if a cluster is already running and the administrator/operator needs to 
restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
 ** if the "view persistence" {{.dat}} file is missing or deleted
 ** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.

*When this ticket is complete*, Geode will offer a locator startup mode (via 
TBD {{LocatorLauncher}} startup parameter) that prevents that locator from 
becoming a coordinator. With that mode, it will be possible for an 
administrator/operator to avoid many of the problematic scenarios mentioned 
above, while retaining the ability to start a first locator which is allowed to 
become a coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

To start a brand new cluster, an administrator/operator starts the first 
locator in "seed" mode. After that the operator starts all subsequent locators 
in "join only" mode. If network partitions occur during startup, those newly 
started nodes will exit with a failure status, but will not become coordinators.

To add a locator to a running cluster, an operator starts it in "join only" 
mode. The new member will similarly either join with an existing coordinator or 
exit with a failure status, thereby avoiding split brains.

When an operator restarts a locator, e.g. during a rolling upgrade, they will 
restarted in "join only" mode. If a network partition is encountered, or the 
{{.dat}} file is missing or stale, the new locator will exit with a failure 
status and split brain will be avoided.

Summary: Newly Started/Restarted Locators are Susceptible to Split-Brains  
(was: Newly Started Locators are Susceptible to Split-Brains)

> Newly Started/Restarted Locators are Susceptible to Split-Brains
> 
>
> Key: GEODE-9680
> URL: https://issues.apache.org/jira/browse/GEODE-9680
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Priority: Major
>  Labels: needsTriage
>