[ 
https://issues.apache.org/jira/browse/CASSANDRA-21026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-21026:
----------------------------------------
    Description: 
If the address of a decommissioned node is re-used by another new node at some 
later time, any node in the cluster with Accord enabled will be unable to start 
up, including the new node.

As the new node comes up and registers with the {{ClusterMetadataService}} it 
is added to the {{Directory}}.
The decommissioned node's details are also preserved in the directory to 
ensure that transactions which were in-flight can be completed after the 
node has left.
(https://issues.apache.org/jira/browse/CASSANDRA-20142)

During AccordService initialization, building the endpoint mapping will fail 
because of this check in {{EndpointMapping.Builder}}:
{code}
Invariants.requireArgument(!mapping.containsValue(endpoint), "Mapping already exists for %s", endpoint);
{code}
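
To illustrate why the check trips, here is a minimal, self-contained sketch. It is not the actual Cassandra code: {{EndpointMappingSketch}}, its {{Builder}}, the {{buildFails}} helper, the use of plain strings for endpoints and ints for node ids, and the example addresses are all assumptions made for illustration. Only the uniqueness check and its message mirror the snippet quoted above. The live node is added first, then the removed node that shares its address, which is roughly what appears to happen when the directory is converted to an endpoint mapping.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for EndpointMapping.Builder; all names and types here are hypothetical.
final class EndpointMappingSketch
{
    static final class Builder
    {
        // node id -> endpoint (plain types used instead of the real id and address classes)
        private final Map<Integer, String> mapping = new HashMap<>();

        Builder add(String endpoint, int nodeId)
        {
            // Mirrors the quoted invariant: an endpoint may only be mapped once.
            if (mapping.containsValue(endpoint))
                throw new IllegalArgumentException(String.format("Mapping already exists for %s", endpoint));
            mapping.put(nodeId, endpoint);
            return this;
        }
    }

    // Adds a live node and then a removed node; returns true if the second add is rejected.
    static boolean buildFails(String liveEndpoint, int liveId, String removedEndpoint, int removedId)
    {
        Builder builder = new Builder();
        builder.add(liveEndpoint, liveId);
        try
        {
            builder.add(removedEndpoint, removedId);
            return false;
        }
        catch (IllegalArgumentException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        // New node (id 4) re-uses the address of decommissioned node (id 2): prints true
        System.out.println(buildFails("10.0.0.5:7000", 4, "10.0.0.5:7000", 2));
        // Distinct addresses: prints false
        System.out.println(buildFails("10.0.0.6:7000", 4, "10.0.0.5:7000", 2));
    }
}
```

In this sketch the failure depends only on two entries sharing an endpoint, not on whether one of them belongs to a removed node, which matches the symptom described: every node that builds the mapping hits the same error.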

Additionally, it seems possible that the wrong method is being called in 
{{AccordTopology::directoryToEndpointMapping}}
{code}
       // There are cases where nodes are removed from the cluster (host replacement, decom, etc.), but inflight events
       // may still be happening; keep the ids around so pending events do not fail with a mapping error
       for (Directory.RemovedNode removedNode : directory.removedNodes())
           builder.add(removedNode.endpoint, tcmIdToAccord(removedNode.id));
{code}
which should probably call {{builder::removed}} rather than {{builder::add}}, 
but that also contains the same invariant check.

> Reusing the address of a removed node is not possible with Accord enabled
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21026
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21026
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Accord, Cluster/Membership
>            Reporter: Sam Tunnicliffe
>            Priority: Normal


