Sam Tunnicliffe created CASSANDRA-19488:
-------------------------------------------

             Summary: Ensure snitches always defer to ClusterMetadata
                 Key: CASSANDRA-19488
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19488
             Project: Cassandra
          Issue Type: Improvement
          Components: Cluster/Membership, Messaging/Internode, Transactional Cluster Metadata
            Reporter: Sam Tunnicliffe
            Assignee: Sam Tunnicliffe


Internally, C* always uses {{ClusterMetadata}} as the source of topology 
information when calculating data placements, replica plans, etc., and as such 
the role of the snitch has been somewhat reduced. 

Sorting and comparison functions as provided by specialisations like 
{{DynamicEndpointSnitch}} are still used, but otherwise the snitch should only 
be responsible for providing the DC and rack for a new node when it first joins 
a cluster.

Aside from initial startup and registration, snitch implementations should 
always defer to {{ClusterMetadata}} for DC and rack; otherwise there is a 
risk that the snitch config drifts out of sync with TCM and output from tools 
like {{nodetool ring}} and {{gossipinfo}} becomes incorrect.
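
The lookup order described above could be sketched roughly as follows. This is 
an illustration only, not Cassandra's actual code: {{Location}}, 
{{TopologyView}} and {{DeferringSnitch}} are hypothetical stand-ins, with 
{{TopologyView}} playing the role of the topology held in {{ClusterMetadata}}.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class DeferringSnitchSketch {
    /** Hypothetical DC/rack pair; a stand-in, not Cassandra's own type. */
    record Location(String datacenter, String rack) {}

    /** Hypothetical stand-in for the topology stored in cluster metadata. */
    static class TopologyView {
        private final Map<String, Location> registered = new ConcurrentHashMap<>();

        void register(String endpoint, Location location) {
            registered.put(endpoint, location);
        }

        Optional<Location> locationOf(String endpoint) {
            return Optional.ofNullable(registered.get(endpoint));
        }
    }

    /** Defers to the topology view; uses local config only before registration. */
    static class DeferringSnitch {
        private final String localEndpoint;
        private final Location configured; // assumed to come from snitch config
        private final TopologyView topology;

        DeferringSnitch(String localEndpoint, Location configured, TopologyView topology) {
            this.localEndpoint = localEndpoint;
            this.configured = configured;
            this.topology = topology;
        }

        Optional<Location> locationOf(String endpoint) {
            Optional<Location> known = topology.locationOf(endpoint);
            if (known.isPresent()) {
                return known; // registered state always wins over local config
            }
            // Only the local, not-yet-registered node may fall back to its config.
            return endpoint.equals(localEndpoint) ? Optional.of(configured) : Optional.empty();
        }
    }

    public static void main(String[] args) {
        TopologyView topology = new TopologyView();
        DeferringSnitch snitch =
            new DeferringSnitch("10.0.0.1", new Location("dc1", "rack1"), topology);

        System.out.println(snitch.locationOf("10.0.0.1")); // configured value, pre-registration
        topology.register("10.0.0.1", new Location("dc2", "rack9"));
        System.out.println(snitch.locationOf("10.0.0.1")); // registered value from now on
        System.out.println(snitch.locationOf("10.0.0.2")); // unknown peer: empty
    }
}
```

In this sketch, editing the configured location after registration has no 
effect on what is reported, which is the property that keeps tool output from 
drifting away from the registered state.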

A complication is that topology is used when opening connections to peers, as 
certain internode connection settings vary at the DC level, so at the 
time of connecting we want to check the location of the remote peer. Usually, 
this is available from {{ClusterMetadata}}, but in the case of a brand new 
node joining the cluster, nothing is known a priori. The current implementation 
assumes that the snitch will know the location of the new node ahead of time, 
but in practice this is often not the case (though with variants of 
{{PropertyFileSnitch}} it _should_ be), and the remote node is temporarily 
assigned a default DC. This is problematic as it can cause the internode 
connection settings which depend on DC to be set incorrectly. Internode 
connections are long-lived, and any established while the DC is unknown 
(potentially with incorrect config) will persist indefinitely. This particular 
issue is not directly related to TCM and is present in earlier versions.
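
The failure mode above can be illustrated with a small, self-contained sketch. 
Everything in it is hypothetical and simplified: {{ConnectionPool}}, 
{{Settings}}, the {{UNKNOWN_DC}} placeholder and the "compress only cross-DC 
traffic" rule are assumptions made for the example, not Cassandra's actual 
classes or defaults.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaleConnectionSettingsSketch {
    /** Hypothetical placeholder used when a peer's DC is not yet known. */
    static final String DEFAULT_DC = "UNKNOWN_DC";

    /** Hypothetical DC-dependent setting: compress only cross-DC traffic. */
    record Settings(boolean compress) {}

    static class ConnectionPool {
        private final String localDc;
        private final Map<String, String> knownDcs = new ConcurrentHashMap<>();
        private final Map<String, Settings> open = new ConcurrentHashMap<>();

        ConnectionPool(String localDc) {
            this.localDc = localDc;
        }

        /** Records a peer's DC once it becomes known. */
        void learnDc(String peer, String dc) {
            knownDcs.put(peer, dc);
        }

        /** Settings as they would be computed right now for this peer. */
        Settings settingsFor(String peer) {
            String dc = knownDcs.getOrDefault(peer, DEFAULT_DC);
            return new Settings(!dc.equals(localDc));
        }

        /** Settings are fixed when the connection is first opened, then reused. */
        Settings connect(String peer) {
            return open.computeIfAbsent(peer, this::settingsFor);
        }
    }

    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool("dc1");

        Settings early = pool.connect("10.0.0.9"); // DC unknown at connect time
        pool.learnDc("10.0.0.9", "dc1");           // peer turns out to be local

        System.out.println("established: " + early);
        System.out.println("reused:      " + pool.connect("10.0.0.9"));
        System.out.println("correct:     " + pool.settingsFor("10.0.0.9"));
    }
}
```

Here the connection opened before the DC was known keeps the settings derived 
from the placeholder DC, even though recomputing them after the DC is learned 
would give a different result.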


