Since CEP-21, the source of truth for topology info (a node's datacenter & rack) is ClusterMetadata. Each node provides its dc/rack when it registers itself with the cluster prior to joining and this information is effectively immutable (for now). This significantly reduces the scope of IEndpointSnitch's responsibilities and CASSANDRA-19488 proposes a refactoring which breaks out the remaining functionality into a handful of new providers (full details can be found in the JIRA).
This is one of the more widely used extension points in Cassandra, so we wanted to bring it to the mailing list in addition to discussing on JIRA.

To be clear, no operator intervention should be necessary when upgrading. To ease migration onto the new config and to allow us to deprecate snitches in a controlled way, it will remain fully supported to configure nodes using the endpoint_snitch setting in yaml. A SnitchAdapter acts as a facade in this case, presenting the new interfaces to calling code while delegating to the legacy snitch. Most of the in-tree snitches have been refactored to extract implementations of the new interfaces so that their functionality can be used via the new configuration.

Some questions for the list:

* We have added two new methods to IEndpointSnitch, which have essentially been pulled up from Ec2MultiRegionSnitch and GossipingPropertyFileSnitch to support ReconnectableSnitchHelper. Currently, these are added as default methods on the interface so that out-of-tree snitches remain binary compatible. However, it would be safer to break binary compatibility in this case to ensure that any custom snitches out in the wild must be updated and their behaviour is preserved. So the question is, would there be objections to extending the (now deprecated) IEndpointSnitch interface in this way?

* Python dtests and config are currently unchanged (aside from some error message checks) so these are exercising the path whereby the clusters are configured with endpoint_snitch and make use of the compatibility adapter. In-jvm upgrade dtests switch from old to new style configuration on upgrade to 5.1 (though in truth, these don't exercise snitches much at all as a special dtest snitch is used throughout). cassandra-latest.yaml contains the new settings, while cassandra.yaml and the variations in test/conf retain the old style settings. How should we approach updating these configs so that we maintain a balance between test coverage, compatibility during upgrades and encouraging the use of new style config in fresh clusters?
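
For anyone who wants a concrete picture of the two mechanisms mentioned above (the adapter facade, and default methods on a legacy interface), here is a minimal, self-contained Java sketch. Every name in it is hypothetical and heavily simplified: LegacySnitch, LocationProvider, Adapter, CustomSnitch and preferLocalConnections() are stand-ins made up for illustration, not the actual types or method names in the CASSANDRA-19488 patch, and endpoints are plain strings.

```java
public class SnitchAdapterSketch {
    // Hypothetical stand-in for the legacy extension point.
    public interface LegacySnitch {
        String getDatacenter(String endpoint);
        String getRack(String endpoint);

        // Hypothetical method added later as a default: implementations
        // compiled against the old interface still load, but they silently
        // inherit this behaviour unless their authors update them.
        default boolean preferLocalConnections() {
            return false;
        }
    }

    // Hypothetical stand-in for one of the new, narrower provider interfaces.
    public interface LocationProvider {
        String location(String endpoint);
    }

    // Facade: presents the new interface to callers, delegates to the legacy snitch.
    public static final class Adapter implements LocationProvider {
        private final LegacySnitch delegate;

        public Adapter(LegacySnitch delegate) {
            this.delegate = delegate;
        }

        @Override
        public String location(String endpoint) {
            return delegate.getDatacenter(endpoint) + ":" + delegate.getRack(endpoint);
        }
    }

    // An "out-of-tree" snitch written before the default method existed.
    public static final class CustomSnitch implements LegacySnitch {
        @Override
        public String getDatacenter(String endpoint) {
            return "dc1";
        }

        @Override
        public String getRack(String endpoint) {
            return "rack1";
        }
    }
}
```

In this sketch CustomSnitch keeps working through the Adapter without any change, which is the compatibility benefit of a default method. The cost is that it picks up preferLocalConnections() returning false without its author ever being prompted to decide. Declaring the method abstract instead would force that decision at the price of breaking binary compatibility, which is the trade-off behind the first question.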