[
https://issues.apache.org/jira/browse/CASSANDRA-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paulo Motta updated CASSANDRA-16138:
------------------------------------
Resolution: Fixed
Status: Resolved (was: Triage Needed)
> Refactor Local Ring Management
> ------------------------------
>
> Key: CASSANDRA-16138
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16138
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Cluster/Membership, Feature/Virtual Nodes,
> Legacy/Distributed Metadata
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Normal
> Attachments: vnode-lifecyle.png
>
>
> Token ring management is one of the most critical parts of Cassandra, yet one
> of the most overlooked. Its problems include, but are not limited to:
> * Complexity (e.g. [pending range
> calculation|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/TokenMetadata.java#L878])
>
> * Inefficiency (e.g. [pending range
> calculation|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/TokenMetadata.java#L878],
> [AbstractReplicationStrategy.getAddressReplicas|https://github.com/apache/cassandra/blob/33eada06a6dd3529da644377dba180795f522176/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L233])
> * Prone to race conditions (e.g.
> [here|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/ReplicaLayout.java#L198])
> * Poor modularity and consistency (e.g. natural replicas computed from
> [NetworkTopologyStrategy|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java]
> and pending replicas computed from
> [TokenMetadata|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/TokenMetadata.java#L1271])
> * Insufficient testing (due to complexity and poor modularity)
> These limitations make it difficult to reliably fix bugs such as the lack of
> proper support for node replacement with the same IP address
> (CASSANDRA-12344), to add improvements such as safe ring membership changes
> and networking via identity instead of IP (CASSANDRA-15823), or to add new
> features such as dynamic virtual nodes.
> This ticket aims to refactor the ring management sub-module (namely
> TokenMetadata and related classes) to address most of its current limitations
> and so support further improvements and new features.
> Some of the requirements of the proposed refactoring are:
> # Make node-local ring representation fully immutable and snapshottable.
> # Add content-based versioning to uniquely identify a ring snapshot
> throughout the cluster.
> # Make token ring management vnode-centric to support membership operations
> on individual tokens and simplify token assignment calculations.
> # Primarily identify ring endpoints by node ID to decouple a node’s identity
> from its IP address.
> # Add a local publish/subscribe mechanism for ring change notifications, so
> other modules can subscribe to it and receive the newest snapshot of the ring
> after membership changes.
> # Add a testing framework to verify correctness of ring membership operations.
> # Ensure the refactored sub-module does not change current behavior via
> comprehensive testing.
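
As a rough illustration of how requirements 1, 2, 4 and 5 above might fit together, here is a minimal sketch. It is not taken from the Cassandra code base or from any patch on this ticket: the class names RingSnapshot and RingPublisher, the use of a Long token and a UUID node ID, and the choice of SHA-256 over the sorted token-to-owner entries as the content-based version are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.UUID;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical immutable ring snapshot: maps each token to the ID of its owning node. */
final class RingSnapshot {
    private final NavigableMap<Long, UUID> owners;
    private final String version;

    RingSnapshot(Map<Long, UUID> owners) {
        // Defensive sorted copy, wrapped read-only, so the snapshot can never change.
        this.owners = Collections.unmodifiableNavigableMap(new TreeMap<>(owners));
        this.version = digest(this.owners);
    }

    NavigableMap<Long, UUID> owners() { return owners; }

    /** Content-based version: equal ring contents yield equal versions on any node. */
    String version() { return version; }

    /** Returns a new snapshot with one token (vnode) assigned; this snapshot is untouched. */
    RingSnapshot withToken(long token, UUID nodeId) {
        TreeMap<Long, UUID> copy = new TreeMap<>(owners);
        copy.put(token, nodeId);
        return new RingSnapshot(copy);
    }

    // Assumption for this sketch: hash the entries in token order with SHA-256.
    private static String digest(NavigableMap<Long, UUID> owners) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (Map.Entry<Long, UUID> e : owners.entrySet()) {
                md.update((e.getKey() + "=" + e.getValue() + ";").getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }
}

/** Hypothetical local publish/subscribe hub for ring changes. */
final class RingPublisher {
    private final List<Consumer<RingSnapshot>> subscribers = new CopyOnWriteArrayList<>();
    private volatile RingSnapshot current = new RingSnapshot(Collections.emptyMap());

    void subscribe(Consumer<RingSnapshot> subscriber) { subscribers.add(subscriber); }

    RingSnapshot current() { return current; }

    /** Applies a membership change and hands the newest snapshot to every subscriber. */
    synchronized void addToken(long token, UUID nodeId) {
        current = current.withToken(token, nodeId);
        for (Consumer<RingSnapshot> subscriber : subscribers) {
            subscriber.accept(current);
        }
    }
}
```

Because every change produces a new snapshot, a reader holding an older snapshot keeps a consistent view of the ring, and two nodes can compare version strings to check whether they see the same ring without exchanging its full contents.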