[
https://issues.apache.org/jira/browse/CASSANDRA-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381968#comment-16381968
]
Kenneth Brotman edited comment on CASSANDRA-14265 at 3/1/18 1:08 PM:
---------------------------------------------------------------------
From the Dynamo paper:
The basic consistent hashing algorithm presents some challenges. First, the
random position assignment of each node on the ring leads to non-uniform data
and load distribution. Second, the basic algorithm is oblivious to the
heterogeneity in the performance of nodes. To address these issues, Dynamo uses
a variant of consistent hashing (similar to the one used in [10, 20]): instead
of mapping a node to a single point in the circle, each node gets assigned to
multiple points in the ring. To this end, Dynamo uses the concept of “virtual
nodes”. A virtual node looks like a single node in the system, but each node
can be responsible for more than one virtual node. Effectively, when a new node
is added to the system, it is assigned multiple positions (henceforth,
“tokens”) in the ring.
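As a rough illustration of the passage above, here is a minimal Python sketch of a ring in which each node owns several tokens. It is not Dynamo's or Cassandra's implementation. The names Ring, token_for, add_node and owner_of are made up for this example, MD5 is an arbitrary choice of hash, and tokens are derived by hashing "node#i" so the example is repeatable, whereas the paper describes positions as assigned randomly.

```python
import bisect
import hashlib


def token_for(label: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hash ring where one node holds many tokens."""

    def __init__(self) -> None:
        self._tokens = []  # sorted ring positions
        self._owner = {}   # token -> name of the node that holds it

    def add_node(self, node: str, num_tokens: int) -> None:
        # Assign the node several positions ("tokens") instead of a single point.
        for i in range(num_tokens):
            token = token_for(f"{node}#{i}")
            self._owner[token] = node
            bisect.insort(self._tokens, token)

    def owner_of(self, key: str) -> str:
        # Walk clockwise: take the first token greater than the key's position,
        # wrapping around to the start of the ring if there is none.
        idx = bisect.bisect_right(self._tokens, token_for(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


ring = Ring()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name, num_tokens=8)

print(ring.owner_of("some-partition-key"))
```

With a single token per node the ring would be split into as many ranges as there are nodes. With eight tokens per node it is split into 24 smaller ranges that are interleaved among the three nodes.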
Using virtual nodes has the following advantages:
• If a node becomes unavailable (due to failures or routine maintenance), the
load handled by this node is evenly dispersed across the remaining available
nodes.
• When a node becomes available again, or a new node is added to the system,
the newly available node accepts a roughly equivalent amount of load from each
of the other available nodes.
• The number of virtual nodes that a node is responsible for can be decided based on
its capacity, accounting for heterogeneity in the physical infrastructure.
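The advantages listed above can be checked with a small experiment. The Python sketch below is only an illustration under stated assumptions: build_ring and assign are hypothetical helpers, the node names and vnode counts are invented, and tokens come from an MD5 hash of "node#i" rather than from random assignment. It gives one node twice as many virtual nodes as the others, then removes a node and looks at where its keys go.

```python
import bisect
import hashlib
from collections import Counter


def token_for(label: str) -> int:
    """Map a string to a ring position (MD5 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


def build_ring(vnodes_per_node: dict) -> list:
    """Return sorted (token, node) pairs; a node contributes one token per vnode."""
    return sorted(
        (token_for(f"{node}#{i}"), node)
        for node, count in vnodes_per_node.items()
        for i in range(count)
    )


def assign(ring: list, keys: list) -> dict:
    """Map each key to the node holding the first token clockwise from the key."""
    tokens = [token for token, _ in ring]
    return {
        key: ring[bisect.bisect_right(tokens, token_for(key)) % len(ring)][1]
        for key in keys
    }


keys = [f"key-{i}" for i in range(10_000)]

# Node "d" is assumed to have twice the capacity, so it gets twice the vnodes.
vnodes = {"a": 64, "b": 64, "c": 64, "d": 128}
before = assign(build_ring(vnodes), keys)

# Simulate node "c" becoming unavailable by rebuilding the ring without it.
remaining = {node: count for node, count in vnodes.items() if node != "c"}
after = assign(build_ring(remaining), keys)

# Where did the keys that "c" used to own end up?
moved = Counter(after[key] for key in keys if before[key] == "c")
# Keys owned by the other nodes should not move at all.
untouched = all(after[key] == before[key] for key in keys if before[key] != "c")

print("load before:", Counter(before.values()))
print("keys taken over from c:", moved)
print("other keys untouched:", untouched)
```

Because every range that "c" owned is followed on the ring by a token of some other node, its keys are spread over several remaining nodes instead of landing on a single neighbor, and keys owned by the other nodes stay where they were.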
was (Author: kenbrotman):
From the Dynamo paper:
The basic consistent hashing algorithm presents some challenges. First, the
random position assignment of each node on the ring leads to non-uniform data
and load distribution. Second, the basic algorithm is oblivious to the
heterogeneity in the performance of nodes. To address these issues, Dynamo uses
a variant of consistent hashing (similar to the one used in [10, 20]): instead
of mapping a node to a single point in the circle, each node gets assigned to
multiple points in the ring. To this end, Dynamo uses the concept of “virtual
nodes”. A virtual node looks like a single node in the system, but each node
can be responsible for more than one virtual node. Effectively, when a new node
is added to the system, it is assigned multiple positions (henceforth,
“tokens”) in the ring. The process of fine-tuning Dynamo’s partitioning scheme
is discussed in Section 6. Using virtual nodes has the following advantages: •
If a node becomes unavailable (due to failures or routine maintenance), the
load handled by this node is evenly dispersed across the remaining available
nodes. • When a node becomes available again, or a new node is added to the
system, the newly available node accepts a roughly equivalent amount of load
from each of the other available nodes. • The number of virtual nodes that a
node is responsible for can be decided based on its capacity, accounting for
heterogeneity in the physical infrastructure.
> Add explanation of vNodes to online documentation
> -------------------------------------------------
>
> Key: CASSANDRA-14265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14265
> Project: Cassandra
> Issue Type: Improvement
> Components: Documentation and Website
> Reporter: Kenneth Brotman
> Priority: Major
>
> There are many inquiries on the mailing list about how vNodes work and how to
> configure them properly. We should add an explanation to the documentation.