[ 
https://issues.apache.org/jira/browse/CASSANDRA-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381968#comment-16381968
 ] 

Kenneth Brotman edited comment on CASSANDRA-14265 at 3/1/18 1:08 PM:
---------------------------------------------------------------------

From the Dynamo paper:

The basic consistent hashing algorithm presents some challenges. First, the 
random position assignment of each node on the ring leads to non-uniform data 
and load distribution. Second, the basic algorithm is oblivious to the 
heterogeneity in the performance of nodes. To address these issues, Dynamo uses 
a variant of consistent hashing (similar to the one used in [10, 20]): instead 
of mapping a node to a single point in the circle, each node gets assigned to 
multiple points in the ring. To this end, Dynamo uses the concept of “virtual 
nodes”. A virtual node looks like a single node in the system, but each node 
can be responsible for more than one virtual node. Effectively, when a new node 
is added to the system, it is assigned multiple positions (henceforth, 
“tokens”) in the ring. 
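
As a rough illustration of the passage above (not Dynamo's or Cassandra's actual implementation), the following Python sketch assigns several tokens to each node and looks up the owner of a key by walking clockwise around the ring. The node names, the use of MD5 as the hash, the "name#index" token labels, and the default of 8 tokens per node are all assumptions chosen for the example.

```python
import bisect
import hashlib


def token_for(label: str) -> int:
    """Map a string to a ring position (MD5 is an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


def build_ring(nodes, tokens_per_node=8):
    """Return a sorted list of (token, node) pairs; each node gets several tokens."""
    ring = []
    for node in nodes:
        for i in range(tokens_per_node):
            # Hypothetical labelling scheme: one token per "node#index" string.
            ring.append((token_for(f"{node}#{i}"), node))
    ring.sort()
    return ring


def owner(ring, key: str) -> str:
    """Find the first token at or after the key's position, wrapping past the end."""
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[idx][1]


ring = build_ring(["node-a", "node-b", "node-c"])
print(len(ring))               # 3 nodes x 8 tokens each
print(owner(ring, "user:42"))  # one of the three node names
```

Because each physical node appears at many positions, its share of the ring is the sum of many small ranges rather than one large range, which is what smooths out the distribution.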

Using virtual nodes has the following advantages:

• If a node becomes unavailable (due to failures or routine maintenance), the 
load handled by this node is evenly dispersed across the remaining available 
nodes.

• When a node becomes available again, or a new node is added to the system, 
the newly available node accepts a roughly equivalent amount of load from each 
of the other available nodes. 

• The number of virtual nodes that a node is responsible for can be decided 
based on its capacity, accounting for heterogeneity in the physical 
infrastructure.
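
The advantages listed above can be checked with a small experiment. This is only a sketch under stated assumptions: the token counts (64 or 128 per node), the node names, the MD5 hash, and the synthetic keys are all made up for illustration and are not taken from Dynamo or Cassandra. It gives one node twice as many tokens as the others, removes a node, and reports where the removed node's keys end up.

```python
import hashlib
from bisect import bisect_left
from collections import Counter


def token_for(label):
    """Map a string to a ring position (MD5 is an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


def build_ring(capacity):
    """capacity maps node name -> token count (more tokens for larger nodes)."""
    return sorted(
        (token_for(f"{node}#{i}"), node)
        for node, count in capacity.items()
        for i in range(count)
    )


def assign(capacity, keys):
    """Return a dict mapping each key to the node that owns it."""
    ring = build_ring(capacity)
    tokens = [t for t, _ in ring]
    return {k: ring[bisect_left(tokens, token_for(k)) % len(ring)][1] for k in keys}


keys = [f"key-{i}" for i in range(20000)]
# Hypothetical cluster: node-d is assumed to have twice the capacity of the rest.
capacity = {"node-a": 64, "node-b": 64, "node-c": 64, "node-d": 128}

before = assign(capacity, keys)
after = assign({n: c for n, c in capacity.items() if n != "node-b"}, keys)
moved = [k for k in keys if before[k] != after[k]]

print(Counter(before.values()))          # node-d is expected to hold the largest share
print(Counter(after[k] for k in moved))  # where node-b's keys went after its removal
```

Only keys that belonged to the removed node change owner, and they are spread over several surviving nodes rather than all landing on a single neighbour.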


was (Author: kenbrotman):
From the Dynamo paper:

The basic consistent hashing algorithm presents some challenges. First, the 
random position assignment of each node on the ring leads to non-uniform data 
and load distribution. Second, the basic algorithm is oblivious to the 
heterogeneity in the performance of nodes. To address these issues, Dynamo uses 
a variant of consistent hashing (similar to the one used in [10, 20]): instead 
of mapping a node to a single point in the circle, each node gets assigned to 
multiple points in the ring. To this end, Dynamo uses the concept of “virtual 
nodes”. A virtual node looks like a single node in the system, but each node 
can be responsible for more than one virtual node. Effectively, when a new node 
is added to the system, it is assigned multiple positions (henceforth, 
“tokens”) in the ring. The process of fine-tuning Dynamo’s partitioning scheme 
is discussed in Section 6. Using virtual nodes has the following advantages: • 
If a node becomes unavailable (due to failures or routine maintenance), the 
load handled by this node is evenly dispersed across the remaining available 
nodes. • When a node becomes available again, or a new node is added to the 
system, the newly available node accepts a roughly equivalent amount of load 
from each of the other available nodes.  • The number of virtual nodes that a 
node is responsible for can be decided based on its capacity, accounting for 
heterogeneity in the physical infrastructure.

> Add explanation of vNodes to online documentation
> -------------------------------------------------
>
>                 Key: CASSANDRA-14265
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14265
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Documentation and Website
>            Reporter: Kenneth Brotman
>            Priority: Major
>
> A lot of inquiries on the mailing list about how vNodes work and how to set 
> configuration properly.  We should add an explanation to the documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
