[
https://issues.apache.org/jira/browse/CASSANDRA-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382007#comment-16382007
]
Kenneth Brotman commented on CASSANDRA-14265:
---------------------------------------------
From the DataStax web site at:
[https://docs.datastax.com/en/dse/5.1/dse-arch/datastax_enterprise/dbArch/archDataDistributeVnodesUsing.html?hl=vnodes]
h1. Virtual nodes
Virtual nodes (vnodes) distribute data across nodes at a
finer granularity than can be easily achieved using a single-token
architecture. Virtual nodes simplify many tasks in DataStax Enterprise:
* Tokens are automatically calculated and assigned to each node.
* A cluster is automatically rebalanced when adding or removing nodes. When a
node joins the cluster, it assumes responsibility for an even portion of data
from the other nodes in the cluster. If a node fails, the load is spread evenly
across other nodes in the cluster.
* Rebuilding a dead node is faster because it involves every other node in the
cluster.
* The proportion of vnodes assigned to each machine in a
cluster can be configured, so smaller and larger computers can be used in
building a cluster.
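The rebalancing and proportional-ownership points in the list above can be illustrated with a small simulation. This is only a sketch, not Cassandra's actual token allocator: it assumes tokens are drawn uniformly at random from the Murmur3Partitioner range (-2^63 to 2^63 - 1), and the node names, token counts, and helper functions are hypothetical.

```python
import random

MIN_TOKEN = -2**63   # assumed Murmur3Partitioner lower bound
RING_SIZE = 2**64    # number of distinct token values on the ring


def random_tokens(rng, count):
    """Draw `count` random tokens, standing in for automatic vnode assignment."""
    return [rng.randrange(MIN_TOKEN, MIN_TOKEN + RING_SIZE) for _ in range(count)]


def ownership(token_map):
    """Map each node to the fraction of the ring it owns.

    token_map: dict of node name -> list of tokens held by that node.
    A token owns the range (previous token, token], wrapping around the ring.
    """
    ring = sorted((t, node) for node, tokens in token_map.items() for t in tokens)
    owned = {node: 0 for node in token_map}
    for i, (token, node) in enumerate(ring):
        previous = ring[i - 1][0]  # index -1 wraps to the last token
        span = (token - previous) % RING_SIZE
        if len(ring) == 1:
            span = RING_SIZE  # a lone token owns the whole ring
        owned[node] += span
    return {node: size / RING_SIZE for node, size in owned.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical cluster: the larger machine is given more vnodes.
    cluster = {
        "small": random_tokens(rng, 64),
        "medium": random_tokens(rng, 128),
        "large": random_tokens(rng, 256),
    }
    before = ownership(cluster)
    cluster["new"] = random_tokens(rng, 128)
    after = ownership(cluster)
    for node in cluster:
        print(f"{node:>6}: before={before.get(node, 0.0):.3f} after={after[node]:.3f}")
```

In this model a joining node's tokens each split an existing range, so the new node takes a slice from many existing nodes rather than from a single neighbor, and a node's share grows with the number of tokens it holds.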
To convert an existing single-token architecture cluster to
vnodes, see [Enabling virtual nodes on an existing
production
cluster|https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configVnodesProduction.html].
h2. Distributing data using vnodes
In single-token architecture clusters, you must calculate and assign a [single
token|https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/production/singleTokenArchitecture.html]
to each node in a cluster. Each token determines the node's position in the
cluster (or ring) and its portion of data according to its hash value.
Vnodes allow each node to own a large number of small
[partition
ranges|https://docs.datastax.com/en/glossary/doc/glossary/gloss_partition_range.html]
distributed throughout the cluster. Although vnodes use
consistent hashing to distribute data, using them doesn't require token
generation and assignment.
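For comparison, the manual token calculation that a single-token architecture requires can be sketched as follows. This assumes the Murmur3Partitioner token range of -2^63 to 2^63 - 1 and simply spaces one token per node evenly around the ring; the function name is hypothetical, and the output is what an operator would otherwise place in each node's {{initial_token}} setting.

```python
MIN_TOKEN = -2**63   # assumed Murmur3Partitioner lower bound
RING_SIZE = 2**64    # number of distinct token values on the ring


def evenly_spaced_tokens(node_count):
    """Return one token per node, spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [MIN_TOKEN + (i * RING_SIZE) // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Hypothetical six-node cluster.
    for node, token in enumerate(evenly_spaced_tokens(6), start=1):
        print(f"node {node}: initial_token = {token}")
```

Adding or removing a node changes the even spacing, so every token has to be recalculated and nodes moved. This is the work that vnodes avoid.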
The top portion of the graphic shows a cluster without
vnodes. In the single-token architecture paradigm, each
node is assigned a single token that represents a location in the ring. Each
node stores data determined by mapping the [partition
key|https://docs.datastax.com/en/glossary/doc/glossary/gloss_partition_key.html]
to a token value within a range from the previous node to its assigned value.
Each node also contains copies of each row from other nodes in the cluster. For
example, if the replication factor is 3, range E replicates to nodes 5, 6, and
1. A node owns exactly one contiguous partition range in the ring space.
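The replication example above (range E stored on nodes 5, 6, and 1) can be reproduced with a toy ring. This is a minimal sketch that assumes replicas are chosen by walking the ring clockwise from the owning node and ignores racks and datacenters; the token values 100 to 600 and the function name are invented for illustration.

```python
from bisect import bisect_left

# Hypothetical six-node ring: (token, node number), sorted by token.
# Node 5 owns range E = (400, 500].
RING = [(100, 1), (200, 2), (300, 3), (400, 4), (500, 5), (600, 6)]


def replicas_for_token(ring, token, replication_factor):
    """Return the nodes that store `token`, primary owner first."""
    tokens = [t for t, _ in ring]
    # A node owns (previous token, own token]; tokens past the last wrap to the first.
    start = bisect_left(tokens, token) % len(ring)
    replicas = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


if __name__ == "__main__":
    print(replicas_for_token(RING, 450, 3))  # a token inside range E
```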
The bottom portion of the graphic shows a ring with
vnodes. Within a cluster, vnodes
are randomly selected and non-contiguous. The placement of a row is determined
by the hash of the partition key within many smaller partition ranges belonging
to each node.
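Row placement with vnodes can be sketched the same way, with each node holding several tokens instead of one. The code below is illustrative only: it uses MD5 from the Python standard library as a stand-in hash (the default Cassandra partitioner is Murmur3, which is not in the standard library), draws tokens at random, and uses hypothetical node names and helper functions.

```python
import hashlib
import random
from bisect import bisect_left


def token_for_key(partition_key):
    """Hash a partition key to a signed 64-bit token (MD5 used as a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_ring(nodes, tokens_per_node, seed=0):
    """Give every node `tokens_per_node` random tokens; return a sorted ring."""
    rng = random.Random(seed)
    ring = [
        (rng.randrange(-2**63, 2**63), node)
        for node in nodes
        for _ in range(tokens_per_node)
    ]
    ring.sort()
    return ring


def owner_of(ring, partition_key):
    """Return the node whose vnode range contains the key's token."""
    tokens = [t for t, _ in ring]
    index = bisect_left(tokens, token_for_key(partition_key)) % len(ring)
    return ring[index][1]


if __name__ == "__main__":
    ring = build_ring(["node1", "node2", "node3"], tokens_per_node=8)
    for key in ("user:1", "user:2", "user:3"):
        print(key, "->", owner_of(ring, key))
    # Consecutive ranges usually belong to different nodes (non-contiguous ownership).
    print([node for _, node in ring])
```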
> Add explanation of vNodes to online documentation
> -------------------------------------------------
>
> Key: CASSANDRA-14265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14265
> Project: Cassandra
> Issue Type: Improvement
> Components: Documentation and Website
> Reporter: Kenneth Brotman
> Priority: Major
>
> There are many inquiries on the mailing list about how vNodes work and how to
> configure them properly. We should add an explanation to the documentation.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)