[jira] [Commented] (CASSANDRA-14265) Add explanation of vNodes to online documentation

Kenneth Brotman (JIRA) Thu, 01 Mar 2018 05:48:50 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382011#comment-16382011
 ]


Kenneth Brotman commented on CASSANDRA-14265:
---------------------------------------------

From DataStax at 
[https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configVnodes.html]


h1. Virtual node (vnode) configuration
Describes virtual nodes (vnodes), how to use them in different types of 
datacenters, and how to disable them.

Virtual nodes simplify many tasks in DataStax Enterprise: they eliminate the 
need to calculate and assign tokens to determine partition ranges, and they 
make it easier to rebalance the cluster when adding or removing nodes and to 
replace dead nodes. For a complete description of virtual nodes and how they 
work, see 
[Virtual nodes|https://issues.apache.org/en/dse/5.1/dse-arch/datastax_enterprise/dbArch/archDataDistributeVnodesUsing.html].

DataStax Enterprise requires the same token architecture on all nodes in a 
datacenter: the nodes must either all be vnode-enabled or all use single-token 
architecture. Across the entire cluster, datacenter architecture can vary. For 
example, a single cluster can contain:
 * A transaction-only datacenter running OLTP.
 * A single-token architecture analytics datacenter (no vnodes).
 * A search datacenter with vnodes.
h2. Guidelines for using virtual nodes
Whether virtual nodes (vnodes) are enabled or disabled depends on the initial 
[cassandra.yaml|#configVnodes__cassandrayaml] settings. There are two methods 
of distributing token ranges. DataStax recommends using the allocation 
algorithm. Use the same method on all systems in the datacenter.

*Allocation algorithm*
Optimizes token range distribution between nodes and racks in the datacenter 
based on the keyspace replication factor 
({{allocate_tokens_for_local_replication_factor}}) of the datacenter. 
Distributes the token ranges proportionately using the {{num_tokens}} 
settings. All systems in the datacenter should have the same {{num_tokens}} 
setting unless hardware performance varies between systems. To distribute more 
of the workload to higher-performance hardware, increase the number of tokens 
for those systems.

The allocation algorithm efficiently balances the workload using fewer tokens; 
when systems are added to a datacenter, the algorithm maintains the balance. 
Using a higher number of tokens distributes the workload more evenly, but also 
significantly increases token management overhead.
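
As a rough illustration of the proportional distribution described above, the following sketch estimates each node's share of token ranges from its {{num_tokens}} value. It assumes the share is simply a node's {{num_tokens}} divided by the datacenter total, which is only an approximation of what the allocation algorithm achieves. The function name and node names are hypothetical.

```python
from fractions import Fraction


def expected_token_share(num_tokens_by_node):
    """Return each node's approximate share of token ranges.

    Assumption: a node's share is proportional to its num_tokens value.
    """
    total = sum(num_tokens_by_node.values())
    if total <= 0:
        raise ValueError("at least one node must have num_tokens > 0")
    return {node: Fraction(n, total) for node, n in num_tokens_by_node.items()}


# Hypothetical datacenter: node-c has faster hardware and twice the tokens.
shares = expected_token_share({"node-a": 8, "node-b": 8, "node-c": 16})
for node, share in shares.items():
    print(f"{node}: {float(share):.0%}")
```

With these example values, node-c is expected to own about half of the token ranges and the other two nodes about a quarter each.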
CAUTION:
When adding multiple nodes to the cluster using the allocation algorithm, 
ensure that nodes are added one at a time. If nodes are added concurrently, the 
algorithm assigns the same tokens to different nodes.
DataStax recommends using *8* vnodes (tokens). This distributes the workload 
between systems with a ~10% variance and has minimal impact on performance. Set 
the number of vnode tokens ({{num_tokens}}) based on the workload distribution 
requirements of the datacenter:
Allocation algorithm workload distribution variance:
||Replication factor||4 vnodes (tokens)||8 vnodes (tokens)||64 vnodes (tokens)||128 vnodes (tokens)||
|2|~17.5%|~12.5%|~3%|~1%|
|3|~14%|~10%|~2%|~1%|
|5|~11%|~7%|~1%|~1%|
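
One way to use the variance table is as a lookup: given a replication factor and the largest variance you can tolerate, pick the smallest tabulated {{num_tokens}} that qualifies. The sketch below encodes only the approximate values from the table; the function name is hypothetical, and values between or outside the tabulated points are not estimated.

```python
# Approximate workload variance (percent) copied from the table,
# keyed by replication factor, then by num_tokens.
VARIANCE_PCT = {
    2: {4: 17.5, 8: 12.5, 64: 3.0, 128: 1.0},
    3: {4: 14.0, 8: 10.0, 64: 2.0, 128: 1.0},
    5: {4: 11.0, 8: 7.0, 64: 1.0, 128: 1.0},
}


def smallest_num_tokens(replication_factor, max_variance_pct):
    """Return the smallest tabulated num_tokens whose variance is within
    max_variance_pct, or None if no tabulated value qualifies."""
    row = VARIANCE_PCT.get(replication_factor)
    if row is None:
        raise KeyError(f"no table row for replication factor {replication_factor}")
    for num_tokens in sorted(row):
        if row[num_tokens] <= max_variance_pct:
            return num_tokens
    return None


print(smallest_num_tokens(3, 10))   # 8
print(smallest_num_tokens(2, 5))    # 64
print(smallest_num_tokens(5, 0.5))  # None
```

Preferring the smallest qualifying value reflects the trade-off stated earlier: more tokens even out the workload but increase token management overhead.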
h2. Enabling vnodes

In the [cassandra.yaml|#configVnodes__cassandrayaml] file:
 # Uncomment {{num_tokens}} and set the required number of tokens.
 # (Recommended) To use the allocation algorithm, uncomment {{allocate_tokens_for_local_replication_factor}} and set it to the target replication factor for the keyspaces in the datacenter. If the replication factor varies, alternate between the replication factor (RF) settings.
 # Comment out {{initial_token}} or leave it unset.

To upgrade existing clusters to vnodes, see Enabling virtual nodes on an 
existing production cluster.
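
The three enabling steps can be expressed as a small consistency check. The sketch below is not part of any DataStax or Cassandra tooling: it assumes the cassandra.yaml settings have already been loaded into a plain dict (commented-out settings are simply absent), and the function name is hypothetical.

```python
def check_vnode_settings(settings):
    """Return a list of problems found in a dict of cassandra.yaml settings
    for a vnode-enabled node. An empty list means all checks passed."""
    problems = []

    # Step 1: num_tokens must be uncommented and set.
    num_tokens = settings.get("num_tokens")
    if not isinstance(num_tokens, int) or num_tokens < 1:
        problems.append("num_tokens must be set to a positive integer")

    # Step 2 (recommended): the allocation algorithm needs a target RF.
    if settings.get("allocate_tokens_for_local_replication_factor") is None:
        problems.append(
            "allocate_tokens_for_local_replication_factor is unset; "
            "the recommended allocation algorithm is not in use"
        )

    # Step 3: initial_token must be commented out or unset.
    if settings.get("initial_token") is not None:
        problems.append("initial_token must be commented out or left unset")

    return problems


good = {"num_tokens": 8, "allocate_tokens_for_local_replication_factor": 3}
bad = {"num_tokens": 8, "initial_token": "0"}
print(check_vnode_settings(good))
print(check_vnode_settings(bad))
```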
h2. Disabling vnodes
Important: If you do not use vnodes, you must make sure that each node is 
responsible for roughly an equal amount of data. To do so, assign each node an 
{{initial_token}} value and calculate the tokens for each datacenter as 
described in 
[Generating tokens|https://issues.apache.org/jira/production/calcTokens.html].
 # In the [cassandra.yaml|#configVnodes__cassandrayaml] file:
 ## Comment out {{num_tokens}} and {{allocate_tokens_for_local_replication_factor}}.
 ## Uncomment {{initial_token}} and set it to 1 or, for a multi-node cluster, to the value of a [generated token|https://issues.apache.org/jira/production/calcTokens.html].
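
For orientation, evenly spaced tokens for a single datacenter can be computed as in the sketch below. It assumes the Murmur3Partitioner, whose token range runs from -2**63 to 2**63 - 1; the function name is hypothetical. Treat it as an illustration only and follow the linked Generating tokens page for the supported procedure, especially for multiple datacenters.

```python
def generate_initial_tokens(node_count):
    """Return evenly spaced initial_token values for node_count nodes.

    Assumption: Murmur3Partitioner, token range -2**63 to 2**63 - 1.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2**64 // node_count  # width of each node's token range
    return [step * i - 2**63 for i in range(node_count)]


tokens = generate_initial_tokens(4)
for i, token in enumerate(tokens):
    print(f"node {i}: initial_token: {token}")
```

Each node then gets one of the printed values as its {{initial_token}}, so that every node is responsible for an equally sized slice of the token range.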
cassandra.yaml
The location of the cassandra.yaml file depends on the type of installation:
|Package installations and Installer-Services installations|/etc/dse/cassandra/cassandra.yaml|
|Tarball installations and Installer-No Services installations|installation_location/resources/cassandra/conf/cassandra.yaml|

> Add explanation of vNodes to online documentation
> -------------------------------------------------
>
>                 Key: CASSANDRA-14265
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14265
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Documentation and Website
>            Reporter: Kenneth Brotman
>            Priority: Major
>
> A lot of inquiries on the mailing list about how vNodes work and how to set 
> configuration properly.  We should add an explanation to the documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

