https://stackoverflow.com/questions/48776589/cassandra-cant-one-use-snapshots-to-rapidly-scale-out-a-cluster/48778179#48778179
So the basic question is this. Suppose one records the tokens of an
existing node, using the following command with the desired node's IP:
nodetool ring | grep ip_address_of_node | awk '{print $NF ","}' | xargs
then takes snapshots on that node,
then transfers the snapshots to a new node (not yet attached to the cluster),
sets initial_token in the new node's cassandra.yaml to the recorded tokens,
sets up the schema to match,
and then has the new node join the cluster.
Would that allow a quick scale-up of nodes and replication of data? I don't
care if the vnode map changes after the initial join, or if data starts
being streamed off as the cluster rebalances.
Is there an issue if the vnode tokens for two nodes are identical? Do they
have to be distinct for each node?
Is the problem that it mucks with the replication factor (RF), since there
would effectively be a higher RF than normal?
Or is this just not much faster in practice than an sstable load?
Basically, I was wondering whether we could use this to double the number
of nodes, with each new node holding an identical copy of an existing
node's data via snapshots, and then later on let Cassandra pare down which
nodes own which data.
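
For reference, here is a minimal sketch of the token-recording step as I
understand it. It assumes the node address is the first column and the token
is the last column of each nodetool ring row. The helper name tokens_for_node
and the sample rows are made up for illustration. Compared with the one-liner
above, it matches the IP exactly (so 10.0.0.1 does not also match 10.0.0.10)
and leaves no trailing comma.

```shell
#!/bin/sh
# Hypothetical helper: print the tokens owned by one node as a
# comma-separated list, in the form expected by initial_token.
# Reads `nodetool ring`-style rows on stdin; $1 is the node's IP address.
tokens_for_node() {
    awk -v ip="$1" '
        $1 == ip { printf "%s%s", sep, $NF; sep = "," }  # exact match on column 1
        END      { print "" }                            # finish the line
    '
}

# Made-up rows in the rough shape of `nodetool ring` output (not real data).
sample_ring='10.0.0.1  rack1  Up  Normal  1.2 GiB  ?  -9100000000000000000
10.0.0.10 rack1  Up  Normal  1.1 GiB  ?  -4500000000000000000
10.0.0.1  rack1  Up  Normal  1.2 GiB  ?  3000000000000000000'

result=$(printf '%s\n' "$sample_ring" | tokens_for_node 10.0.0.1)
echo "initial_token: $result"
```

Against a live node this would presumably be run as
nodetool ring | tokens_for_node ip_address_of_node
with the output pasted into cassandra.yaml on the new node.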