On 3 Feb 2018 02:42, "Kyrylo Lebediev" <kyrylo_lebed...@epam.com> wrote:
In my case I'll need to replace all nodes in the cluster (one-by-one), so
streaming will introduce perceptible overhead.
My question is not about data movement/copy itself, but more about all this
Okay, let's say we stopped the old node and moved its data to the new node.
Once it's started with auto_bootstrap=false, it will be added to the cluster
like a usual node, just skipping the streaming stage, right?
For a cluster with vnodes enabled, when a new node is added its token
ranges are calculated automatically by C* on startup.
So, how will C* know that this new node must be responsible for exactly the
same token ranges as the old node was?
How would the rest of the nodes in the cluster ('peers') figure out that the
old node should be replaced in the ring by the new one?
Do you know of any limitations of this process in the case of C* 2.1.x
with vnodes enabled?
A node stores its tokens and host id in the system.local table. Next time
it starts up, it will use the same tokens as previously and the host id
allows the rest of the cluster to see that it is the same node and ignore
the IP address change. This happens regardless of the auto_bootstrap setting.
Try "select * from system.local" to see what is recorded for the old node.
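
To make the idea concrete, here is a toy model in Python of membership keyed by host id rather than by IP address. It is only a sketch of the behaviour described above, not Cassandra's actual code: the `Ring` class, the `announce` method, and the host ids, addresses and tokens in the example are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ring:
    """Toy view of cluster membership, keyed by host id (hypothetical model)."""
    # host_id -> (ip_address, tokens)
    members: dict = field(default_factory=dict)

    def announce(self, host_id, ip, tokens):
        """Record a node announcing itself and describe what changed."""
        known = self.members.get(host_id)
        # The node always reports the tokens it saved locally, so ownership
        # follows the host id, whatever address the node now has.
        self.members[host_id] = (ip, tuple(tokens))
        if known is None:
            return "new node"
        old_ip, _ = known
        return "same node" if ip == old_ip else "ip changed"


ring = Ring()
ring.announce("host-a", "10.0.0.1", [100, 200])          # first start of the old node
result = ring.announce("host-a", "10.0.0.9", [100, 200])  # same saved state, new machine
print(result)                  # ip changed
print(ring.members["host-a"])  # ('10.0.0.9', (100, 200))
```

The point of the sketch is that nothing in the ring is looked up by address: a node that comes back with the same host id and the same saved tokens keeps its place, and only its address is refreshed.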
When the new node starts up it should log "Using saved tokens" with the
list of numbers. Other nodes should log something like "ignoring IP address
change" for the affected node addresses.
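
If you want to check for these messages mechanically, a small script along the following lines could do it. This is a minimal sketch: the function name is made up, the two marker strings are taken from the wording above and may differ between Cassandra versions, and the sample log lines are invented purely for the example.

```python
def find_startup_markers(
    log_lines,
    markers=("Using saved tokens", "ignoring IP address change"),
):
    """Return {marker: [matching lines]} using case-insensitive substring search."""
    found = {marker: [] for marker in markers}
    for line in log_lines:
        lowered = line.lower()
        for marker in markers:
            if marker.lower() in lowered:
                found[marker].append(line.rstrip("\n"))
    return found


# Invented sample lines, only to show the shape of the result.
sample = [
    "INFO  Using saved tokens [1, 2, 3]\n",
    "INFO  some unrelated message\n",
    "WARN  Ignoring IP address change for a node\n",
]
hits = find_startup_markers(sample)
for marker, lines in hits.items():
    print(marker, "->", len(lines), "match(es)")
```

Any iterable of lines works as input, so an open log file can be passed in directly. Adjust the marker strings to whatever your version actually logs.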
Be careful, though, to make sure that you put the data directory exactly
where the new node expects to find it: otherwise it might just join as a
brand new one, allocating new tokens. As a precaution it helps to ensure
that the system user running the Cassandra process has no permission to
create the data directory: this should stop the startup in case of