Paul is right. It's generally better to set up a new DC and then
decommission the existing DC.
However, if network latency is not a concern and the cost of
running two DCs in parallel is prohibitively high, you could do a
node-by-node replacement, assuming the settings in cassandra.yaml are
compatible. Pay particular attention to endpoint_snitch.
If you are going to do this, it's better to use
"-Dcassandra.replace_address_first_boot=..." instead of repeatedly
decommissioning and adding nodes. Note: this requires the same num_tokens
on the old and the replacement node.
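To make the option concrete, here is a minimal sketch of how it is
commonly passed to the replacement node, assuming the JVM options are
set in cassandra-env.sh (a shell script sourced at startup). The address
192.0.2.10 is a placeholder for the old node's address, not a value from
this thread.

```shell
# Placeholder address of the node being replaced (assumption, not from the thread).
OLD_NODE_ADDRESS="192.0.2.10"

# Append the replacement option to the JVM options, e.g. at the end of
# cassandra-env.sh on the replacement node, before its first start.
JVM_OPTS="${JVM_OPTS:-} -Dcassandra.replace_address_first_boot=${OLD_NODE_ADDRESS}"
```

This would be set before the replacement node's first start, with the
old node already stopped.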
Decommissioning and adding nodes will shuffle the token ring, which
could lead to unnecessary streaming activity and excessive disk space
usage. Although the disk space can be reclaimed by "nodetool cleanup",
you may find yourself needing to run it frequently while moving nodes in
order to avoid running out of disk space on other nodes.
See
https://cassandra.apache.org/doc/3.11/cassandra/operating/topo_changes.html#replacing-a-dead-node
for the process of replacing a dead node. Tip: shut down a live
node to turn it into a dead node.
See also
https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsReplaceNode.html
for a detailed description of the process, but keep in mind that this is
written for an older version and uses "-Dcassandra.replace_address"
instead of the new "-Dcassandra.replace_address_first_boot" option. Both
will work, but the new option is generally preferred, because it
minimizes the risk of messing up the cluster if you forget to remove it
after the node has fully joined the cluster.
On 11/04/2022 11:43, Paul Chandler wrote:
I would recommend creating a second Cassandra datacenter for the
cluster rather than adding single nodes to the same DC, which is likely
to cause latency issues because quorum queries would go across
datacenters.
We did this several times when moving from Rackspace to GCP; it is all
documented in three blog posts, starting here:
https://www.redshots.com/moving-cassandra-clusters-without-downtime-part-1/
If you have any further questions let me know.
Thanks
Paul
On 11 Apr 2022, at 10:52, Germain MAURICE
<germain.maur...@oscaro.com> wrote:
Hello,
In my company we are working on migrating our Cassandra cluster from
one provider to another. We plan to migrate the data by adding a new
node and decommissioning an old one.
We would like to throttle the bandwidth used between the two
providers to preserve the capacity of the link.
We would like to confirm that the migration process and the
throughput throttling we plan are the right ones.
The plan is the following:
* installing a new node on GCP
* setting streamthroughput on each on-premise node (3 nodes), which
will ensure we don't use more than 3 * streamthroughput of
bandwidth on the link between the two providers
* launching `nodetool decommission` on an on-premise node
* waiting for the end of the decommission
Is that right?
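The bandwidth bound in the plan above is simple arithmetic: if each of
the three on-premise nodes is capped at the same stream throughput, the
link carries at most three times that cap. A minimal sketch, assuming a
hypothetical link budget of 300 megabits per second (not a figure from
this thread) and assuming the value given to `nodetool
setstreamthroughput` is in megabits per second, which is worth checking
against the documentation for the version in use:

```shell
# Assumed figures for illustration only.
LINK_BUDGET_MBPS=300   # share of the inter-provider link reserved for streaming
STREAMING_NODES=3      # on-premise nodes that may stream at the same time

# Per-node cap so that STREAMING_NODES * cap stays within the budget.
PER_NODE_MBPS=$(( LINK_BUDGET_MBPS / STREAMING_NODES ))

# Print the command to run on each on-premise node (not executed here).
echo "nodetool setstreamthroughput ${PER_NODE_MBPS}"
```

With these numbers the printed command is `nodetool setstreamthroughput
100`; `nodetool getstreamthroughput` shows the value currently in effect
on a node.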
Thank you for your answer.