Re: Cassandra migration process

Bowen Song Mon, 11 Apr 2022 11:06:55 -0700

Paul is right. It's generally better to set up a new DC and then decommission the existing DC.

However, if the network latency is not a concern, and the cost of running two DCs in parallel is prohibitively high, you could do node-by-node replacement, assuming the settings in the cassandra.yaml are compatible. Pay attention to endpoint_snitch.

If you are going to do this, it's better to use "-Dcassandra.replace_address_first_boot=..." instead of repeatedly decommissioning and adding nodes. Note: this requires the same num_tokens on the old and the replacement node.
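
As a rough illustration of the compatibility check described above, here is a minimal Python sketch that compares num_tokens and endpoint_snitch between the old node's and the replacement node's cassandra.yaml. The function names are hypothetical, and it assumes both settings appear as plain top-level "key: value" lines; it uses naive line parsing rather than a YAML library, so treat it as a sanity check only.

```python
# Hypothetical helper, not part of Cassandra: compare a few top-level
# cassandra.yaml settings between an old node and its replacement.
KEYS_TO_COMPARE = ("num_tokens", "endpoint_snitch")


def read_top_level_settings(yaml_text):
    """Collect simple top-level 'key: value' pairs from cassandra.yaml text."""
    settings = {}
    for line in yaml_text.splitlines():
        # Skip blank lines, comments, list items and indented (nested) lines.
        if not line or line[0] in " \t#-":
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # Drop a trailing inline comment, if any.
        settings[key.strip()] = value.split(" #", 1)[0].strip()
    return settings


def find_mismatches(old_text, new_text, keys=KEYS_TO_COMPARE):
    """Return {key: (old_value, new_value)} for every key that differs."""
    old = read_top_level_settings(old_text)
    new = read_top_level_settings(new_text)
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


if __name__ == "__main__":
    # Made-up sample contents for illustration only.
    old_yaml = "num_tokens: 256\nendpoint_snitch: GossipingPropertyFileSnitch\n"
    new_yaml = "num_tokens: 16\nendpoint_snitch: GossipingPropertyFileSnitch\n"
    for key, (old_value, new_value) in find_mismatches(old_yaml, new_yaml).items():
        print(f"{key}: old={old_value} new={new_value}")
```

An empty result means the two files agree on the checked keys; it does not prove that the rest of the configuration is compatible.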

Decommissioning and adding nodes will shuffle the token ring, which could lead to unnecessary streaming activity and excessive disk space usage. Although the disk space can be reclaimed by "nodetool cleanup", you may find yourself needing to run it frequently while moving nodes in order to avoid running out of disk space on other nodes.

See https://cassandra.apache.org/doc/3.11/cassandra/operating/topo_changes.html#replacing-a-dead-node for the process of replacing a dead node. Tip: shut down a live node to turn it into a dead node.

See also https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsReplaceNode.html for a detailed description of the process, but keep in mind that this is written for an older version and uses "-Dcassandra.replace_address" instead of the new "-Dcassandra.replace_address_first_boot" option. Both will work, but the new option is generally preferred, because it minimizes the risk of messing up the cluster if you forget to remove it after the node has fully joined the cluster.


On 11/04/2022 11:43, Paul Chandler wrote:

I would recommend creating a second Cassandra datacenter for the cluster rather than adding single nodes in the same DC. Adding single nodes in the same DC is likely to cause latency issues, because quorum queries would have to cross datacenters.
We did this several times, moving from Rackspace to GCP. This is all documented in 3 blog posts starting here: https://www.redshots.com/moving-cassandra-clusters-without-downtime-part-1/
If you have any further questions let me know.

Thanks

Paul
On 11 Apr 2022, at 10:52, Germain MAURICE <germain.maur...@oscaro.com> wrote:
Hello,
In my company we are working on migrating our Cassandra cluster from one provider to another. We plan to migrate the data by adding a node and decommissioning an old one. We would like to throttle the bandwidth used between the two providers to preserve the capacity of the link. We would like to confirm whether the migration process and throughput throttling we plan are the right ones.
The plan is as follows:

  * installing a new node on GCP
  * setting streamthroughput on each on-premise node (3 nodes), which
    will ensure we don't use more than 3 * streamthroughput of
    bandwidth on the link between the two providers
  * launching `nodetool decommission` on an on-premise node
  * waiting for the end of the decommission
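
The bandwidth arithmetic in the plan above can be sketched in Python as follows. This is only an illustration: the function names and the 200 Mb/s link budget are hypothetical, and it assumes the per-node cap is set with `nodetool setstreamthroughput`, which in the 3.x line takes a value in megabits per second.

```python
# Hypothetical helpers for the "3 * streamthroughput" reasoning above.

def per_node_stream_throughput(link_budget_mbps, streaming_nodes):
    """Largest whole per-node cap (Mb/s) that keeps the total within budget."""
    if streaming_nodes <= 0:
        raise ValueError("streaming_nodes must be positive")
    return link_budget_mbps // streaming_nodes


def worst_case_link_usage(per_node_mbps, streaming_nodes):
    """Upper bound on link usage if every node streams at its cap at once."""
    return per_node_mbps * streaming_nodes


if __name__ == "__main__":
    budget_mbps = 200      # assumed share of the inter-provider link, in Mb/s
    on_premise_nodes = 3   # from the plan above
    cap = per_node_stream_throughput(budget_mbps, on_premise_nodes)
    total = worst_case_link_usage(cap, on_premise_nodes)
    print(f"nodetool setstreamthroughput {cap}  # on each on-premise node")
    print(f"worst-case link usage: {total} Mb/s")
```

With these assumed numbers, each node would be capped at 66 Mb/s, for at most 198 Mb/s across the link. The bound only covers streaming traffic; regular client and replication traffic between the providers is not limited by this setting.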

Is that right?
Thank you for your answer.
