Replacing nodes one by one in the existing DC is not the same as replacing an entire DC.

For example, suppose you change from 256 vnodes to 4 vnodes on a 100-node single-DC cluster. Before you start, each node owns ~1% of the cluster's data. But after converting 99 nodes, the last remaining node will own ~39% of the cluster's data, because ownership is roughly proportional to a node's share of the total token count: 256 tokens out of 99 * 4 + 256 = 652. Will that node have enough storage and compute capacity to handle that? Unless you have significantly over-provisioned the nodes, the answer is almost certainly no. The way to work around this is to reduce the vnode count gradually. E.g. reducing from 256 to 128 only requires the last node to have 2x the capacity, which is much more doable than 39x. Done this way, you would repeat the whole process to reduce the vnode count from 256 to 128, then to 64, 32, 16, 8 and finally 4.
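To make the arithmetic concrete, here is a minimal Python sketch (the function is made up for illustration; it assumes data ownership is proportional to a node's share of the total vnode count, which is roughly true with random token allocation):

    def last_node_share(nodes=100, old_vnodes=256, new_vnodes=4):
        """Fraction of data owned by the last unconverted node once the
        other (nodes - 1) nodes have been switched to new_vnodes."""
        total = (nodes - 1) * new_vnodes + old_vnodes
        return old_vnodes / total

    print(last_node_share(new_vnodes=4))    # ~0.39, i.e. ~39% of the data
    print(last_node_share(new_vnodes=128))  # ~0.02, i.e. ~2x the ~1% baseline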

So the most significant difference is how many times the data needs to be moved.
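For comparison, the new-DC approach described in the quoted message below boils down to roughly these steps (a sketch only; the keyspace and DC names are placeholders to adapt to your cluster):

    # cassandra.yaml on every node of the new DC, before first start:
    #   num_tokens: 4
    #   auto_bootstrap: false

    # In cqlsh, extend replication to the new DC:
    #   ALTER KEYSPACE my_ks WITH replication =
    #     {'class': 'NetworkTopologyStrategy', 'dc_old': 3, 'dc_new': 3};

    # On each node in the new DC, stream the data from the existing DC:
    nodetool rebuild -- dc_old

    # After switching clients over and removing 'dc_old' from the
    # keyspace replication, retire each old node:
    nodetool decommission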


On 16/05/2024 15:54, Gábor Auth wrote:
Hi,

On Thu, 16 May 2024, 10:37 Bowen Song via user, <user@cassandra.apache.org> wrote:

    You can also add a new DC with the desired number of nodes and
    num_tokens on each node with auto bootstrap disabled, then rebuild
    the new DC from the existing DC before decommissioning the existing
    DC. This method only needs to copy the data once, and can copy
    from/to multiple nodes concurrently, so it is significantly faster,
    at the cost of temporarily doubling the number of nodes.

For me, replacing the nodes one by one in the same DC is easier, since it doesn't require any new technique... :)

Thanks,
Gábor AUTH
