Re: Change num_tokens in a live cluster
Hi,

On Thu, 16 May 2024, 17:40 Bowen Song via user wrote:

> Replacing nodes one by one in the existing DC is not the same as
> replacing an entire DC.
>
> For example, if you change from 256 vnodes to 4 vnodes on a 100-node
> single-DC cluster: before you start, each node owns ~1% of the
> cluster's data, but after changing 99 nodes, the last remaining node
> will own ~39% of the cluster's data. Will that node have enough
> storage and computing capacity to handle that? Unless you have
> significantly over-provisioned the node size, the answer is
> definitely no. The way to work around this is to reduce the vnodes
> number gradually. E.g. reducing from 256 to 128 only requires the
> last node to have 2x the capacity, which is much more doable than
> 39x. To do it this way, you will need to repeat the process to
> reduce the vnodes number from 256 to 128, then to 64, 32, 16, 8 and
> finally 4.
>
> So, the most significant difference is: how many times does the
> data need to be moved?

Thank you for the explanation; it will help others think about this when they search for how to change num_tokens... :)

I am aware of it, but in my current case there are only 4 nodes, with a total of maybe ~25 GB of data. So creating a new DC is more hassle for me than replacing the nodes one by one. My question was whether there is a simpler solution, and it looks like there is none... :(

Bye,
Gábor AUTH
Re: Change num_tokens in a live cluster
Replacing nodes one by one in the existing DC is not the same as replacing an entire DC.

For example, if you change from 256 vnodes to 4 vnodes on a 100-node single-DC cluster: before you start, each node owns ~1% of the cluster's data, but after changing 99 nodes, the last remaining node will own ~39% of the cluster's data. Will that node have enough storage and computing capacity to handle that? Unless you have significantly over-provisioned the node size, the answer is definitely no. The way to work around this is to reduce the vnodes number gradually. E.g. reducing from 256 to 128 only requires the last node to have 2x the capacity, which is much more doable than 39x. To do it this way, you will need to repeat the process to reduce the vnodes number from 256 to 128, then to 64, 32, 16, 8 and finally 4.

So, the most significant difference is: how many times does the data need to be moved?

On 16/05/2024 15:54, Gábor Auth wrote:
> Hi,
>
> On Thu, 16 May 2024, 10:37 Bowen Song via user wrote:
>
>> You can also add a new DC with the desired number of nodes and
>> num_tokens on each node with auto bootstrap disabled, then rebuild
>> the new DC from the existing DC before decommissioning the
>> existing DC. This method only needs to copy the data once, and
>> can copy from/to multiple nodes concurrently, so it is
>> significantly faster, at the cost of temporarily doubling the
>> number of nodes.
>
> For me, replacing the nodes one by one in the same DC is easier,
> since it doesn't require any new technique... :)
>
> Thanks,
> Gábor AUTH
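[The arithmetic behind the ~39x and ~2x figures above can be checked directly: with vnodes, a node's expected ownership is roughly proportional to its share of the cluster's total token count. A quick sketch in plain Python, using the numbers from the example:]

```python
def last_node_ownership(nodes: int, old_vnodes: int, new_vnodes: int) -> float:
    """Approximate data ownership of the one remaining old node after
    the other (nodes - 1) nodes have been rebuilt with new_vnodes tokens.
    Ownership is roughly proportional to the node's share of all tokens."""
    total_tokens = (nodes - 1) * new_vnodes + old_vnodes
    return old_vnodes / total_tokens

# 100-node DC, going straight from 256 vnodes to 4:
print(last_node_ownership(100, 256, 4))    # ~0.39, i.e. the last node owns ~39%

# Halving instead, 256 -> 128:
print(last_node_ownership(100, 256, 128))  # ~0.02, i.e. ~2x the original ~1%
```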
Re: Change num_tokens in a live cluster
Hi,

On Thu, 16 May 2024, 16:55 Jon Haddad wrote:

> Unless your cluster is very small, using the method of adding /
> removing nodes will eventually result in putting a much larger
> portion of your dataset on a small number of nodes. I *highly*
> discourage this.

It has ~15 GB of data on each node and only 4 nodes, so I'd call it very small. :)

Bye,
Gábor AUTH
Re: Change num_tokens in a live cluster
Hi,

On Thu, 16 May 2024, 10:37 Bowen Song via user wrote:

> You can also add a new DC with the desired number of nodes and
> num_tokens on each node with auto bootstrap disabled, then rebuild
> the new DC from the existing DC before decommissioning the existing
> DC. This method only needs to copy the data once, and can copy
> from/to multiple nodes concurrently, so it is significantly faster,
> at the cost of temporarily doubling the number of nodes.

For me, replacing the nodes one by one in the same DC is easier, since it doesn't require any new technique... :)

Thanks,
Gábor AUTH
Re: Change num_tokens in a live cluster
Unless your cluster is very small, using the method of adding / removing nodes will eventually result in putting a much larger portion of your dataset on a small number of nodes. I *highly* discourage this.

The only correct, safe path is Bowen's suggestion of adding another DC and decommissioning the old one.

Jon

On Thu, May 16, 2024 at 1:37 AM Bowen Song via user <user@cassandra.apache.org> wrote:

> You can also add a new DC with the desired number of nodes and
> num_tokens on each node with auto bootstrap disabled, then rebuild
> the new DC from the existing DC before decommissioning the existing
> DC. This method only needs to copy the data once, and can copy
> from/to multiple nodes concurrently, so it is significantly faster,
> at the cost of temporarily doubling the number of nodes.
>
> On 16/05/2024 09:21, Gábor Auth wrote:
>> Hi.
>>
>> Is there a newer/easier workflow to change num_tokens in an
>> existing cluster than adding a new node to the cluster with the
>> other num_tokens value and decommissioning an old one, rinse and
>> repeat through all the nodes?
>>
>> --
>> Bye,
>> Gábor AUTH
Re: Change num_tokens in a live cluster
You can also add a new DC with the desired number of nodes and num_tokens on each node with auto bootstrap disabled, then rebuild the new DC from the existing DC before decommissioning the existing DC. This method only needs to copy the data once, and can copy from/to multiple nodes concurrently, so it is significantly faster, at the cost of temporarily doubling the number of nodes.

On 16/05/2024 09:21, Gábor Auth wrote:
> Hi.
>
> Is there a newer/easier workflow to change num_tokens in an existing
> cluster than adding a new node to the cluster with the other
> num_tokens value and decommissioning an old one, rinse and repeat
> through all the nodes?
>
> --
> Bye,
> Gábor AUTH
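[For reference, a command-line sketch of the new-DC procedure described above. The DC names (`dc_old`, `dc_new`), keyspace name, and replication factors are placeholders, not from the thread; adapt them to your cluster and snitch configuration:]

```shell
# On every node of the new DC, before first start (cassandra.yaml):
#   num_tokens: 4            <- the desired new value
#   auto_bootstrap: false    <- join the ring without streaming data yet
# and place the node in the new DC (e.g. dc=dc_new in
# cassandra-rackdc.properties with GossipingPropertyFileSnitch).

# Extend each keyspace's replication to cover the new DC:
cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {
  'class': 'NetworkTopologyStrategy', 'dc_old': 3, 'dc_new': 3};"

# On every node in the new DC, stream the existing data from the old DC:
nodetool rebuild -- dc_old

# Once clients talk to dc_new only, drop the old DC from replication:
cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {
  'class': 'NetworkTopologyStrategy', 'dc_new': 3};"

# Then decommission each node in the old DC, one at a time:
nodetool decommission
```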
Change num_tokens in a live cluster
Hi.

Is there a newer/easier workflow to change num_tokens in an existing cluster than adding a new node to the cluster with the other num_tokens value and decommissioning an old one, rinse and repeat through all the nodes?

--
Bye,
Gábor AUTH