Awesome, thank you so much! I completely missed the part about "the token
range that it hits will be split"; now everything makes sense!

Again, thanks a lot for your help!

Luca


On Wed, Jun 15, 2022 at 1:04 AM Hannu Kröger <hkro...@gmail.com> wrote:

> Adding a token (which in essence is a vnode) means that the token range
> that it hits will be split into two, and the part of that range which has
> a new owner will be replicated to the new owner node. If there are a lot
> of tokens (= vnodes) in the cluster, adding some number of vnodes (e.g.
> num_tokens=16) will affect that number (e.g. 16) of existing ranges, but
> because there are a lot of tokens, each range is relatively small and the
> work is distributed across the cluster.
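>
> A rough sketch of that idea in Python (purely illustrative; the toy ring
> and token values below are made up, this is not actual Cassandra code):
>
>     import bisect
>
>     # Simplified flat ring (ignoring wrap-around and replication): each
>     # token owns the range (previous_token, token].
>     ring = [0, 100, 200, 300]
>
>     def split_range(ring, new_token):
>         """Show which single existing range a new token splits."""
>         i = bisect.bisect_left(ring, new_token)
>         lo, hi = ring[i - 1], ring[i]      # the one range that is affected
>         moved = (lo, new_token)            # (lo, new_token] gets a new owner
>         kept = (new_token, hi)             # (new_token, hi] keeps its old owner
>         return moved, kept
>
>     print(split_range(ring, 150))
>     # ((100, 150), (150, 200)) -> only data in (100, 150] streams to the
>     # new owner; every other range in the ring is untouched.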
>
>
> A very naive example:
> The cluster has 100 nodes and 100GB of data with replication factor = 3,
> so 300GB of data altogether. Each node holds ~3GB. Say num_tokens is 256;
> the cluster then has 256 * 100 = 25,600 tokens altogether. You add one
> more node and, imagining the tokens are perfectly distributed, each node
> will eventually hold 300GB / 101 ≈ 2.97GB of data.
>
> When the new node is joining, its 256 tokens are (hopefully) distributed
> evenly, so each of the 100 existing nodes replicates ~0.03GB of data to
> the new node until it eventually holds that 2.97GB. After the scale-out
> the cluster has 25,856 tokens, and only 256 existing token ranges were
> changed by the join, not all 25,600.
>
> So you see it’s only ~30MB for each existing node to replicate to the new
> node. Not very expensive, right?
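>
> The same arithmetic written out (a back-of-the-envelope sketch assuming a
> perfectly even token distribution):
>
>     nodes, data_gb, rf, num_tokens = 100, 100, 3, 256
>
>     total_gb = data_gb * rf                    # 300 GB stored in the cluster
>     per_node_before = total_gb / nodes         # ~3.00 GB per node
>     per_node_after = total_gb / (nodes + 1)    # ~2.97 GB per node after the join
>
>     tokens_before = num_tokens * nodes         # 25,600 tokens
>     tokens_after = tokens_before + num_tokens  # 25,856 tokens; only 256 ranges change
>
>     streamed_per_node = per_node_after / nodes # ~0.03 GB (~30 MB) from each old node
>     print(per_node_after, tokens_after, streamed_per_node)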
>
> In real life it’s not so precise, but the basic idea is the same.
>
> Cheers,
> Hannu
>
> On 15. Jun 2022, at 10.32, Luca Rondanini <luca.rondan...@gmail.com>
> wrote:
>
> Thanks a lot Hannu,
>
> Really helpful! But isn't that crazy expensive? Adding a vnode means that
> every vnode in the cluster will have a different range of tokens, which
> means a lot of data will need to be moved around.
>
> Thanks again,
> Luca
>
>
>
> On Wed, Jun 15, 2022 at 12:25 AM Hannu Kröger <hkro...@gmail.com> wrote:
>
>> When a node joins a cluster, it gets (semi-)random tokens based on the
>> num_tokens value.
>>
>> The total number of vnodes is not fixed. I don’t remember off the top of
>> my head whether num_tokens can be different on each node, but whenever
>> you add a node, new vnodes get “created”. Existing token ranges are
>> split, some ranges are allocated to the new node, and data is replicated
>> to the joining node. So if you have num_tokens set to a higher value like
>> 16 or so, adding or removing a single node is a standard operation, and
>> although it causes some load on the cluster, that load should be somewhat
>> evenly distributed among the other nodes. If you have just a single token
>> per node, scaling up or down has a bit different effects due to balancing
>> issues, etc. So there is a reason why the default num_tokens is currently
>> 16.
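>>
>> A toy simulation of that effect (an illustrative sketch only: one node
>> joining a random ring, no replication, made-up helper names, not real
>> Cassandra behaviour):
>>
>>     import random
>>     from bisect import bisect_left
>>
>>     def join_streaming_sources(num_nodes, num_tokens, seed=1):
>>         """Which existing nodes stream data to one joining node, and how
>>         much of the ring (as a fraction) each of them streams."""
>>         rng = random.Random(seed)
>>         # Existing ring: (token, owning_node), tokens uniform in [0, 1).
>>         ring = sorted((rng.random(), n)
>>                       for n in range(num_nodes) for _ in range(num_tokens))
>>         old_tokens = [t for t, _ in ring]
>>         new_tokens = sorted(rng.random() for _ in range(num_tokens))
>>
>>         streamed = {}                        # existing node -> fraction moved
>>         prev = None
>>         for tok in new_tokens:
>>             i = bisect_left(old_tokens, tok)
>>             owner = ring[i % len(ring)][1]   # old owner of the split range
>>             lo = old_tokens[i - 1] if i > 0 else 0.0
>>             if prev is not None and prev > lo:
>>                 lo = prev                    # don't count the same data twice
>>             streamed[owner] = streamed.get(owner, 0.0) + (tok - lo)
>>             prev = tok
>>         return streamed
>>
>>     for t in (1, 16):
>>         s = join_streaming_sources(100, t)
>>         print(f"num_tokens={t}: {len(s)} streaming sources, "
>>               f"largest single share {max(s.values()):.4f} of the ring")
>>
>> With a single token per node, everything in this toy model streams from
>> just one existing node; with 16 tokens the load is spread across many
>> nodes.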
>>
>> Cheers,
>> Hannu
>>
>> On 15. Jun 2022, at 10.12, Luca Rondanini <luca.rondan...@gmail.com>
>> wrote:
>>
>> OK, that makes sense. But does the partitioner add vnodes? Is the number
>> of vnodes fixed in a cluster?
>>
>> On Wed, Jun 15, 2022 at 12:10 AM Hannu Kröger <hkro...@gmail.com> wrote:
>>
>>> Hey,
>>>
>>> num_tokens is the number of tokens per node.
>>>
>>> So in your case you would have 15 vnodes altogether.
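>>>
>>> With your numbers from the question below, for example:
>>>
>>>     num_tokens, nodes = 3, 5
>>>     print(num_tokens * nodes)   # 15 vnodes in the cluster, 3 per node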
>>>
>>> Cheers,
>>> Hannu
>>>
>>> > On 15. Jun 2022, at 10.08, Luca Rondanini <luca.rondan...@gmail.com>
>>> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I'm just trying to understand better how Cassandra works.
>>> >
>>> > My understanding is that, once set, the number of vnodes does not
>>> > change in a cluster. The partitioner allocates vnodes to nodes,
>>> > ensuring that replicas of the same data are not stored on the same node.
>>> >
>>> > But what happens if there are more nodes than vnodes? What if I set
>>> > num_tokens to 3 and I have 5 servers? Unless the partitioner adds
>>> > vnodes and moves data around, but that seems like an extremely
>>> > expensive operation. I'm sure I'm missing something; I'm just not sure
>>> > what! :)
>>> >
>>> > Thanks,
>>> > Luca
>>> >
>>>
>>>
>>
>
