You shouldn't need to change num_tokens at all.  num_tokens helps you
pretend your cluster is bigger than it is and randomly selects tokens for
you so that your data is approximately evenly distributed. As you add more
hosts, it should balance out automatically.
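
To make the balancing effect concrete, here is a rough sketch, not how
Cassandra itself allocates tokens. It assumes purely random token selection
on a ring of 2**64 positions (the size of the Murmur3Partitioner range), and
the function name `ownership_spread` is made up for illustration. It reports
the smallest and largest share of the ring owned by any node:

```python
import random

RING_SIZE = 2**64  # assumed ring size (Murmur3Partitioner token range)


def ownership_spread(num_nodes, num_tokens, seed=0):
    """Return (min, max) fraction of the ring owned by any single node.

    Illustrative model only: every node picks num_tokens random positions.
    """
    rng = random.Random(seed)
    tokens = []
    for node in range(num_nodes):
        for _ in range(num_tokens):
            tokens.append((rng.randrange(RING_SIZE), node))
    tokens.sort()

    owned = [0] * num_nodes
    # Each token owns the range from the previous token up to itself;
    # the first token wraps around from the last one.
    prev = tokens[-1][0] - RING_SIZE
    for position, node in tokens:
        owned[node] += position - prev
        prev = position

    fractions = [o / RING_SIZE for o in owned]
    return min(fractions), max(fractions)


if __name__ == "__main__":
    for n in (1, 16, 256):
        low, high = ownership_spread(num_nodes=12, num_tokens=n)
        print(f"num_tokens={n}: min share {low:.3f}, max share {high:.3f}")
```

With 12 nodes the ideal share is 1/12 (about 0.083) per node. In this model
the gap between the smallest and largest share narrows as num_tokens grows.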

The alternative to num_tokens is to use a single token per node, explicitly
calculate it each time to ensure the cluster is properly balanced, and then
use `nodetool move` each time you add hosts to the cluster to
redistribute load. num_tokens makes it less likely that you end up
imbalanced, so you shouldn't need to move any tokens manually.
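
For comparison, the manual single-token calculation usually looks something
like the sketch below. It assumes the Murmur3Partitioner token range of
-2**63 to 2**63 - 1; `balanced_tokens` is a hypothetical helper name, not
part of any Cassandra tool:

```python
def balanced_tokens(num_nodes):
    """Evenly spaced single tokens, one per node.

    Assumes the Murmur3Partitioner range [-2**63, 2**63 - 1].
    """
    step = 2**64 // num_nodes
    return [-(2**63) + i * step for i in range(num_nodes)]


if __name__ == "__main__":
    # Tokens for a 12-node ring, then for the same ring grown to 13 nodes.
    for size in (12, 13):
        print(size, balanced_tokens(size))
```

Comparing the two lists shows the cost of this approach: growing from 12 to
13 nodes changes the ideal token for nearly every existing node, and each of
those changes is a `nodetool move`.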



On Wed, Jun 15, 2022 at 12:34 AM Marc Hoppins <marc.hopp...@eset.com> wrote:

> Hi all,
>
> Say we have 2 datacentres with 12 nodes in each. All hardware is the same.
>
> 4-core, 2 x HDD (eg, 4TiB)
>
> num_tokens = 16 as a start point
>
> If a plan is to gradually increase the nodes per DC, and new hardware will
> have more of everything, especially storage, I assume I increase the
> num_tokens value.  Should I have started with a lower value?
>
> What would be considered as a good adjustment for:
>
> Any increase in number of HDD for any node?
>
> Any increase in capacity per HDD for any node?
>
> Is there any direct correlation between new token count and the
> proportional increase in either quantity of devices or total capacity, or
> is any adjustment purely arbitrary just to differentiate between varied
> nodes?
>
> Thanks
>
> M
>
