Hi,

Managing `initial_token` by yourself will give you more control over
scale-in and scale-out.
Let's say you have a three-node cluster with `num_tokens: 1`.

And your initial ring looks like this:

Datacenter: datacenter1
==========
Address    Rack   Status  State   Load       Owns    Token
                                                     3074457345618258602
127.0.0.1  rack1  Up      Normal  98.96 KiB  66.67%  -9223372036854775808
127.0.0.2  rack1  Up      Normal  98.96 KiB  66.67%  -3074457345618258603
127.0.0.3  rack1  Up      Normal  98.96 KiB  66.67%  3074457345618258602

Now let's say you want to scale out the cluster to twice the current
throughput (which means adding 3 more nodes).

If you are using AWS EBS volumes, you can use the same volumes and spin up
three more nodes, choosing the midpoints of the existing ranges as their
tokens, which means your new nodes already have the data.
Once you have mounted the volumes on your new nodes:
* Delete every system table except the schema-related tables.
* Generate the system/local table yourself, with `Bootstrap state` set to
completed and the same schema version as the existing nodes.
* Remove the extra data on all the machines using the cleanup command
(`nodetool cleanup`).
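
To make the midpoint selection concrete, here is a minimal Python sketch for
the three-node ring above. It assumes the Murmur3 token range of -2**63 to
2**63 - 1, and the helper name `midpoint_tokens` is made up for illustration
(it is not part of any Cassandra tooling). Like the rest of this idea, I have
not tested it against a real cluster.

```python
RING_SIZE = 2**64      # number of tokens on the ring (assumes Murmur3)
MAX_TOKEN = 2**63 - 1  # largest valid token

def midpoint_tokens(tokens):
    """Return one new token in the middle of every existing range."""
    ring = sorted(tokens)
    new_tokens = []
    # Pair each token with its successor; the last one wraps to the first.
    for start, end in zip(ring, ring[1:] + ring[:1]):
        # Width of the range (start, end], measured around the ring.
        width = (end - start) % RING_SIZE or RING_SIZE
        mid = start + width // 2
        if mid > MAX_TOKEN:  # wrapped past the end of the ring
            mid -= RING_SIZE
        new_tokens.append(mid)
    return new_tokens

existing = [-9223372036854775808, -3074457345618258603, 3074457345618258602]
new_tokens = midpoint_tokens(existing)
print(new_tokens)
# [-6148914691236517206, -1, 6148914691236517205]
```

Each printed value would be the `initial_token` of one of the three new nodes.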

This is how you can scale out a Cassandra cluster in minutes. If you want
to add nodes one by one, you need to write a small tool that always finds
the biggest range in the existing cluster and splits it in half.
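
Such a tool could look roughly like the Python sketch below. It is only a
sketch under assumptions: it assumes the Murmur3 token range of -2**63 to
2**63 - 1, takes the current tokens as a plain list (in practice you would
read them from the ring output), and the function name `next_token_to_add`
is hypothetical.

```python
RING_SIZE = 2**64      # number of tokens on the ring (assumes Murmur3)
MAX_TOKEN = 2**63 - 1  # largest valid token

def range_width(start, end):
    """Width of the range (start, end], measured around the ring."""
    return (end - start) % RING_SIZE or RING_SIZE

def next_token_to_add(tokens):
    """Find the biggest range in the ring and return its midpoint."""
    ring = sorted(tokens)
    # Pair each token with its successor; the last one wraps to the first.
    ranges = list(zip(ring, ring[1:] + ring[:1]))
    start, end = max(ranges, key=lambda r: range_width(*r))
    mid = start + range_width(start, end) // 2
    if mid > MAX_TOKEN:  # wrapped past the end of the ring
        mid -= RING_SIZE
    return mid

ring_tokens = [-9223372036854775808, -3074457345618258603, 3074457345618258602]
first = next_token_to_add(ring_tokens)
second = next_token_to_add(ring_tokens + [first])
print(first, second)
# 6148914691236517205 -6148914691236517206
```

When several ranges have the same width, this picks the first one in token
order, so repeated calls keep halving the ring evenly.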

I have never tested this thoroughly, but it should work conceptually. Here
we are taking advantage of the fact that we have the volumes (data) for the
new nodes beforehand, so we do not need to bootstrap them.

Thanks & Regards,
Varun Barala

On Tue, Oct 2, 2018 at 2:31 PM onmstester onmstester <onmstes...@zoho.com>
wrote:

> ---- On Mon, 01 Oct 2018 18:36:03 +0330 *Alain RODRIGUEZ
> <arodr...@gmail.com <arodr...@gmail.com>>* wrote ----
>
> Hello again :),
>
> I thought a little bit more about this question, and I was actually
> wondering if something like this would work:
>
> Imagine 3 node cluster, and create them using:
> For the 3 nodes: `num_tokens: 4`
> Node 1: `initial_token: -9223372036854775808, -4611686018427387905, -2,
> 4611686018427387901`
> Node 2: `initial_token: -7686143364045646507, -3074457345618258604,
> 1537228672809129299, 6148914691236517202`
> Node 3: `initial_token: -6148914691236517206, -1537228672809129303,
> 3074457345618258600, 7686143364045646503`
>
> If you know the initial size of your cluster, you can calculate the total
> number of tokens (number of nodes * vnodes) and use the formula/python
> code below to get the tokens. Then use the first token for the first node,
> move to the second node, use the second token and repeat. In my case there
> is a total of 12 tokens (3 nodes, 4 tokens each):
> ```
> >>> number_of_tokens = 12
> >>> [str(((2**64 // number_of_tokens) * i) - 2**63) for i in range(number_of_tokens)]
> ['-9223372036854775808', '-7686143364045646507', '-6148914691236517206',
> '-4611686018427387905', '-3074457345618258604', '-1537228672809129303',
> '-2', '1537228672809129299', '3074457345618258600', '4611686018427387901',
> '6148914691236517202', '7686143364045646503']
> ```
>
>
> Using manual initial_token (your idea), how could I add a new node to a
> long-running cluster (the procedure)?
>
>
