Re: Generating evenly distributed tokens for vnodes

2020-05-29 Thread Jon Haddad
I'm on mobile now, so I might be mistaken, but I don't think nodetool move
works with multiple tokens.

On Fri, May 29, 2020, 1:48 PM Kornel Pal  wrote:

> Hi Anthony,
>
> Thank you very much for looking into using the script for initial token
> generation and for providing multiple detailed methods of expanding the
> cluster.
>
> This helps a lot, indeed.
>
> Regards,
> Kornel
> Anthony Grasso wrote:
>
> Hi Kornel,
>
> Great use of the script for generating initial tokens! I agree that you
> can achieve an optimal token distribution in a cluster using such a method.
>
> One thing to think about is the process for expanding the size of the
> cluster in this case. For example consider the scenario where you wanted to
> insert a single new node into the cluster. To do this you would need to
> calculate what the new token ranges should be for the nodes including the
> new node. You would then need to reassign existing tokens to other nodes
> using 'nodetool move'. You would likely need to call this command a few
> times to do a few movements in order to achieve the newly calculated token
> assignments. Once the "gap" in the token ranges has been created, you would
> then update the initial_token property for the existing nodes in the
> cluster. Finally, you could then insert the new node with the assigned
> tokens.
>
> While the above process could be used to maintain an optimal token
> distribution in a cluster, it does increase operational overhead. This is
> where allocate_tokens_for_keyspace and
> allocate_tokens_for_local_replication_factor (4.0 only) play a critical
> role. They save the operational overhead when changing the size of the
> cluster. In addition, from my experience they do a pretty good job at
> keeping the token ranges evenly distributed when expanding the cluster,
> even in the case where a low number for num_tokens is used. If expanding
> the cluster size is required during an emergency, using the
> allocate_token_* setting would be the simplest and most reliable way to
> quickly insert a node while maintaining reasonable token distribution.
>
> The only other way to expand the cluster and maintain even token
> distribution without using an allocate_token_* setting is to double the
> size of the cluster each time. Obviously this has its own drawbacks in
> terms of increased costs in both money and time compared to inserting a
> single node.
>
> Hope this helps.
>
> Kind regards,
> Anthony
>
> On Thu, 28 May 2020 at 04:52, Kornel Pal  wrote:
>
>> As I understand, the previous discussion is about using
>> allocate_tokens_for_keyspace for allocating tokens for most of the
>> nodes. On the other hand, I am proposing to generate all the tokens for
>> all the nodes using a Python script.
>>
>> This seems to result in perfectly even token ownership distribution
>> across all the nodes for all possible replication factors, thus being an
>> improvement over using allocate_tokens_for_keyspace.
>>
>> Elliott Sims wrote:
>> > There's also a slightly older mailing list discussion on this subject
>> > that goes into detail on this sort of strategy:
>> > https://www.mail-archive.com/user@cassandra.apache.org/msg60006.html
>> >
>> > I've been approximately following it, repeating steps 3-6 for the first
>> > host in each "rack" (replica, since I have 3 racks and RF=3), then 8-10
>> > for the remaining hosts in the new datacenter. So far, so good (sample
>> > size of 1), but it's a pretty painstaking process.
>> >
>> > This should get a lot simpler with Cassandra 4+'s
>> > "allocate_tokens_for_local_replication_factor" option, which will
>> > default to 3.
>> >
>> > On Wed, May 27, 2020 at 4:34 AM Kornel Pal wrote:
>> >
>> > Hi,
>> >
>> > Generating ideal tokens for single-token datacenters is well
>> understood
>> > and documented, but there is much less information available on
>> > generating tokens with even ownership distribution when using
>> vnodes.
>> > The best description I could find on token generation for vnodes is
>> >
>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>> >
>> > While allocate_tokens_for_keyspace results in much more even
>> ownership
>> > distribution than random allocation, and does a great job at
>> balancing
>> > ownership when adding new nodes, using it for creating a new
>> datacenter
>> > results in less than ideal ownership distribution.
>> >
>> > After some experimentation, I found that it is possible to generate
>> all
>> > the tokens for a new datacenter with an extended version of the
>> Python
>> > script presented in the above blog post. Using these tokens seems to
>> > result in perfectly even ownership distribution with various
>> > token/node/rack configurations for all possible replication factors.
>> >
>> > Murmur3Partitioner:
>> >   >>> datacenter_offset = 0
>> >   >>> num_tokens = 4
>> >   >>> num_racks =

Re: Generating evenly distributed tokens for vnodes

2020-05-29 Thread Kornel Pal

Hi Anthony,

Thank you very much for looking into using the script for initial token 
generation and for providing multiple detailed methods of expanding the 
cluster.


This helps a lot, indeed.

Regards,
Kornel

Anthony Grasso wrote:

Hi Kornel,

Great use of the script for generating initial tokens! I agree that 
you can achieve an optimal token distribution in a cluster using such 
a method.


One thing to think about is the process for expanding the size of the 
cluster in this case. For example consider the scenario where you 
wanted to insert a single new node into the cluster. To do this you 
would need to calculate what the new token ranges should be for the 
nodes including the new node. You would then need to reassign existing 
tokens to other nodes using 'nodetool move'. You would likely need to 
call this command a few times to do a few movements in order to 
achieve the newly calculated token assignments. Once the "gap" in the 
token ranges has been created, you would then update the initial_token 
property for the existing nodes in the cluster. Finally, you could 
then insert the new node with the assigned tokens.


While the above process could be used to maintain an optimal token 
distribution in a cluster, it does increase operational overhead. This 
is where allocate_tokens_for_keyspace and 
allocate_tokens_for_local_replication_factor (4.0 only) play a 
critical role. They save the operational overhead when changing the 
size of the cluster. In addition, from my experience they do a pretty 
good job at keeping the token ranges evenly distributed when expanding 
the cluster, even in the case where a low number for num_tokens is 
used. If expanding the cluster size is required during an emergency, 
using the allocate_token_* setting would be the simplest and most 
reliable way to quickly insert a node while maintaining reasonable 
token distribution.


The only other way to expand the cluster and maintain even token 
distribution without using an allocate_token_* setting is to double 
the size of the cluster each time. Obviously this has its own drawbacks 
in terms of increased costs in both money and time compared to 
inserting a single node.


Hope this helps.

Kind regards,
Anthony

On Thu, 28 May 2020 at 04:52, Kornel Pal wrote:


As I understand, the previous discussion is about using
allocate_tokens_for_keyspace for allocating tokens for most of the
nodes. On the other hand, I am proposing to generate all the
tokens for
all the nodes using a Python script.

This seems to result in perfectly even token ownership distribution
across all the nodes for all possible replication factors, thus
being an
improvement over using allocate_tokens_for_keyspace.

Elliott Sims wrote:
> There's also a slightly older mailing list discussion on this
subject
> that goes into detail on this sort of strategy:
> https://www.mail-archive.com/user@cassandra.apache.org/msg60006.html
>
> I've been approximately following it, repeating steps 3-6 for
> the first host in each "rack" (replica, since I have 3 racks and
> RF=3), then 8-10 for the remaining hosts in the new datacenter.
> So far, so good (sample size of 1), but it's a pretty painstaking
> process.
>
> This should get a lot simpler with Cassandra 4+'s
> "allocate_tokens_for_local_replication_factor" option, which will
> default to 3.
>
> On Wed, May 27, 2020 at 4:34 AM Kornel Pal wrote:
>
>     Hi,
>
>     Generating ideal tokens for single-token datacenters is well
understood
>     and documented, but there is much less information available on
>     generating tokens with even ownership distribution when
using vnodes.
>     The best description I could find on token generation for
vnodes is
>

https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>
>     While allocate_tokens_for_keyspace results in much more even
ownership
>     distribution than random allocation, and does a great job at
balancing
>     ownership when adding new nodes, using it for creating a new
datacenter
>     results in less than ideal ownership distribution.
>
>     After some experimentation, I found that it is possible to
generate all
>     the tokens for a new datacenter with an extended version of
the Python
>     script presented in the above blog post. Using these tokens
seems to
>     result in perfectly even ownership distribution with various
>     token/node/rack configurations for all possible replication
factors.
>
>     Murmur3Partitioner:
>       >>> datacenter_offset = 0
>       >>> num_tokens = 4
>       >>> num_racks = 3
>       >>> num_nodes = 3
>    

Re: Generating evenly distributed tokens for vnodes

2020-05-28 Thread Anthony Grasso
Hi Kornel,

Great use of the script for generating initial tokens! I agree that you can
achieve an optimal token distribution in a cluster using such a method.

One thing to think about is the process for expanding the size of the
cluster in this case. For example consider the scenario where you wanted to
insert a single new node into the cluster. To do this you would need to
calculate what the new token ranges should be for the nodes including the
new node. You would then need to reassign existing tokens to other nodes
using 'nodetool move'. You would likely need to call this command a few
times to do a few movements in order to achieve the newly calculated token
assignments. Once the "gap" in the token ranges has been created, you would
then update the initial_token property for the existing nodes in the
cluster. Finally, you could then insert the new node with the assigned
tokens.
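
For illustration, the recalculation step could look something like the rough
Python sketch below. The function and variable names are just placeholders
rather than any Cassandra tooling, and for simplicity it reuses the symmetric
formula from this thread, which grows every rack by the same amount instead of
modelling a true single-node insert:

def ideal_tokens(num_tokens, num_racks, num_nodes, ring=2**64, shift=-2**63):
    # Ideal token set per (rack, node) using the formula from this thread.
    step = ring // (num_tokens * num_nodes * num_racks)
    return {(r, n): [step * (t * num_nodes * num_racks + n * num_racks + r) + shift
                     for t in range(num_tokens)]
            for r in range(num_racks) for n in range(num_nodes)}

before = ideal_tokens(num_tokens=4, num_racks=3, num_nodes=3)
after = ideal_tokens(num_tokens=4, num_racks=3, num_nodes=4)
# Tokens an existing node holds that are not in its new ideal set are the ones
# that would have to be moved (or re-declared via initial_token) before the
# new nodes join.
for node, toks in sorted(before.items()):
    to_move = sorted(set(toks) - set(after[node]))
    print("Rack #{}, Node #{}: {} of {} tokens would need to move".format(
        node[0] + 1, node[1] + 1, len(to_move), len(toks)))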

While the above process could be used to maintain an optimal token
distribution in a cluster, it does increase operational overhead. This is
where allocate_tokens_for_keyspace and
allocate_tokens_for_local_replication_factor (4.0 only) play a critical
role. They save the operational overhead when changing the size of the
cluster. In addition, from my experience they do a pretty good job at
keeping the token ranges evenly distributed when expanding the cluster,
even in the case where a low number for num_tokens is used. If expanding
the cluster size is required during an emergency, using the
allocate_token_* setting would be the simplest and most reliable way to
quickly insert a node while maintaining reasonable token distribution.
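
For reference, the relevant cassandra.yaml lines on a joining node might look
something like the rough sketch below (the keyspace name is only a
placeholder):

# Cassandra 3.x
num_tokens: 4
allocate_tokens_for_keyspace: my_keyspace

# Cassandra 4.0
num_tokens: 4
allocate_tokens_for_local_replication_factor: 3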

The only other way to expand the cluster and maintain even token
distribution without using an allocate_token_* setting is to double the
size of the cluster each time. Obviously this has its own drawbacks in
terms of increased costs in both money and time compared to inserting a
single node.

Hope this helps.

Kind regards,
Anthony

On Thu, 28 May 2020 at 04:52, Kornel Pal  wrote:

> As I understand, the previous discussion is about using
> allocate_tokens_for_keyspace for allocating tokens for most of the
> nodes. On the other hand, I am proposing to generate all the tokens for
> all the nodes using a Python script.
>
> This seems to result in perfectly even token ownership distribution
> across all the nodes for all possible replication factors, thus being an
> improvement over using allocate_tokens_for_keyspace.
>
> Elliott Sims wrote:
> > There's also a slightly older mailing list discussion on this subject
> > that goes into detail on this sort of strategy:
> > https://www.mail-archive.com/user@cassandra.apache.org/msg60006.html
> >
> > I've been approximately following it, repeating steps 3-6 for the first
> > host in each "rack" (replica, since I have 3 racks and RF=3), then 8-10
> > for the remaining hosts in the new datacenter. So far, so good (sample
> > size of 1), but it's a pretty painstaking process.
> >
> > This should get a lot simpler with Cassandra 4+'s
> > "allocate_tokens_for_local_replication_factor" option, which will
> > default to 3.
> >
> > On Wed, May 27, 2020 at 4:34 AM Kornel Pal wrote:
> >
> > Hi,
> >
> > Generating ideal tokens for single-token datacenters is well
> understood
> > and documented, but there is much less information available on
> > generating tokens with even ownership distribution when using vnodes.
> > The best description I could find on token generation for vnodes is
> >
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> >
> > While allocate_tokens_for_keyspace results in much more even
> ownership
> > distribution than random allocation, and does a great job at
> balancing
> > ownership when adding new nodes, using it for creating a new
> datacenter
> > results in less than ideal ownership distribution.
> >
> > After some experimentation, I found that it is possible to generate
> all
> > the tokens for a new datacenter with an extended version of the
> Python
> > script presented in the above blog post. Using these tokens seems to
> > result in perfectly even ownership distribution with various
> > token/node/rack configurations for all possible replication factors.
> >
> > Murmur3Partitioner:
> >   >>> datacenter_offset = 0
> >   >>> num_tokens = 4
> >   >>> num_racks = 3
> >   >>> num_nodes = 3
> >   >>> print "\n".join(['[Rack #{}, Node #{}] initial_token:
> > {}'.format(r
> > + 1, n + 1, ','.join([str(((2**64 / (num_tokens * num_nodes *
> > num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) -
> > 2**63 +
> > datacenter_offset) for t in range(num_tokens)])) for r in
> > range(num_racks) for n in range(num_nodes)])
> > [Rack #1, Node #1] initial_token:
> > -9223372036854775808,-4611686018427387908,-8,4611686018427387892
> > [Rack #1, Node #2] initia

Re: Generating evenly distributed tokens for vnodes

2020-05-27 Thread Kornel Pal
As I understand, the previous discussion is about using 
allocate_tokens_for_keyspace for allocating tokens for most of the 
nodes. On the other hand, I am proposing to generate all the tokens for 
all the nodes using a Python script.


This seems to result in perfectly even token ownership distribution 
across all the nodes for all possible replication factors, thus being an 
improvement over using allocate_tokens_for_keyspace.
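
As a rough sanity check of that claim, a small Python 3 sketch like the one 
below can compute each node's primary-range share of the ring. It only looks 
at primary ranges; effective ownership with replication and racks is what 
Cassandra itself reports:

RING = 2**64  # Murmur3 token space

def primary_ownership(tokens_by_node):
    # Sort all tokens around the ring; each token's owner gets the range
    # from the previous token (wrapping around) up to that token.
    ring = sorted((t, node) for node, toks in tokens_by_node.items() for t in toks)
    owned = {node: 0 for node in tokens_by_node}
    for (prev_tok, _), (tok, node) in zip(ring, ring[1:] + ring[:1]):
        owned[node] += (tok - prev_tok) % RING
    return {node: size / RING for node, size in owned.items()}

# Tokens for 4 vnodes, 3 racks and 3 nodes, generated with the same formula.
step = RING // (4 * 3 * 3)
tokens = {(r, n): [step * (t * 9 + n * 3 + r) - 2**63 for t in range(4)]
          for r in range(3) for n in range(3)}
for node, share in sorted(primary_ownership(tokens).items()):
    print(node, round(share, 6))  # every node should print ~0.111111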


Elliott Sims wrote:
There's also a slightly older mailing list discussion on this subject 
that goes into detail on this sort of strategy: 
https://www.mail-archive.com/user@cassandra.apache.org/msg60006.html


I've been approximately following it, repeating steps 3-6 for the first 
host in each "rack" (replica, since I have 3 racks and RF=3), then 8-10 for 
the remaining hosts in the new datacenter. So far, so good (sample size 
of 1), but it's a pretty painstaking process.


This should get a lot simpler with Cassandra 4+'s 
"allocate_tokens_for_local_replication_factor" option, which will 
default to 3.


On Wed, May 27, 2020 at 4:34 AM Kornel Pal wrote:


Hi,

Generating ideal tokens for single-token datacenters is well understood
and documented, but there is much less information available on
generating tokens with even ownership distribution when using vnodes.
The best description I could find on token generation for vnodes is

https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html

While allocate_tokens_for_keyspace results in much more even ownership
distribution than random allocation, and does a great job at balancing
ownership when adding new nodes, using it for creating a new datacenter
results in less than ideal ownership distribution.

After some experimentation, I found that it is possible to generate all
the tokens for a new datacenter with an extended version of the Python
script presented in the above blog post. Using these tokens seems to
result in perfectly even ownership distribution with various
token/node/rack configurations for all possible replication factors.

Murmur3Partitioner:
  >>> datacenter_offset = 0
  >>> num_tokens = 4
  >>> num_racks = 3
  >>> num_nodes = 3
  >>> print "\n".join(['[Rack #{}, Node #{}] initial_token:
{}'.format(r
+ 1, n + 1, ','.join([str(((2**64 / (num_tokens * num_nodes *
num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) -
2**63 +
datacenter_offset) for t in range(num_tokens)])) for r in
range(num_racks) for n in range(num_nodes)])
[Rack #1, Node #1] initial_token:
-9223372036854775808,-4611686018427387908,-8,4611686018427387892
[Rack #1, Node #2] initial_token:

-7686143364045646508,-3074457345618258608,1537228672809129292,6148914691236517192
[Rack #1, Node #3] initial_token:

-6148914691236517208,-1537228672809129308,3074457345618258592,7686143364045646492
[Rack #2, Node #1] initial_token:

-8710962479251732708,-4099276460824344808,512409557603043092,5124095576030430992
[Rack #2, Node #2] initial_token:

-7173733806442603408,-2562047788015215508,2049638230412172392,6661324248839560292
[Rack #2, Node #3] initial_token:

-5636505133633474108,-1024819115206086208,3586866903221301692,8198552921648689592
[Rack #3, Node #1] initial_token:

-8198552921648689608,-3586866903221301708,1024819115206086192,5636505133633474092
[Rack #3, Node #2] initial_token:

-6661324248839560308,-2049638230412172408,2562047788015215492,7173733806442603392
[Rack #3, Node #3] initial_token:

-5124095576030431008,-512409557603043108,4099276460824344792,8710962479251732692

RandomPartitioner:
  >>> datacenter_offset = 0
  >>> num_tokens = 4
  >>> num_racks = 3
  >>> num_nodes = 3
  >>> print "\n".join(['[Rack #{}, Node #{}] initial_token:
{}'.format(r
+ 1, n + 1, ','.join([str(((2**127 / (num_tokens * num_nodes *
num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) +
datacenter_offset) for t in range(num_tokens)])) for r in
range(num_racks) for n in range(num_nodes)])
[Rack #1, Node #1] initial_token:

0,42535295865117307932921825928971026427,85070591730234615865843651857942052854,127605887595351923798765477786913079281
[Rack #1, Node #2] initial_token:

14178431955039102644307275309657008809,56713727820156410577229101238628035236,99249023685273718510150927167599061663,141784319550391026443072753096570088090
[Rack #1, Node #3] initial_token:

28356863910078205288614550619314017618,70892159775195513221536376548285044045,113427455640312821154458202477256070472,155962751505430129087380028406227096899
[Rack #2, Node #1] initial_token:

4726143985013034214769091769885669603,47261439850130342147690917698856696030,89796735715247650080612743627827722457,132332031580364958013534569556798748884
[Rack #2, Node #2] initial_toke

Re: Generating evenly distributed tokens for vnodes

2020-05-27 Thread Elliott Sims
There's also a slightly older mailing list discussion on this subject that
goes into detail on this sort of strategy:
https://www.mail-archive.com/user@cassandra.apache.org/msg60006.html

I've been approximately following it, repeating steps 3-6 for the first
host in each "rack" (replica, since I have 3 racks and RF=3), then 8-10 for
the remaining hosts in the new datacenter. So far, so good (sample size of
1), but it's a pretty painstaking process.

This should get a lot simpler with Cassandra 4+'s
"allocate_tokens_for_local_replication_factor" option, which will default
to 3.

On Wed, May 27, 2020 at 4:34 AM Kornel Pal  wrote:

> Hi,
>
> Generating ideal tokens for single-token datacenters is well understood
> and documented, but there is much less information available on
> generating tokens with even ownership distribution when using vnodes.
> The best description I could find on token generation for vnodes is
>
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>
> While allocate_tokens_for_keyspace results in much more even ownership
> distribution than random allocation, and does a great job at balancing
> ownership when adding new nodes, using it for creating a new datacenter
> results in less than ideal ownership distribution.
>
> After some experimentation, I found that it is possible to generate all
> the tokens for a new datacenter with an extended version of the Python
> script presented in the above blog post. Using these tokens seems to
> result in perfectly even ownership distribution with various
> token/node/rack configurations for all possible replication factors.
>
> Murmur3Partitioner:
>  >>> datacenter_offset = 0
>  >>> num_tokens = 4
>  >>> num_racks = 3
>  >>> num_nodes = 3
>  >>> print "\n".join(['[Rack #{}, Node #{}] initial_token: {}'.format(r
> + 1, n + 1, ','.join([str(((2**64 / (num_tokens * num_nodes *
> num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) - 2**63 +
> datacenter_offset) for t in range(num_tokens)])) for r in
> range(num_racks) for n in range(num_nodes)])
> [Rack #1, Node #1] initial_token:
> -9223372036854775808,-4611686018427387908,-8,4611686018427387892
> [Rack #1, Node #2] initial_token:
>
> -7686143364045646508,-3074457345618258608,1537228672809129292,6148914691236517192
> [Rack #1, Node #3] initial_token:
>
> -6148914691236517208,-1537228672809129308,3074457345618258592,7686143364045646492
> [Rack #2, Node #1] initial_token:
>
> -8710962479251732708,-4099276460824344808,512409557603043092,5124095576030430992
> [Rack #2, Node #2] initial_token:
>
> -7173733806442603408,-2562047788015215508,2049638230412172392,6661324248839560292
> [Rack #2, Node #3] initial_token:
>
> -5636505133633474108,-1024819115206086208,3586866903221301692,8198552921648689592
> [Rack #3, Node #1] initial_token:
>
> -8198552921648689608,-3586866903221301708,1024819115206086192,5636505133633474092
> [Rack #3, Node #2] initial_token:
>
> -6661324248839560308,-2049638230412172408,2562047788015215492,7173733806442603392
> [Rack #3, Node #3] initial_token:
>
> -5124095576030431008,-512409557603043108,4099276460824344792,8710962479251732692
>
> RandomPartitioner:
>  >>> datacenter_offset = 0
>  >>> num_tokens = 4
>  >>> num_racks = 3
>  >>> num_nodes = 3
>  >>> print "\n".join(['[Rack #{}, Node #{}] initial_token: {}'.format(r
> + 1, n + 1, ','.join([str(((2**127 / (num_tokens * num_nodes *
> num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) +
> datacenter_offset) for t in range(num_tokens)])) for r in
> range(num_racks) for n in range(num_nodes)])
> [Rack #1, Node #1] initial_token:
>
> 0,42535295865117307932921825928971026427,85070591730234615865843651857942052854,127605887595351923798765477786913079281
> [Rack #1, Node #2] initial_token:
>
> 14178431955039102644307275309657008809,56713727820156410577229101238628035236,99249023685273718510150927167599061663,141784319550391026443072753096570088090
> [Rack #1, Node #3] initial_token:
>
> 28356863910078205288614550619314017618,70892159775195513221536376548285044045,113427455640312821154458202477256070472,155962751505430129087380028406227096899
> [Rack #2, Node #1] initial_token:
>
> 4726143985013034214769091769885669603,47261439850130342147690917698856696030,89796735715247650080612743627827722457,132332031580364958013534569556798748884
> [Rack #2, Node #2] initial_token:
>
> 18904575940052136859076367079542678412,61439871805169444791998193008513704839,103975167670286752724920018937484731266,146510463535404060657841844866455757693
> [Rack #2, Node #3] initial_token:
>
> 33083007895091239503383642389199687221,75618303760208547436305468318170713648,118153599625325855369227294247141740075,160688895490443163302149120176112766502
> [Rack #3, Node #1] initial_token:
>
> 9452287970026068429538183539771339206,51987583835143376362460009468742365633,94522879700260684295381835397713392060,137058175565377992228303661326684418487
> [Rack #3, Node #2] initial_token:
>
> 236307199250651710738454

Generating evenly distributed tokens for vnodes

2020-05-27 Thread Kornel Pal

Hi,

Generating ideal tokens for single-token datacenters is well understood 
and documented, but there is much less information available on 
generating tokens with even ownership distribution when using vnodes. 
The best description I could find on token generation for vnodes is 
https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html


While allocate_tokens_for_keyspace results in much more even ownership 
distribution than random allocation, and does a great job at balancing 
ownership when adding new nodes, using it for creating a new datacenter 
results in less than ideal ownership distribution.


After some experimentation, I found that it is possible to generate all 
the tokens for a new datacenter with an extended version of the Python 
script presented in the above blog post. Using these tokens seems to 
result in perfectly even ownership distribution with various 
token/node/rack configurations for all possible replication factors.


Murmur3Partitioner:
>>> datacenter_offset = 0
>>> num_tokens = 4
>>> num_racks = 3
>>> num_nodes = 3
>>> print "\n".join(['[Rack #{}, Node #{}] initial_token: {}'.format(r 
+ 1, n + 1, ','.join([str(((2**64 / (num_tokens * num_nodes * 
num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) - 2**63 + 
datacenter_offset) for t in range(num_tokens)])) for r in 
range(num_racks) for n in range(num_nodes)])
[Rack #1, Node #1] initial_token: 
-9223372036854775808,-4611686018427387908,-8,4611686018427387892
[Rack #1, Node #2] initial_token: 
-7686143364045646508,-3074457345618258608,1537228672809129292,6148914691236517192
[Rack #1, Node #3] initial_token: 
-6148914691236517208,-1537228672809129308,3074457345618258592,7686143364045646492
[Rack #2, Node #1] initial_token: 
-8710962479251732708,-4099276460824344808,512409557603043092,5124095576030430992
[Rack #2, Node #2] initial_token: 
-7173733806442603408,-2562047788015215508,2049638230412172392,6661324248839560292
[Rack #2, Node #3] initial_token: 
-5636505133633474108,-1024819115206086208,3586866903221301692,8198552921648689592
[Rack #3, Node #1] initial_token: 
-8198552921648689608,-3586866903221301708,1024819115206086192,5636505133633474092
[Rack #3, Node #2] initial_token: 
-6661324248839560308,-2049638230412172408,2562047788015215492,7173733806442603392
[Rack #3, Node #3] initial_token: 
-5124095576030431008,-512409557603043108,4099276460824344792,8710962479251732692


RandomPartitioner:
>>> datacenter_offset = 0
>>> num_tokens = 4
>>> num_racks = 3
>>> num_nodes = 3
>>> print "\n".join(['[Rack #{}, Node #{}] initial_token: {}'.format(r 
+ 1, n + 1, ','.join([str(((2**127 / (num_tokens * num_nodes * 
num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) + 
datacenter_offset) for t in range(num_tokens)])) for r in 
range(num_racks) for n in range(num_nodes)])
[Rack #1, Node #1] initial_token: 
0,42535295865117307932921825928971026427,85070591730234615865843651857942052854,127605887595351923798765477786913079281
[Rack #1, Node #2] initial_token: 
14178431955039102644307275309657008809,56713727820156410577229101238628035236,99249023685273718510150927167599061663,141784319550391026443072753096570088090
[Rack #1, Node #3] initial_token: 
28356863910078205288614550619314017618,70892159775195513221536376548285044045,113427455640312821154458202477256070472,155962751505430129087380028406227096899
[Rack #2, Node #1] initial_token: 
4726143985013034214769091769885669603,47261439850130342147690917698856696030,89796735715247650080612743627827722457,132332031580364958013534569556798748884
[Rack #2, Node #2] initial_token: 
18904575940052136859076367079542678412,61439871805169444791998193008513704839,103975167670286752724920018937484731266,146510463535404060657841844866455757693
[Rack #2, Node #3] initial_token: 
33083007895091239503383642389199687221,75618303760208547436305468318170713648,118153599625325855369227294247141740075,160688895490443163302149120176112766502
[Rack #3, Node #1] initial_token: 
9452287970026068429538183539771339206,51987583835143376362460009468742365633,94522879700260684295381835397713392060,137058175565377992228303661326684418487
[Rack #3, Node #2] initial_token: 
23630719925065171073845458849428348015,66166015790182479006767284778399374442,108701311655299786939689110707370400869,151236607520417094872610936636341427296
[Rack #3, Node #3] initial_token: 
37809151880104273718152734159085356824,80344447745221581651074560088056383251,122879743610338889583996386017027409678,165415039475456197516918211945998436105
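
For anyone running the snippet on Python 3, where print is a function and /
no longer floors integers, a more readable sketch of the same generator could
be:

def generate_tokens(num_tokens, num_racks, num_nodes,
                    ring_size=2**64, shift=-2**63, datacenter_offset=0):
    # ring_size=2**64 with shift=-2**63 matches Murmur3Partitioner;
    # ring_size=2**127 with shift=0 matches RandomPartitioner.
    step = ring_size // (num_tokens * num_nodes * num_racks)
    for r in range(num_racks):
        for n in range(num_nodes):
            tokens = [step * (t * num_nodes * num_racks + n * num_racks + r)
                      + shift + datacenter_offset for t in range(num_tokens)]
            print("[Rack #{}, Node #{}] initial_token: {}".format(
                r + 1, n + 1, ",".join(str(tok) for tok in tokens)))

generate_tokens(num_tokens=4, num_racks=3, num_nodes=3)
generate_tokens(num_tokens=4, num_racks=3, num_nodes=3, ring_size=2**127, shift=0)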


Could you please comment on whether this is a good approach for 
allocating tokens when using vnodes?


Thank you.

Regards,
Kornel


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org