I'm on mobile now so I might be mistaken, but I don't think nodetool move works with multiple tokens
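As far as I can tell it only accepts a single target token, e.g.:

    # relocates this node to one new token; single-token nodes only,
    # if I remember right
    nodetool move 4611686018427387892

so it may not be usable for reassigning individual vnode tokens at all.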
On Fri, May 29, 2020, 1:48 PM Kornel Pal <kornel...@gmail.com> wrote:

> Hi Anthony,
>
> Thank you very much for looking into using the script for initial token
> generation and for providing multiple detailed methods of expanding the
> cluster.
>
> This helps a lot, indeed.
>
> Regards,
> Kornel
>
> Anthony Grasso wrote:
>
> Hi Kornel,
>
> Great use of the script for generating initial tokens! I agree that you
> can achieve an optimal token distribution in a cluster using such a
> method.
>
> One thing to think about is the process for expanding the size of the
> cluster in this case. For example, consider the scenario where you want
> to insert a single new node into the cluster. To do this you would need
> to calculate what the new token ranges should be for all nodes,
> including the new one. You would then need to reassign existing tokens
> to other nodes using 'nodetool move', likely calling the command several
> times to achieve the newly calculated token assignments. Once the "gap"
> in the token ranges has been created, you would then update the
> initial_token property for the existing nodes in the cluster. Finally,
> you could insert the new node with its assigned tokens.
>
> While the above process could be used to maintain an optimal token
> distribution in a cluster, it does increase operational overhead. This
> is where allocate_tokens_for_keyspace and
> allocate_tokens_for_local_replication_factor (4.0 only) play a critical
> role. They save the operational overhead when changing the size of the
> cluster. In addition, in my experience they do a pretty good job of
> keeping the token ranges evenly distributed when expanding the cluster,
> even when a low value of num_tokens is used. If expanding the cluster
> is required during an emergency, using an allocate_tokens_* setting
> would be the simplest and most reliable way to quickly insert a node
> while maintaining a reasonable token distribution.
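(To make that concrete: both settings live in cassandra.yaml. A minimal
sketch, where the keyspace name "my_keyspace" is just a placeholder:

    # Cassandra 3.x: allocation is optimised for one keyspace's replication
    num_tokens: 4
    allocate_tokens_for_keyspace: my_keyspace

    # Cassandra 4.0 and later: no keyspace needed, only the local RF
    num_tokens: 4
    allocate_tokens_for_local_replication_factor: 3

Only one of the two allocate_tokens_* settings is used on a given node,
and it has to be in place before the node first bootstraps.)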
> The only other way to expand the cluster and maintain even token
> distribution without using an allocate_tokens_* setting is to double
> the size of the cluster each time. Obviously this has its own drawbacks
> in terms of increased cost, in both money and time, compared to
> inserting a single node.
>
> Hope this helps.
>
> Kind regards,
> Anthony
>
> On Thu, 28 May 2020 at 04:52, Kornel Pal <kornel...@gmail.com> wrote:
>
>> As I understand, the previous discussion is about using
>> allocate_tokens_for_keyspace for allocating tokens for most of the
>> nodes. On the other hand, I am proposing to generate all the tokens
>> for all the nodes using a Python script.
>>
>> This seems to result in perfectly even token ownership distribution
>> across all the nodes for all possible replication factors, thus being
>> an improvement over using allocate_tokens_for_keyspace.
>>
>> Elliott Sims wrote:
>>
>> > There's also a slightly older mailing list discussion on this
>> > subject that goes into detail on this sort of strategy:
>> > https://www.mail-archive.com/user@cassandra.apache.org/msg60006.html
>> >
>> > I've been approximately following it, repeating steps 3-6 for the
>> > first host in each "rack" (replica, since I have 3 racks and RF=3),
>> > then 8-10 for the remaining hosts in the new datacenter. So far, so
>> > good (sample size of 1), but it's a pretty painstaking process.
>> >
>> > This should get a lot simpler with Cassandra 4+'s
>> > "allocate_tokens_for_local_replication_factor" option, which will
>> > default to 3.
>> >
>> > On Wed, May 27, 2020 at 4:34 AM Kornel Pal <kornel...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > Generating ideal tokens for single-token datacenters is well
>> > understood and documented, but there is much less information
>> > available on generating tokens with even ownership distribution when
>> > using vnodes. The best description I could find on token generation
>> > for vnodes is
>> > https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>> >
>> > While allocate_tokens_for_keyspace results in much more even
>> > ownership distribution than random allocation, and does a great job
>> > of balancing ownership when adding new nodes, using it for creating
>> > a new datacenter results in a less than ideal ownership distribution.
>> >
>> > After some experimentation, I found that it is possible to generate
>> > all the tokens for a new datacenter with an extended version of the
>> > Python script presented in the above blog post. Using these tokens
>> > seems to result in perfectly even ownership distribution with
>> > various token/node/rack configurations for all possible replication
>> > factors.
>> >
>> > Murmur3Partitioner:
>> >
>> > >>> datacenter_offset = 0
>> > >>> num_tokens = 4
>> > >>> num_racks = 3
>> > >>> num_nodes = 3
>> > >>> print "\n".join(['[Rack #{}, Node #{}] initial_token: {}'.format(r + 1, n + 1, ','.join([str(((2**64 / (num_tokens * num_nodes * num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) - 2**63 + datacenter_offset) for t in range(num_tokens)])) for r in range(num_racks) for n in range(num_nodes)])
>> > [Rack #1, Node #1] initial_token: -9223372036854775808,-4611686018427387908,-8,4611686018427387892
>> > [Rack #1, Node #2] initial_token: -7686143364045646508,-3074457345618258608,1537228672809129292,6148914691236517192
>> > [Rack #1, Node #3] initial_token: -6148914691236517208,-1537228672809129308,3074457345618258592,7686143364045646492
>> > [Rack #2, Node #1] initial_token: -8710962479251732708,-4099276460824344808,512409557603043092,5124095576030430992
>> > [Rack #2, Node #2] initial_token: -7173733806442603408,-2562047788015215508,2049638230412172392,6661324248839560292
>> > [Rack #2, Node #3] initial_token: -5636505133633474108,-1024819115206086208,3586866903221301692,8198552921648689592
>> > [Rack #3, Node #1] initial_token: -8198552921648689608,-3586866903221301708,1024819115206086192,5636505133633474092
>> > [Rack #3, Node #2] initial_token: -6661324248839560308,-2049638230412172408,2562047788015215492,7173733806442603392
>> > [Rack #3, Node #3] initial_token: -5124095576030431008,-512409557603043108,4099276460824344792,8710962479251732692
>> >
>> > RandomPartitioner:
>> >
>> > >>> datacenter_offset = 0
>> > >>> num_tokens = 4
>> > >>> num_racks = 3
>> > >>> num_nodes = 3
>> > >>> print "\n".join(['[Rack #{}, Node #{}] initial_token: {}'.format(r + 1, n + 1, ','.join([str(((2**127 / (num_tokens * num_nodes * num_racks)) * (t * num_nodes * num_racks + n * num_racks + r)) + datacenter_offset) for t in range(num_tokens)])) for r in range(num_racks) for n in range(num_nodes)])
>> > [Rack #1, Node #1] initial_token: 0,42535295865117307932921825928971026427,85070591730234615865843651857942052854,127605887595351923798765477786913079281
>> > [Rack #1, Node #2] initial_token: 14178431955039102644307275309657008809,56713727820156410577229101238628035236,99249023685273718510150927167599061663,141784319550391026443072753096570088090
>> > [Rack #1, Node #3] initial_token: 28356863910078205288614550619314017618,70892159775195513221536376548285044045,113427455640312821154458202477256070472,155962751505430129087380028406227096899
>> > [Rack #2, Node #1] initial_token: 4726143985013034214769091769885669603,47261439850130342147690917698856696030,89796735715247650080612743627827722457,132332031580364958013534569556798748884
>> > [Rack #2, Node #2] initial_token: 18904575940052136859076367079542678412,61439871805169444791998193008513704839,103975167670286752724920018937484731266,146510463535404060657841844866455757693
>> > [Rack #2, Node #3] initial_token: 33083007895091239503383642389199687221,75618303760208547436305468318170713648,118153599625325855369227294247141740075,160688895490443163302149120176112766502
>> > [Rack #3, Node #1] initial_token: 9452287970026068429538183539771339206,51987583835143376362460009468742365633,94522879700260684295381835397713392060,137058175565377992228303661326684418487
>> > [Rack #3, Node #2] initial_token: 23630719925065171073845458849428348015,66166015790182479006767284778399374442,108701311655299786939689110707370400869,151236607520417094872610936636341427296
>> > [Rack #3, Node #3] initial_token: 37809151880104273718152734159085356824,80344447745221581651074560088056383251,122879743610338889583996386017027409678,165415039475456197516918211945998436105
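(Side note: the one-liners above are Python 2. A Python 3 version of the
same computation, unrolled for readability; this is my own sketch of
Kornel's formula, so treat it as illustrative:

    # Tokens are evenly spaced across the whole ring, interleaved so that
    # consecutive ring positions rotate through racks first, then nodes.
    def generate_tokens(ring_size, ring_offset, num_tokens=4, num_racks=3,
                        num_nodes=3, datacenter_offset=0):
        step = ring_size // (num_tokens * num_nodes * num_racks)
        for r in range(num_racks):
            for n in range(num_nodes):
                tokens = [step * (t * num_nodes * num_racks + n * num_racks + r)
                          + ring_offset + datacenter_offset
                          for t in range(num_tokens)]
                print('[Rack #{}, Node #{}] initial_token: {}'.format(
                    r + 1, n + 1, ','.join(map(str, tokens))))

    generate_tokens(2**64, -2**63)  # Murmur3Partitioner ring
    generate_tokens(2**127, 0)      # RandomPartitioner ring

The // keeps the arithmetic in integers, matching what Python 2's / did
on ints in the originals.)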
>> > Could you please comment on whether this is a good approach for
>> > allocating tokens when using vnodes?
>> >
>> > Thank you.
>> >
>> > Regards,
>> > Kornel
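For anyone wanting to try this: each generated line maps onto one node's
cassandra.yaml, with num_tokens matching the number of generated tokens.
For example, for Rack #1, Node #1 from the Murmur3Partitioner run above:

    num_tokens: 4
    initial_token: -9223372036854775808,-4611686018427387908,-8,4611686018427387892

initial_token accepts a comma-separated list when num_tokens is greater
than 1, and with explicit tokens no allocate_tokens_* setting is needed.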