Thanks Anthony! I will read more about it.
Best,
Sergio

On Sun, 2 Feb 2020 at 18:36, Anthony Grasso <anthony.gra...@gmail.com> wrote:

> Hi Sergio,
>
> There is a misunderstanding here. My post makes no recommendation for the
> value of num_tokens. Rather, it focuses on how to use the
> allocate_tokens_for_keyspace setting when creating a new cluster.
>
> Whilst a value of 4 is used for num_tokens in the post, it was chosen for
> demonstration purposes. Specifically, it makes:
>
> - the uneven token distribution in a small cluster very obvious,
> - identifying the endpoints displayed in nodetool ring easy, and
> - the initial_token setup less verbose and easier to follow.
>
> I will add an editorial note to the post with the above information so
> there is no confusion about why 4 tokens were used.
>
> I would only consider moving a cluster to 4 tokens if it is larger than
> 100 nodes. If you read through the paper that Erick mentioned, written by
> Joe Lynch & Josh Snyder, they show that num_tokens impacts the
> availability of large scale clusters.
>
> If you are after more details about the trade-offs between different
> sized token values, please see the discussion on the dev mailing list:
> "[Discuss] num_tokens default in Cassandra 4.0
> <https://www.mail-archive.com/search?l=dev%40cassandra.apache.org&q=subject%3A%22%5C%5BDiscuss%5C%5D+num_tokens+default+in+Cassandra+4.0%22&o=oldest>".
>
> Regards,
> Anthony
>
> On Sat, 1 Feb 2020 at 10:07, Sergio <lapostadiser...@gmail.com> wrote:
>
>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>> This is the article with the 4-token recommendation.
>> @Erick Ramirez: which is the dev thread for the default 32-token
>> recommendation?
>>
>> Thanks,
>> Sergio
>>
>> On Fri, 31 Jan 2020 at 14:49, Erick Ramirez <flightc...@gmail.com> wrote:
>>
>>> There's an active discussion going on right now in a separate dev
>>> thread. The current "default recommendation" is 32 tokens. But there's
>>> a push for 4 in combination with allocate_tokens_for_keyspace from Jon
>>> Haddad & co (based on a paper from Joe Lynch & Josh Snyder).
>>>
>>> If you're satisfied with the results from your own testing, go with 4
>>> tokens. And that's the key -- you must test, test, TEST! Cheers!
>>>
>>> On Sat, Feb 1, 2020 at 5:17 AM Arvinder Dhillon <dhillona...@gmail.com> wrote:
>>>
>>>> What is the recommended number of vnodes now? I read 8 for later
>>>> Cassandra 3.x. Is the new recommendation 4 now, even in version 3.x
>>>> (asking for 3.11)? Thanks
>>>>
>>>> On Fri, Jan 31, 2020 at 9:49 AM Durity, Sean R <sean_r_dur...@homedepot.com> wrote:
>>>>
>>>>> These are good clarifications and expansions.
>>>>>
>>>>> Sean Durity
>>>>>
>>>>> *From:* Anthony Grasso <anthony.gra...@gmail.com>
>>>>> *Sent:* Thursday, January 30, 2020 7:25 PM
>>>>> *To:* user <user@cassandra.apache.org>
>>>>> *Subject:* Re: [EXTERNAL] How to reduce vnodes without downtime
>>>>>
>>>>> Hi Maxim,
>>>>>
>>>>> Basically what Sean suggested is the way to do this without downtime.
>>>>>
>>>>> To clarify, the *three* steps following the "Decommission each node
>>>>> in the DC you are working on" step should be applied to *only* the
>>>>> decommissioned nodes. So where it says "*all nodes*" or "*every
>>>>> node*", it applies to only the decommissioned nodes.
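>>>>>
>>>>> For example, the "Update configuration" step on each decommissioned
>>>>> node might look something like the sketch below. It assumes the
>>>>> package-default config path /etc/cassandra/cassandra.yaml, uses 4
>>>>> tokens purely for illustration, and "my_ks" is a placeholder keyspace
>>>>> name:
>>>>>
>>>>>     # Sketch only: run on each *decommissioned* node, not the whole cluster.
>>>>>     sed -i 's/^num_tokens:.*/num_tokens: 4/' /etc/cassandra/cassandra.yaml
>>>>>
>>>>>     # Prevent the node from streaming data when it rejoins with new tokens.
>>>>>     echo 'auto_bootstrap: false' >> /etc/cassandra/cassandra.yaml
>>>>>
>>>>>     # Optional: if the target keyspace already exists with the desired
>>>>>     # replication, drive even token allocation from it.
>>>>>     # echo 'allocate_tokens_for_keyspace: my_ks' >> /etc/cassandra/cassandra.yaml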
>>>>>
>>>>> In addition, for the step that says "Wipe data on all the nodes", I
>>>>> would delete all files in the following directories on the
>>>>> decommissioned nodes:
>>>>>
>>>>> - data (usually located in /var/lib/cassandra/data)
>>>>> - commitlog (usually located in /var/lib/cassandra/commitlog)
>>>>> - hints (usually located in /var/lib/cassandra/hints)
>>>>> - saved_caches (usually located in /var/lib/cassandra/saved_caches)
>>>>>
>>>>> Cheers,
>>>>> Anthony
>>>>>
>>>>> On Fri, 31 Jan 2020 at 03:05, Durity, Sean R <sean_r_dur...@homedepot.com> wrote:
>>>>>
>>>>> Your procedure won't work very well. On the first node, if you
>>>>> switched to 4, you would end up with only a tiny fraction of the data
>>>>> (because the other nodes would still be at 256). I updated a large
>>>>> cluster (over 150 nodes – 2 DCs) to a smaller number of vnodes. The
>>>>> basic outline was this:
>>>>>
>>>>> - Stop all repairs
>>>>> - Make sure the app is running against one DC only
>>>>> - Change the replication settings on keyspaces to use only 1 DC
>>>>>   (basically cutting off the other DC)
>>>>> - Decommission each node in the DC you are working on. Because the
>>>>>   replication settings are changed, no streaming occurs. But it
>>>>>   releases the token assignments
>>>>> - Wipe data on all the nodes
>>>>> - Update configuration on every node to your new settings, including
>>>>>   auto_bootstrap = false
>>>>> - Start all nodes. They will choose tokens, but not stream any data
>>>>> - Update replication factor for all keyspaces to include the new DC
>>>>> - I disabled binary on those nodes to prevent app connections
>>>>> - Run nodetool rebuild with the other DC as the source on as many
>>>>>   nodes as your system can safely handle until they are all rebuilt
>>>>> - Re-enable binary (and app connections to the rebuilt DC)
>>>>> - Turn on repairs
>>>>> - Rest for a bit, then reverse the process for the remaining DCs
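>>>>>
>>>>> As a rough sketch, the replication and rebuild steps above translate
>>>>> into commands like the following. The keyspace name "my_ks", the
>>>>> surviving DC "DC1", the DC being reworked "DC2", and the default data
>>>>> paths are all placeholders and assumptions; adjust for your topology:
>>>>>
>>>>>     # Cut DC2 out of replication so decommissions do not stream (run once):
>>>>>     cqlsh -e "ALTER KEYSPACE my_ks WITH replication =
>>>>>       {'class': 'NetworkTopologyStrategy', 'DC1': 3};"
>>>>>
>>>>>     # On each DC2 node: release its tokens, then (with the service
>>>>>     # stopped; the service manager varies by install) wipe its data.
>>>>>     nodetool decommission
>>>>>     sudo rm -rf /var/lib/cassandra/data/* \
>>>>>                 /var/lib/cassandra/commitlog/* \
>>>>>                 /var/lib/cassandra/hints/* \
>>>>>                 /var/lib/cassandra/saved_caches/*
>>>>>
>>>>>     # After updating cassandra.yaml and restarting the DC2 nodes,
>>>>>     # add DC2 back to replication and rebuild it from DC1:
>>>>>     cqlsh -e "ALTER KEYSPACE my_ks WITH replication =
>>>>>       {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};"
>>>>>     nodetool disablebinary   # keep client connections off while rebuilding
>>>>>     nodetool rebuild -- DC1  # stream this node's data from the surviving DC
>>>>>     nodetool enablebinary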
>>>>>
>>>>> Sean Durity – Staff Systems Engineer, Cassandra
>>>>>
>>>>> *From:* Maxim Parkachov <lazy.gop...@gmail.com>
>>>>> *Sent:* Thursday, January 30, 2020 10:05 AM
>>>>> *To:* user@cassandra.apache.org
>>>>> *Subject:* [EXTERNAL] How to reduce vnodes without downtime
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> With the discussion about reducing the default vnodes in version 4.0,
>>>>> I would like to ask: what would be the optimal procedure to reduce
>>>>> vnodes in an existing 3.11.x cluster which was set up with the default
>>>>> value of 256? The cluster has 2 DCs with 5 nodes each and RF=3. There
>>>>> is one more restriction: I cannot add more servers, nor create an
>>>>> additional DC; everything is physical. This should be done without
>>>>> downtime.
>>>>>
>>>>> My idea for such a procedure would be, for each node:
>>>>>
>>>>> - decommission the node
>>>>> - set auto_bootstrap to true and vnodes to 4
>>>>> - start and wait till the node joins the cluster
>>>>> - run cleanup on the rest of the nodes in the cluster
>>>>> - run repair on the whole cluster (not sure if needed after cleanup)
>>>>> - set auto_bootstrap to false
>>>>>
>>>>> Then repeat for each node, followed by a rolling restart of the
>>>>> cluster and a cluster repair.
>>>>>
>>>>> Does this sound right? My concern is that after decommission, the
>>>>> node will start on the same IP, which could create some confusion.
>>>>>
>>>>> Regards,
>>>>> Maxim.