Thanks Anthony! I will read more about it.
Best,
Sergio

On Sun, 2 Feb 2020 at 18:36, Anthony Grasso <anthony.gra...@gmail.com> wrote:

> Hi Sergio,
>
> There is a misunderstanding here. My post makes no recommendation for the
> value of num_tokens. Rather, it focuses on how to use the
> allocate_tokens_for_keyspace setting when creating a new cluster.
>
> Whilst a value of 4 is used for num_tokens in the post, it was chosen for
> demonstration purposes. Specifically, it makes:
>
> - the uneven token distribution in a small cluster very obvious,
> - identifying the endpoints displayed in nodetool ring easy, and
> - the initial_token setup less verbose and easier to follow.
>
> I will add an editorial note to the post with the above information so
> there is no confusion about why 4 tokens were used.
>
> I would only consider moving a cluster to 4 tokens if it is larger than
> 100 nodes. If you read through the paper that Erick mentioned, written by
> Joe Lynch & Josh Snyder, they show that num_tokens impacts the
> availability of large scale clusters.
>
> If you are after more details about the trade-offs between different
> sized token values, please see the discussion on the dev mailing list:
> "[Discuss] num_tokens default in Cassandra 4.0
> <https://www.mail-archive.com/search?l=dev%40cassandra.apache.org&q=subject%3A%22%5C%5BDiscuss%5C%5D+num_tokens+default+in+Cassandra+4.0%22&o=oldest>".
>
> Regards,
> Anthony
>
> On Sat, 1 Feb 2020 at 10:07, Sergio <lapostadiser...@gmail.com> wrote:
>
>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>> This is the article with the 4-token recommendation.
>> @Erick Ramirez: which is the dev thread for the default 32-token
>> recommendation?
>>
>> Thanks,
>> Sergio
>>
>> On Fri, 31 Jan 2020 at 14:49, Erick Ramirez <flightc...@gmail.com> wrote:
>>
>>> There's an active discussion going on right now in a separate dev
>>> thread. The current "default recommendation" is 32 tokens. But there's
>>> a push for 4 in combination with allocate_tokens_for_keyspace from Jon
>>> Haddad & co (based on a paper from Joe Lynch & Josh Snyder).
>>>
>>> If you're satisfied with the results from your own testing, go with 4
>>> tokens. And that's the key -- you must test, test, TEST! Cheers!
>>>
>>> On Sat, Feb 1, 2020 at 5:17 AM Arvinder Dhillon <dhillona...@gmail.com> wrote:
>>>
>>>> What is the recommended number of vnodes now? I read 8 for later
>>>> Cassandra 3.x. Is the new recommendation 4 now, even in version 3.x
>>>> (asking for 3.11)? Thanks
>>>>
>>>> On Fri, Jan 31, 2020 at 9:49 AM Durity, Sean R <sean_r_dur...@homedepot.com> wrote:
>>>>
>>>>> These are good clarifications and expansions.
>>>>>
>>>>> Sean Durity
>>>>>
>>>>> *From:* Anthony Grasso <anthony.gra...@gmail.com>
>>>>> *Sent:* Thursday, January 30, 2020 7:25 PM
>>>>> *To:* user <user@cassandra.apache.org>
>>>>> *Subject:* Re: [EXTERNAL] How to reduce vnodes without downtime
>>>>>
>>>>> Hi Maxim,
>>>>>
>>>>> Basically what Sean suggested is the way to do this without downtime.
>>>>>
>>>>> To clarify, the *three* steps following the "Decommission each node
>>>>> in the DC you are working on" step should be applied to *only* the
>>>>> decommissioned nodes. So where it says "*all nodes*" or "*every
>>>>> node*", it applies to only the decommissioned nodes.
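>>>>>
>>>>> For example, the "Update configuration" step on each decommissioned
>>>>> node might look something like the sketch below. It assumes the
>>>>> package-default config path /etc/cassandra/cassandra.yaml, uses 4
>>>>> tokens purely for illustration, and "my_ks" is a placeholder keyspace
>>>>> name:
>>>>>
>>>>>     # Sketch only: run on each *decommissioned* node, not the whole cluster.
>>>>>     sed -i 's/^num_tokens:.*/num_tokens: 4/' /etc/cassandra/cassandra.yaml
>>>>>
>>>>>     # Prevent the node from streaming data when it rejoins with new tokens.
>>>>>     echo 'auto_bootstrap: false' >> /etc/cassandra/cassandra.yaml
>>>>>
>>>>>     # Optional: if the target keyspace already exists with the desired
>>>>>     # replication, drive even token allocation from it.
>>>>>     # echo 'allocate_tokens_for_keyspace: my_ks' >> /etc/cassandra/cassandra.yaml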
>>>>>
>>>>> In addition, for the step that says "Wipe data on all the nodes", I
>>>>> would delete all files in the following directories on the
>>>>> decommissioned nodes:
>>>>>
>>>>> - data (usually located in /var/lib/cassandra/data)
>>>>> - commitlog (usually located in /var/lib/cassandra/commitlog)
>>>>> - hints (usually located in /var/lib/cassandra/hints)
>>>>> - saved_caches (usually located in /var/lib/cassandra/saved_caches)
>>>>>
>>>>> Cheers,
>>>>> Anthony
>>>>>
>>>>> On Fri, 31 Jan 2020 at 03:05, Durity, Sean R <sean_r_dur...@homedepot.com> wrote:
>>>>>
>>>>> Your procedure won't work very well. On the first node, if you
>>>>> switched to 4, you would end up with only a tiny fraction of the data
>>>>> (because the other nodes would still be at 256). I updated a large
>>>>> cluster (over 150 nodes – 2 DCs) to a smaller number of vnodes. The
>>>>> basic outline was this:
>>>>>
>>>>> - Stop all repairs
>>>>> - Make sure the app is running against one DC only
>>>>> - Change the replication settings on keyspaces to use only 1 DC
>>>>>   (basically cutting off the other DC)
>>>>> - Decommission each node in the DC you are working on. Because the
>>>>>   replication settings are changed, no streaming occurs. But it
>>>>>   releases the token assignments
>>>>> - Wipe data on all the nodes
>>>>> - Update configuration on every node to your new settings, including
>>>>>   auto_bootstrap = false
>>>>> - Start all nodes. They will choose tokens, but not stream any data
>>>>> - Update replication factor for all keyspaces to include the new DC
>>>>> - I disabled binary on those nodes to prevent app connections
>>>>> - Run nodetool rebuild with the other DC as the source on as many
>>>>>   nodes as your system can safely handle until they are all rebuilt
>>>>> - Re-enable binary (and app connections to the rebuilt DC)
>>>>> - Turn on repairs
>>>>> - Rest for a bit, then reverse the process for the remaining DCs
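>>>>>
>>>>> As a rough sketch, the replication and rebuild steps above translate
>>>>> into commands like the following. The keyspace name "my_ks", the
>>>>> surviving DC "DC1", the DC being reworked "DC2", and the default data
>>>>> paths are all placeholders and assumptions; adjust for your topology:
>>>>>
>>>>>     # Cut DC2 out of replication so decommissions do not stream (run once):
>>>>>     cqlsh -e "ALTER KEYSPACE my_ks WITH replication =
>>>>>       {'class': 'NetworkTopologyStrategy', 'DC1': 3};"
>>>>>
>>>>>     # On each DC2 node: release its tokens, then (with the service
>>>>>     # stopped; the service manager varies by install) wipe its data.
>>>>>     nodetool decommission
>>>>>     sudo rm -rf /var/lib/cassandra/data/* \
>>>>>                 /var/lib/cassandra/commitlog/* \
>>>>>                 /var/lib/cassandra/hints/* \
>>>>>                 /var/lib/cassandra/saved_caches/*
>>>>>
>>>>>     # After updating cassandra.yaml and restarting the DC2 nodes,
>>>>>     # add DC2 back to replication and rebuild it from DC1:
>>>>>     cqlsh -e "ALTER KEYSPACE my_ks WITH replication =
>>>>>       {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};"
>>>>>     nodetool disablebinary   # keep client connections off while rebuilding
>>>>>     nodetool rebuild -- DC1  # stream this node's data from the surviving DC
>>>>>     nodetool enablebinary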
>>>>>
>>>>> Sean Durity – Staff Systems Engineer, Cassandra
>>>>>
>>>>> *From:* Maxim Parkachov <lazy.gop...@gmail.com>
>>>>> *Sent:* Thursday, January 30, 2020 10:05 AM
>>>>> *To:* user@cassandra.apache.org
>>>>> *Subject:* [EXTERNAL] How to reduce vnodes without downtime
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> With the discussion about reducing the default vnodes in version 4.0,
>>>>> I would like to ask: what would be the optimal procedure to reduce
>>>>> vnodes in an existing 3.11.x cluster which was set up with the default
>>>>> value of 256? The cluster has 2 DCs with 5 nodes each and RF=3. There
>>>>> is one more restriction: I cannot add more servers, nor create an
>>>>> additional DC; everything is physical. This should be done without
>>>>> downtime.
>>>>>
>>>>> My idea for such a procedure would be, for each node:
>>>>>
>>>>> - decommission the node
>>>>> - set auto_bootstrap to true and vnodes to 4
>>>>> - start and wait till the node joins the cluster
>>>>> - run cleanup on the rest of the nodes in the cluster
>>>>> - run repair on the whole cluster (not sure if needed after cleanup)
>>>>> - set auto_bootstrap to false
>>>>>
>>>>> Then repeat for each node, followed by a rolling restart of the
>>>>> cluster and a cluster repair.
>>>>>
>>>>> Does this sound right? My concern is that after decommission, the
>>>>> node will start on the same IP, which could create some confusion.
>>>>>
>>>>> Regards,
>>>>> Maxim.