Re: Running Node Repair After Changing RF or Replication Strategy for a Keyspace

Jon Haddad Fri, 28 Jun 2019 16:52:46 -0700

Yep - not to mention the increased complexity and overhead of going from
ONE to QUORUM, or the increased cost of QUORUM in RF=5 vs RF=3.
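For a rough sense of that QUORUM cost, here is a minimal sketch of the arithmetic. The helper name `quorum` is my own for illustration, not anything from Cassandra, and it only counts replicas within a single DC:

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at QUORUM: a strict majority of RF."""
    return rf // 2 + 1


# Compare the two replication factors mentioned above.
for rf in (3, 5):
    needed = quorum(rf)
    print(f"RF={rf}: QUORUM waits on {needed} replicas, tolerates {rf - needed} down")
```

So RF=5 buys one more tolerated failure, at the price of every QUORUM request waiting on one more replica.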


If you're in a cloud provider, I've found you're almost always better off
adding a new DC with a higher RF, assuming you're on NTS like Jeff
mentioned.
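To make Jeff's point below about stepping 3 -> 4 -> 5 concrete, here is a hedged worst-case sketch. It assumes a single DC, reads and writes at QUORUM, and that every newly added replica is still empty because repair has not run yet. The function names are hypothetical, for illustration only:

```python
def quorum(rf: int) -> int:
    """Strict majority of RF."""
    return rf // 2 + 1


def quorum_read_sees_quorum_write(old_rf: int, new_rf: int) -> bool:
    """Worst case before repair: the read picks every new (empty) replica."""
    new_empty = new_rf - old_rf
    # Old replicas the read is still forced to touch in that worst case.
    old_replicas_read = max(quorum(new_rf) - new_empty, 0)
    # Old replicas guaranteed to hold a write acknowledged at the old QUORUM.
    replicas_with_write = quorum(old_rf)
    # Pigeonhole: the two sets must overlap only if they cannot both fit
    # disjointly inside the old replica set.
    return old_replicas_read + replicas_with_write > old_rf


for old, new in [(3, 4), (4, 5), (3, 5)]:
    verdict = "safe" if quorum_read_sees_quorum_write(old, new) else "UNSAFE"
    print(f"RF {old} -> {new} before repair: {verdict}")
```

Under those assumptions the single-step increases keep a QUORUM read overlapping with an earlier QUORUM write, while jumping straight from 3 to 5 does not.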

On Fri, Jun 28, 2019 at 2:29 PM Jeff Jirsa <jji...@gmail.com> wrote:

> For just changing RF:
>
> You only need to repair the full token range - how you do that is up to
> you. Running `repair -pr -full` on each node will do that. Running `repair
> -full` on each node will repair every range multiple times, so it's more
> work, but technically correct.
>
> The caveat that few people actually appreciate about changing
> replication factors (# of copies per DC) is that you often have to run
> repair after each increment - going from 3 -> 5 means 3 -> 4, repair, 4 ->
> 5, repair - just going 3 -> 5 will violate consistency guarantees, and is
> technically unsafe.
>
> For changing replication strategy:
>
> Changing replication strategy is nontrivial - going from Simple to NTS is
> often easy to do in a truly eventual consistency use case, but becomes much
> harder if you're:
> - using multiple DCs or
> - vnodes + racks or
> - if you must do it without violating consistency.
>
> It turns out if you're not using multiple DCs or racks, then
> SimpleStrategy is fine. But if you are using multiple DCs/racks, then
> changing is very very hard. So usually by the time you're asking how to do
> this, you're in a very bad position.
>
> Do you have simple strategy and multiple DCs?
> Are you using vnodes and racks?
>
> I'd be incredibly skeptical about any blog that tried to give concrete
> steps on how to do this - the steps are probably right 80% of the time, but
> horribly wrong 20% of the time, especially if there's not a paragraph or
> two about racks along the way.
>
> On Fri, Jun 28, 2019 at 7:52 AM Fd Habash <fmhab...@gmail.com> wrote:
>
>> Hi all …
>>
>>
>>
>> The DataStax & Apache docs are clear: run ‘nodetool repair’ after you
>> alter a keyspace to change its RF or RS.
>>
>>
>>
>> However, the details are all over the place as to what type of repair to
>> run and on which nodes. Neither of the above docs is clear on this, and
>> what you find on the internet is quite contradictory.
>>
>>
>>
>> For example, this IBM doc
>> <https://www.ibm.com/support/knowledgecenter/en/SS3JSW_5.2.0/com.ibm.help.gdha_administering.doc/com.ibm.help.gdha_administering.doc/gdha_changing_replication_factor.html>
>> suggests running both the ‘alter keyspace’ and repair on EACH node affected
>> or on ‘each node you need to change the RF on’.  Others
>> <https://myadventuresincoding.wordpress.com/2019/01/29/cassandra-switching-from-simplestrategy-to-networktopologystrategy/>
>> suggest running ‘repair -pr’.
>>
>>
>>
>> On a cluster of 1 DC and three racks, this is how I understand it ….
>>
>>    1. Run the ‘alter keyspace’ on a SINGLE node.
>>    2. As for repairing the altered keyspace, I assume there are two
>>    options …
>>       1. Run ‘repair -full [keyspace]’ on all nodes in all racks
>>       2. Run ‘repair -pr -full [keyspace]’ on all nodes in all racks
>>
>>
>>
>> Sounds correct?
>>
>>
>>
>> ----------------
>> Thank you
>>
>>
>>
>
