Re: Question related to nodetool repair options

2021-09-07 Thread Deepak Sharma
Thanks Erick! It is clear now.

On Tue, Sep 7, 2021 at 4:07 PM Erick Ramirez wrote:

> No, I'm just saying that [-pr] is the same as [-pr -full], NOT the same as
> just [-full] on its own. Primary range repairs are not compatible with
> incremental repairs so by definition, -pr is a [-pr -full] repair. I think
> you're confusing the concept of a full repair vs incremental. This document
> might help you understand the concepts --
> https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsRepairNodesManualRepair.html.
> Cheers!


Re: Question related to nodetool repair options

2021-09-07 Thread Erick Ramirez
No, I'm just saying that [-pr] is the same as [-pr -full], NOT the same as
just [-full] on its own. Primary range repairs are not compatible with
incremental repairs so by definition, -pr is a [-pr -full] repair. I think
you're confusing the concept of a full repair vs incremental. This document
might help you understand the concepts --
https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsRepairNodesManualRepair.html.
Cheers!



Re: Question related to nodetool repair options

2021-09-07 Thread Deepak Sharma
Thanks Erick for the response. So in option 3, -pr is not taken into
consideration, which essentially means option 3 is the same as option 1
(which is the full repair).

Is that right? I just want to be sure.

Best,
Deepak

On Tue, Sep 7, 2021 at 3:41 PM Erick Ramirez wrote:

>
> 1. Will perform a full repair, as opposed to an incremental repair, which
>    is the default in some later versions.
> 2. As you said, will only repair the token range(s) for which the node is
>    the primary owner.
> 3. The -full flag with -pr is redundant -- primary range repairs are
>    always done as full repairs because they are not compatible with
>    incremental repairs, i.e. -pr doesn't care that an SSTable is already
>    marked as repaired.
>
>


Re: Question related to nodetool repair options

2021-09-07 Thread Erick Ramirez
   1. Will perform a full repair, as opposed to an incremental repair, which
   is the default in some later versions.
   2. As you said, will only repair the token range(s) for which the node is
   the primary owner.
   3. The -full flag with -pr is redundant -- primary range repairs are
   always done as full repairs because they are not compatible with
   incremental repairs, i.e. -pr doesn't care that an SSTable is already
   marked as repaired.


Question related to nodetool repair options

2021-09-07 Thread Deepak Sharma
Hi There,

We are on Cassandra 3.0.11 and I want to understand the difference
between the following three commands:

1. nodetool repair -full
2. nodetool repair -pr
3. nodetool repair -full -pr

As per my understanding, 1 will do a full repair across all keyspaces, and
2, with -pr, restricts the repair to the 'primary' token ranges of the node
being repaired. With 3, I am not sure what we are trying to achieve.

Thanks,
Deepak


Re: nodetool repair options

2015-01-23 Thread Robert Coli
On Fri, Jan 23, 2015 at 10:03 AM, Robert Wille rwi...@fold3.com wrote:

> The docs say “Use -pr to repair only the first range returned by the
> partitioner”. What does this mean? Why would I only want to repair the
> first range?


If you're repairing the whole cluster, repairing only the primary range on
each node avoids repairing each range once per replication factor.
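
To make the "once per replication factor" point concrete, here is a toy
sketch of a ring. It is only a model under stated assumptions: one token
range per node, replicas placed on the next RF-1 nodes clockwise, and made-up
node names. It counts how many times each range gets repaired when repair is
run once on every node, with and without -pr.

```python
# Toy model of a token ring, NOT Cassandra code.
# Assumptions: one primary range per node, replicas of a range live on the
# primary owner and the next RF-1 nodes clockwise, node names are made up.

def ranges_repaired(nodes, rf, primary_only):
    """Count how often each range (keyed by its primary owner) is repaired
    when repair is run once on every node in the ring."""
    n = len(nodes)
    rf = min(rf, n)  # a range cannot have more replicas than there are nodes
    counts = {owner: 0 for owner in nodes}
    for i, node in enumerate(nodes):
        if primary_only:
            # With -pr: only the range this node is the primary owner of.
            counts[node] += 1
        else:
            # Without -pr: every range this node holds a replica of, i.e.
            # its own range plus those of the RF-1 preceding nodes.
            for k in range(rf):
                counts[nodes[(i - k) % n]] += 1
    return counts


nodes = ["n1", "n2", "n3", "n4", "n5", "n6"]
print("without -pr:", ranges_repaired(nodes, rf=3, primary_only=False))
print("with -pr:   ", ranges_repaired(nodes, rf=3, primary_only=True))
```

In this model, with RF=3, every range is repaired three times without -pr
and exactly once with -pr, which is the saving described above.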


> What are the tradeoffs of a parallel versus serial repair?


Parallel repair affects all replicas simultaneously and can thereby degrade
latency for that replica set. Serial repair doesn't, but because it is
serial it is much slower. Serial repair is probably not usable at all with
an RF of 5 or so, unless you set an extremely long gc_grace_seconds.


> What are the recommended options for regular, periodic repair?


(Snapshot/incremental repair, default IIRC in newer Cassandra, changes many
of these assumptions. I refer to old-style nodetool repair with my
statements.)

The canonical response is to repair the entire cluster with -pr once per
gc_grace_seconds.
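
As a rough illustration of "the entire cluster with -pr", the sketch below
builds one `nodetool -h <host> repair -pr` invocation per node and runs them
one node at a time. The host addresses, the function names and the dry-run
switch are all assumptions made for this example; scheduling it once per
gc_grace_seconds is left to whatever scheduler you already use.

```python
import subprocess

# Hypothetical node addresses; replace with the nodes of your own cluster.
HOSTS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def primary_range_repair_commands(hosts, keyspace=None):
    """Build one 'nodetool repair -pr' command line per node."""
    commands = []
    for host in hosts:
        cmd = ["nodetool", "-h", host, "repair", "-pr"]
        if keyspace:
            cmd.append(keyspace)  # optionally limit the repair to a keyspace
        commands.append(cmd)
    return commands


def repair_cluster(hosts, keyspace=None, dry_run=True):
    """Run the primary range repair on every node, one node at a time."""
    for cmd in primary_range_repair_commands(hosts, keyspace):
        if dry_run:
            print(" ".join(cmd))  # only show what would be executed
        else:
            subprocess.run(cmd, check=True)  # raise on the first failure


repair_cluster(HOSTS)  # dry run: prints the three commands
```

Calling `repair_cluster(HOSTS, dry_run=False)` would actually invoke nodetool,
so it needs nodetool on the PATH and JMX access to each node.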

Regarding frequent repair... consider your RF, CL and whether you actually
care about consistency and durability for any given columnfamily. If you
never do DELETE-like operations (in CQL, this includes things other than
DELETE statements) in the CF, probably don't repair it just for consistency
purposes.

Then, consider how long you can tolerate DELETEd data sticking around. If
you can tolerate it because you don't DELETE much data, set
gc_grace_seconds to at least 34 days. With 34 days, you can begin a repair
on the first of the month and have between 3 and 7 days for it to complete.
You repair for up to a few days in order to repair a month's data. With
shorter repair cycles, you pay the relatively high cost of repair
repeatedly.
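
gc_grace_seconds is set per table and in seconds, so the 34 days need
converting. A minimal sketch of that arithmetic follows; the keyspace and
table names are placeholders, and the helper functions are made up for this
example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def gc_grace_seconds(days):
    """Convert a gc_grace period given in days to seconds."""
    return days * SECONDS_PER_DAY


def alter_table_cql(keyspace, table, days):
    """Build the CQL statement that sets gc_grace_seconds on one table."""
    return "ALTER TABLE {}.{} WITH gc_grace_seconds = {};".format(
        keyspace, table, gc_grace_seconds(days)
    )


# "my_keyspace" and "my_table" are hypothetical names.
print(alter_table_cql("my_keyspace", "my_table", 34))
```

For 34 days this comes to 2937600 seconds, compared with the default of
864000 (10 days).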

Last, consider your Cassandra version. Newer versions have had significant
focus on streaming and repair stability and performance. Upgrade to the
HEAD of 2.0.x if possible.

There's this thing I jokingly call the Coli Conjecture, which says that if
you're in a good case for Cassandra you probably don't actually care
about consistency or durability, even if you think you do. This comes from
years of observing consistency edge cases in Cassandra and noticing that
very few of the people who detected and reported them seemed to
experience very negative results from the perspective of their application.
I think it is an interesting observation and a different mindset for many
people coming from the non-distributed, normalized, relational world.

=Rob