The way I've always thought about it is that -pr will make sure the information 
that specific node originates is consistent with its replicas.

So, we know that a node is responsible for a specific token range, and the next 
nodes in the ring will hold its replicas.  The -pr option will make sure that a 
specific node's information is consistent with its replicas, but will not make 
sure that node has all the replicated information it can get from the nodes 
previous to itself in the ring.

Without the -pr option, not only will the current node make sure its 
information and its replicas' information are consistent, but it will also make 
sure that all the information it holds as a replica for other nodes is consistent.  
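
To make the difference concrete, here is a rough Python sketch of which 
replica sets a repair touches.  It is only a model, not Cassandra code: it 
assumes one token per node, replicas placed on the next RF-1 nodes in ring 
order, and the node names and function names are made up for illustration.

```python
# Hypothetical model of a ring: node names in ring order, RF replicas per range.
RING = ["A", "B", "C", "D", "E", "F"]
RF = 3

def replicas_for(primary, ring=RING, rf=RF):
    """Nodes holding the range whose primary owner is `primary`:
    the owner plus the next rf-1 nodes in the ring."""
    i = ring.index(primary)
    return [ring[(i + k) % len(ring)] for k in range(rf)]

def ranges_repaired(node, primary_only, ring=RING, rf=RF):
    """Replica sets touched by a repair started on `node`."""
    if primary_only:
        # Models `nodetool repair -pr`: only the node's own primary range.
        return [replicas_for(node, ring, rf)]
    # Models plain `nodetool repair`: every range this node stores a copy of.
    return [replicas_for(p, ring, rf) for p in ring
            if node in replicas_for(p, ring, rf)]

print(ranges_repaired("A", primary_only=True))
# [['A', 'B', 'C']]
print(ranges_repaired("A", primary_only=False))
# [['A', 'B', 'C'], ['E', 'F', 'A'], ['F', 'A', 'B']]
```

With -pr the repair on A only involves A, B and C.  Without it, E and F get 
pulled in as well, because A is a replica for their primary ranges.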

If you run regular repairs on all the nodes in your cluster, then -pr is 
sufficient.  Every node will run repair, and make sure its information is 
consistent with its replicas, eventually creating a fully consistent cluster.  
This is a quicker process, and will have less impact on your operations by 
essentially spreading out the pain.  

For instance, we run a 12 node cluster.  We run "nodetool repair -pr" on nodes 
that are opposite to each other, 4 nodes a day (2 nodes in the morning, 2 nodes 
in the evening).  With a grace period of 10 days, this allows us to run repairs 
twice a week on a specific node, and to occasionally skip repairs on specific 
nodes once a week.  
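
As a sanity check on that schedule, here is a small Python sketch of the 
arithmetic.  The node indices 0..11 and the function names are hypothetical, 
and "opposite" is assumed to mean half a ring apart.

```python
import math

NODES = 12            # cluster size from the example above
SLOTS_PER_DAY = 2     # one pair in the morning, one pair in the evening
GC_GRACE_DAYS = 10    # grace period from the example above

def opposite_pairs(n):
    """Pair node i with the node half a ring away (assumes n is even)."""
    return [(i, i + n // 2) for i in range(n // 2)]

def schedule(n=NODES, slots_per_day=SLOTS_PER_DAY):
    """Map each (day, slot) to a pair of nodes for one pass over the ring."""
    return {(k // slots_per_day, k % slots_per_day): pair
            for k, pair in enumerate(opposite_pairs(n))}

plan = schedule()
cycle_days = math.ceil(len(plan) / SLOTS_PER_DAY)

for (day, slot), pair in sorted(plan.items()):
    print(f"day {day} {'AM' if slot == 0 else 'PM'}: repair -pr on nodes {pair}")
print("days per full pass:", cycle_days)                          # 3
print("passes within grace period:", GC_GRACE_DAYS // cycle_days)  # 3
```

Six pairs at two pairs a day means every node is repaired every 3 days, so 
three full passes fit inside the 10 day grace period.  That is where the 
slack to skip a node now and then comes from.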

In this case, without -pr, a lot of extra work would be done.  In fact, with an 
RF of 3 (in our case), each node would repair three ranges instead of one, so 
the time per repair would increase roughly threefold.
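
A back-of-the-envelope version of that comparison in Python.  It assumes one 
token per node and evenly sized ranges, and it only counts range repairs; 
real repair time also depends on data volume and how much is out of sync.  
The function name is made up for illustration.

```python
def range_repairs_per_pass(nodes, rf, primary_only):
    """Number of range repairs when every node runs one repair per pass.

    With -pr each node repairs 1 range (its primary range).
    Without -pr each node repairs every range it holds, i.e. rf ranges.
    """
    per_node = 1 if primary_only else rf
    return nodes * per_node

with_pr = range_repairs_per_pass(12, 3, primary_only=True)
without_pr = range_repairs_per_pass(12, 3, primary_only=False)

print(with_pr)               # 12: each range repaired once
print(without_pr)            # 36: each range repaired rf times
print(without_pr / with_pr)  # 3.0
```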

Another way to think about it, although likely not 100% technically correct:

A repair -pr will cause a push of a node's information to its replicas.  
Without the -pr, it will cause a push, and it will cause nodes it is a replica 
for to push their information as well.

-Mike

On Feb 28, 2013, at 9:39 PM, Hiller, Dean wrote:

> Isn't there more to it than that?  You really have nodes responsible for
> token ranges like so (using describe ring)
> 
> What we see is this from our describe ring... (1 to 6 are token ranges while
> A to F are servers)...
> A - 1, 2, 3
> B - 2, 3, 4
> C - 3, 4, 5
> D - 4, 5, 6
> E - 5, 6, 1
> F - 6, 1, 2
> 
> With -pr, only token range 1 is repaired I think, right?  2 and 3 are only
> repaired without the -pr option?  This means if I have a node that I just
> joined the cluster, I should "not" be using -pr as 2 and 3 on node A will
> not be up to date.  Using -pr is nice if I am going to repair every single
> node and is nice for the cron job that has to happen before
> gc_grace_seconds.  Am I wrong here?  Ie. -pr is really only good for use
> in the cron job as it would miss 2 and 3 above.  I could run the cron on
> just two servers but then my nodes are different which can be a hassle.
> 
> Can you confirm that this is what you believe happens as well?
> 
> Thanks,
> Dean
> 
> On 2/28/13 5:58 PM, "Takenori Sato(Cloudian)" <ts...@cloudian.com> wrote:
> 
>> Hi,
>> 
>> Please note that I confirmed on v1.0.7.
>> 
>>> I mean a repair involves all three nodes and pushes and pulls data,
>> right?
>> 
>> Yes, but that's how -pr works. A repair without -pr does more.
>> 
>> For example, suppose you have a ring with RF=3 like this.
>> 
>> A - B - C - D - E - F
>> 
>> Then, a repair on A without -pr covers 3 ranges as follows:
>> [A, B, C]
>> [E, F, A]
>> [F, A, B]
>> 
>> Among them, the first one, [A, B, C] is the primary range of A.
>> 
>> So, with -pr, a repair runs only for:
>> [A, B, C]
>> 
>>> I could run nodetool repair on just 2 nodes(RF=3) instead of using
>> nodetool repair -pr???
>> 
>> Yes.
>> 
>> You need to run two repairs on A and D.
>> 
>>> What is the advantage of -pr then?
>> 
>> Whenever you want to minimize the impact of a repair.
>> 
>> For example, suppose you got one node down for a while, and bring it
>> back to the cluster.
>> 
>> You need to run repair without affecting the entire cluster. Then, -pr
>> is the option.
>> 
>> Thanks,
>> Takenori
>> 
>> (2013/03/01 7:39), Hiller, Dean wrote:
>>> Isn't it true if I have 6 nodes, I could run nodetool repair on just 2
>>> nodes(RF=3) instead of using nodetool repair -pr???
>>> 
>>> What is the advantage of -pr then?
>>> 
>>> I mean a repair involves all three nodes and pushes and pulls data,
>>> right?
>>> 
>>> Thanks,
>>> Dean
>> 
> 
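
On the quoted point that a full repair (no -pr) on just A and D covers the 
whole 6 node ring at RF=3, here is a quick Python check.  Same caveats as any 
toy model: one token per node, replicas on the next RF-1 nodes in ring order, 
and the names are hypothetical.

```python
RING = ["A", "B", "C", "D", "E", "F"]
RF = 3

def primaries_covered(node, ring=RING, rf=RF):
    """Primary owners of every range a full repair on `node` touches:
    the node itself plus the rf-1 nodes before it in the ring."""
    i = ring.index(node)
    return {ring[(i - k) % len(ring)] for k in range(rf)}

covered = primaries_covered("A") | primaries_covered("D")
print(sorted(primaries_covered("A")))  # ['A', 'E', 'F']
print(sorted(primaries_covered("D")))  # ['B', 'C', 'D']
print(covered == set(RING))            # True
```

So two full repairs do cover every range in this layout, at the cost of each 
one involving more nodes than a -pr repair would.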
