Re: nodetool repair -pr

2018-06-08 Thread Arvinder Dhillon
It depends on your keyspace replication settings. -pr only repairs a node's
primary range. So if there is a keyspace with replication 'DC2:3' and you run
repair -pr only on all nodes of DC1, it is not going to repair the token ranges
corresponding to DC2. So you will have to run it on each node in each data
center.
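
For illustration, a minimal sketch of what that means in practice (host names
are hypothetical, and it assumes password-less ssh to every node; repairs run
one node at a time):

#!/usr/bin/env bash
# Run a primary-range repair on every node in every data center, sequentially.
# Hypothetical host list -- substitute your own DC1 and DC2 nodes.
NODES="dc1-node-a dc1-node-b dc1-node-c dc2-node-d dc2-node-e dc2-node-f"

for host in $NODES; do
    echo "Starting 'nodetool repair -pr' on $host"
    # -pr repairs only the primary ranges owned by $host, so every node in
    # every DC must be covered before the whole ring has been repaired.
    ssh "$host" nodetool repair -pr || { echo "Repair failed on $host" >&2; exit 1; }
done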

-Arvinder

On Fri, Jun 8, 2018, 8:42 PM Igor Zubchenok  wrote:

> According to the docs at
> http://cassandra.apache.org/doc/latest/operating/repair.html?highlight=single
>
>
> *The -pr flag will only repair the “primary” ranges on a node, so you can
> repair your entire cluster by running nodetool repair -pr on each node in
> a single datacenter.*
> But I have seen it noted in many places that I should run it in ALL data
> centers.
>
> Looking for a qualified answer.
>
>
> On Fri, 8 Jun 2018 at 18:08 Igor Zubchenok  wrote:
>
>> I want to repair all nodes at all data centers.
>>
>> Example:
>> DC1
>>  nodeA
>>  nodeB
>>  nodeC
>> DC2
>>  node D
>>  node E
>>  node F
>>
>> If I run `nodetool repair -pr` at nodeA nodeB and nodeC, will all ranges
>> be repaired?
>>
>>
>> On Fri, 8 Jun 2018 at 17:57 Rahul Singh 
>> wrote:
>>
>>> From DS dox : "Do not use -pr with this option to repair only a local
>>> data center."
>>> On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>>>
>>>
>>> *nodetool repair -pr*
>>>
>>>


Re: nodetool repair -pr

2018-06-08 Thread Igor Zubchenok
According to the docs at
http://cassandra.apache.org/doc/latest/operating/repair.html?highlight=single


*The -pr flag will only repair the “primary” ranges on a node, so you can
repair your entire cluster by running nodetool repair -pr on each node in
a single datacenter.*
But I have seen it noted in many places that I should run it in ALL data
centers.

Looking for a qualified answer.


On Fri, 8 Jun 2018 at 18:08 Igor Zubchenok  wrote:

> I want to repair all nodes at all data centers.
>
> Example:
> DC1
>  nodeA
>  nodeB
>  nodeC
> DC2
>  node D
>  node E
>  node F
>
> If I run `nodetool repair -pr` at nodeA nodeB and nodeC, will all ranges
> be repaired?
>
>
> On Fri, 8 Jun 2018 at 17:57 Rahul Singh 
> wrote:
>
>> From DS dox : "Do not use -pr with this option to repair only a local
>> data center."
>> On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>>
>>
>> *nodetool repair -pr*
>>
>>


Re: nodetool repair -pr

2018-06-08 Thread Igor Zubchenok
I want to repair all nodes at all data centers.

Example:
DC1
 nodeA
 nodeB
 nodeC
DC2
 node D
 node E
 node F

If I run `nodetool repair -pr` at nodeA nodeB and nodeC, will all ranges be
repaired?


On Fri, 8 Jun 2018 at 17:57 Rahul Singh 
wrote:

> From DS dox : "Do not use -pr with this option to repair only a local
> data center."
> On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>
>
> *nodetool repair -pr*
>
> --
Regards,
Igor Zubchenok

CTO at Multi Brains LLC
Founder of taxistartup.com saytaxi.com chauffy.com
Skype: igor.zubchenok


Re: nodetool repair -pr

2018-06-08 Thread Rahul Singh
From DS dox: "Do not use -pr with this option to repair only a local data 
center."
On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>
> nodetool repair -pr


nodetool repair -pr

2018-06-08 Thread Igor Zubchenok
Hi!

I want to repair all nodes in all datacenters.
Should I run *nodetool repair -pr* on all nodes of a SINGLE datacenter or
on all nodes of ALL datacenters?
-- 
Regards,
Igor Zubchenok

CTO at Multi Brains LLC
Founder of taxistartup.com saytaxi.com chauffy.com
Skype: igor.zubchenok


Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-23 Thread Fred Habash
 .

On Feb 21, 2018 1:29 PM, "Fred Habash" <fmhab...@gmail.com> wrote:

> One node at a time
>
> On Feb 21, 2018 10:23 AM, "Carl Mueller" <carl.muel...@smartthings.com>
> wrote:
>
>> What is your replication factor?
>> Single datacenter, three availability zones, is that right?
>> You removed one node at a time or three at once?
>>
>> On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash <fmhab...@gmail.com> wrote:
>>
>>> We have had a 15 node cluster across three zones and cluster repairs
>>> using ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk
>>> the cluster to 12. Since then, same repair job has taken up to 12 hours to
>>> finish and most times, it never does.
>>>
>>>
>>>
>>> More importantly, at some point during the repair cycle, we see read
>>> latencies jumping to 1-2 seconds and applications immediately notice the
>>> impact.
>>>
>>>
>>>
>>> stream_throughput_outbound_megabits_per_sec is set at 200 and
>>> compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is
>>> around ~500GB at 44% usage.
>>>
>>>
>>>
> >>> When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It
>>> completed successfully with no issues.
>>>
>>>
>>>
>>> What could possibly cause repairs to cause this impact following cluster
>>> downsizing? Taking three nodes out does not seem compatible with such a
>>> drastic effect on repair and read latency.
>>>
>>>
>>>
>>> Any expert insights will be appreciated.
>>>
>>> 
>>> Thank you
>>>
>>>
>>>
>>
>>


Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-22 Thread Carl Mueller
Your partition sizes aren't ridiculous... kinda big cells if there are 4
cells and 12 MB partitions, but still I don't think that is ludicrous.

Whelp, I'm out of ideas at my "pay grade". Honestly, with AZ/racks you
theoretically might have been able to take the nodes off simultaneously,
but (disclaimer) I've never done that.

"Rolling Restart?" <-- definitely indicates I have no ideas :-)

On Thu, Feb 22, 2018 at 8:15 AM, Fd Habash <fmhab...@gmail.com> wrote:

> One more observation …
>
>
>
> When we compare read latencies between the non-prod cluster (where nodes were
> removed) and the prod cluster, even though the node load as measured by the
> size of the /data dir is similar, the read latencies are 5 times slower in
> the downsized non-prod cluster.
>
>
>
> The only difference we see is that prod reads from 4 sstables whereas
> non-prod reads from 5, per cfhistograms.
>
>
>
> Non-prod /data size
>
> -
>
> Filesystem    Size  Used  Avail  Use%  Mounted on
> /dev/nvme0n1  885G  454G  432G   52%   /data
> /dev/nvme0n1  885G  439G  446G   50%   /data
> /dev/nvme0n1  885G  368G  518G   42%   /data
> /dev/nvme0n1  885G  431G  455G   49%   /data
> /dev/nvme0n1  885G  463G  423G   53%   /data
> /dev/nvme0n1  885G  406G  479G   46%   /data
> /dev/nvme0n1  885G  419G  466G   48%   /data
>
>
>
> Prod /data size
>
> 
>
> Filesystem    Size  Used  Avail  Use%  Mounted on
> /dev/nvme0n1  885G  352G  534G   40%   /data
> /dev/nvme0n1  885G  423G  462G   48%   /data
> /dev/nvme0n1  885G  431G  454G   49%   /data
> /dev/nvme0n1  885G  442G  443G   50%   /data
> /dev/nvme0n1  885G  454G  431G   52%   /data
>
>
>
>
>
> Cfhistograms: comparing prod to non-prod
>
> -
>
>
>
> Non-prod
>
> --
>
> 08:21:38  Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
> 08:21:38                        (micros)       (micros)      (bytes)
> 08:21:38  50%         1.00      24.60          4055.27       11864           4
> 08:21:38  75%         2.00      35.43          14530.76      17084           4
> 08:21:38  95%         4.00      126.93         89970.66      35425           4
> 08:21:38  98%         5.00      219.34         155469.30     73457           4
> 08:21:38  99%         5.00      219.34         186563.16     105778          4
> 08:21:38  Min         0.00      5.72           17.09         87              3
> 08:21:38  Max         7.00      20924.30       1386179.89    14530764        4
>
>
>
> Prod
>
> ---
>
> 07:41:42  Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
> 07:41:42                        (micros)       (micros)      (bytes)
> 07:41:42  50%         1.00      24.60          2346.80       11864           4
> 07:41:42  75%         2.00      29.52          4866.32       17084           4
> 07:41:42  95%         3.00      73.46          14530.76      29521           4
> 07:41:42  98%         4.00      182.79         25109.16      61214           4
> 07:41:42  99%         4.00      182.79         36157.19      88148           4
> 07:41:42  Min         0.00      9.89           20.50         87              0
> 07:41:42  Max         5.00      219.34         155469.30     12108970        4
>
>
>
>
>
> 
> Thank you
>
>
>
> *From: *Fd Habash <fmhab...@gmail.com>
> *Sent: *Thursday, February 22, 2018 9:00 AM
> *To: *user@cassandra.apache.org
>

RE: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-22 Thread Fd Habash
One more observation …

When we compare read latencies between the non-prod cluster (where nodes were 
removed) and the prod cluster, even though the node load as measured by the size 
of the /data dir is similar, the read latencies are 5 times slower in the 
downsized non-prod cluster.

The only difference we see is that prod reads from 4 sstables whereas non-prod 
reads from 5, per cfhistograms.

Non-prod /data size
-
Filesystem    Size  Used  Avail  Use%  Mounted on
/dev/nvme0n1  885G  454G  432G   52%   /data
/dev/nvme0n1  885G  439G  446G   50%   /data
/dev/nvme0n1  885G  368G  518G   42%   /data
/dev/nvme0n1  885G  431G  455G   49%   /data
/dev/nvme0n1  885G  463G  423G   53%   /data
/dev/nvme0n1  885G  406G  479G   46%   /data
/dev/nvme0n1  885G  419G  466G   48%   /data

Prod /data size

Filesystem    Size  Used  Avail  Use%  Mounted on
/dev/nvme0n1  885G  352G  534G   40%   /data
/dev/nvme0n1  885G  423G  462G   48%   /data
/dev/nvme0n1  885G  431G  454G   49%   /data
/dev/nvme0n1  885G  442G  443G   50%   /data
/dev/nvme0n1  885G  454G  431G   52%   /data


Cfhistograms: comparing prod to non-prod
-

Non-prod
--
08:21:38  Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
08:21:38                        (micros)       (micros)      (bytes)
08:21:38  50%         1.00      24.60          4055.27       11864           4
08:21:38  75%         2.00      35.43          14530.76      17084           4
08:21:38  95%         4.00      126.93         89970.66      35425           4
08:21:38  98%         5.00      219.34         155469.30     73457           4
08:21:38  99%         5.00      219.34         186563.16     105778          4
08:21:38  Min         0.00      5.72           17.09         87              3
08:21:38  Max         7.00      20924.30       1386179.89    14530764        4

Prod
--- 
07:41:42  Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
07:41:42                        (micros)       (micros)      (bytes)
07:41:42  50%         1.00      24.60          2346.80       11864           4
07:41:42  75%         2.00      29.52          4866.32       17084           4
07:41:42  95%         3.00      73.46          14530.76      29521           4
07:41:42  98%         4.00      182.79         25109.16      61214           4
07:41:42  99%         4.00      182.79         36157.19      88148           4
07:41:42  Min         0.00      9.89           20.50         87              0
07:41:42  Max         5.00      219.34         155469.30     12108970        4



Thank you

From: Fd Habash
Sent: Thursday, February 22, 2018 9:00 AM
To: user@cassandra.apache.org
Subject: RE: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read 
Latency After Shrinking Cluster


“ data was allowed to fully rebalance/repair/drain before the next node was 
taken off?”
--
Judging by the messages, the decomm was healthy. As an example 

  StorageService.java:3425 - Announcing that I have left the ring for 3ms   
…
INFO  [RMI TCP Connection(4)-127.0.0.1] 2016-01-07 06:00:52,662 
StorageService.java:1191 – DECOMMISSIONED

I do not believe repairs were run after each node removal. I’ll double-check. 

I’m not sure what you mean by ‘rebalance’? How do you check if a node is 
balanced? Load/size of data dir? 

As for the drain, there was no need to drain and I believe it is not something 
you do as part of decomm’ing a node. 

did you take 1 off per rack/AZ?
--
We removed 3 nodes, one from each AZ in sequence

These are some

RE: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-22 Thread Fd Habash

“ data was allowed to fully rebalance/repair/drain before the next node was 
taken off?”
--
Judging by the messages, the decomm was healthy. As an example 

  StorageService.java:3425 - Announcing that I have left the ring for 3ms   
 
…
INFO  [RMI TCP Connection(4)-127.0.0.1] 2016-01-07 06:00:52,662 
StorageService.java:1191 – DECOMMISSIONED

I do not believe repairs were run after each node removal. I’ll double-check. 

I’m not sure what you mean by ‘rebalance’? How do you check if a node is 
balanced? Load/size of data dir? 

As for the drain, there was no need to drain and I believe it is not something 
you do as part of decomm’ing a node. 

did you take 1 off per rack/AZ?
--
We removed 3 nodes, one from each AZ in sequence

These are some of the cfhistogram metrics. Read latencies are high after the 
removal of the nodes
--
You can see reads of 186 ms at the 99th percentile from 5 sstables. These are 
awfully high numbers given that these metrics measure C* storage-layer read 
performance. 

Does this mean removing the nodes undersized the cluster? 

key_space_01/cf_01 histograms
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                      (micros)       (micros)      (bytes)
50%         1.00      24.60          4055.27       11864           4
75%         2.00      35.43          14530.76      17084           4
95%         4.00      126.93         89970.66      35425           4
98%         5.00      219.34         155469.30     73457           4
99%         5.00      219.34         186563.16     105778          4
Min         0.00      5.72           17.09         87              3
Max         7.00      20924.30       1386179.89    14530764        4

key_space_01/cf_01 histograms
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                      (micros)       (micros)      (bytes)
50%         1.00      29.52          4055.27       11864           4
75%         2.00      42.51          10090.81      17084           4
95%         4.00      152.32         52066.35      35425           4
98%         4.00      219.34         89970.66      73457           4
99%         5.00      219.34         155469.30     88148           4
Min         0.00      9.89           24.60         87              0
Max         6.00      1955.67        557074.61     14530764        4


Thank you

From: Carl Mueller
Sent: Wednesday, February 21, 2018 4:33 PM
To: user@cassandra.apache.org
Subject: Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read 
Latency After Shrinking Cluster

Hm nodetool decommission performs the streamout of the replicated data, and you 
said that was apparently without error...

But if you dropped three nodes in one AZ/rack on a five node with RF3, then we 
have a missing RF factor unless NetworkTopologyStrategy fails over to another 
AZ. But that would also entail cross-az streaming and queries and repair.

On Wed, Feb 21, 2018 at 3:30 PM, Carl Mueller <carl.muel...@smartthings.com> 
wrote:
sorry for the idiot questions... 

data was allowed to fully rebalance/repair/drain before the next node was taken 
off?

did you take 1 off per rack/AZ?


On Wed, Feb 21, 2018 at 12:29 PM, Fred Habash <fmhab...@gmail.com> wrote:
One node at a time 

On Feb 21, 2018 10:23 AM, "Carl Mueller" <carl.muel...@smartthings.com> wrote:
What is your replication factor? 
Single datacenter, three availability zones, is that right?
You removed one node at a time or three at once?

On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash <fmhab...@gmail.com> wrote:
We have had a 15 node cluster across three zones and cluster repairs using 
‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk the 
cluster to 12. Since then, same repair job has taken up to 12 hours to finish 
and most times, it never does. 
 
More importantly, at some point during the repair cycle, we see read latencies 
jumping to 1-2 seconds and applications immediately notice the impact.
 
stream_throughput_outbound_megabits_per_sec is set at 200 and 
compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is around 
~500GB at 44% usage. 
 
When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It 
completed successfully with no issues.
 
What could possibly cause repairs to c

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Carl Mueller
Hm nodetool decommission performs the streamout of the replicated data, and
you said that was apparently without error...

But if you dropped three nodes in one AZ/rack on a five node with RF3, then
we have a missing RF factor unless NetworkTopologyStrategy fails over to
another AZ. But that would also entail cross-az streaming and queries and
repair.

On Wed, Feb 21, 2018 at 3:30 PM, Carl Mueller <carl.muel...@smartthings.com>
wrote:

> sorry for the idiot questions...
>
> data was allowed to fully rebalance/repair/drain before the next node was
> taken off?
>
> did you take 1 off per rack/AZ?
>
>
> On Wed, Feb 21, 2018 at 12:29 PM, Fred Habash <fmhab...@gmail.com> wrote:
>
>> One node at a time
>>
>> On Feb 21, 2018 10:23 AM, "Carl Mueller" <carl.muel...@smartthings.com>
>> wrote:
>>
>>> What is your replication factor?
>>> Single datacenter, three availability zones, is that right?
>>> You removed one node at a time or three at once?
>>>
>>> On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash <fmhab...@gmail.com> wrote:
>>>
>>>> We have had a 15 node cluster across three zones and cluster repairs
>>>> using ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk
>>>> the cluster to 12. Since then, same repair job has taken up to 12 hours to
>>>> finish and most times, it never does.
>>>>
>>>>
>>>>
>>>> More importantly, at some point during the repair cycle, we see read
>>>> latencies jumping to 1-2 seconds and applications immediately notice the
>>>> impact.
>>>>
>>>>
>>>>
>>>> stream_throughput_outbound_megabits_per_sec is set at 200 and
>>>> compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is
>>>> around ~500GB at 44% usage.
>>>>
>>>>
>>>>
>>>> When shrinking the cluster, the ‘nodetool decommission’ was uneventful.
>>>> It completed successfully with no issues.
>>>>
>>>>
>>>>
>>>> What could possibly cause repairs to cause this impact following
>>>> cluster downsizing? Taking three nodes out does not seem compatible with
>>>> such a drastic effect on repair and read latency.
>>>>
>>>>
>>>>
>>>> Any expert insights will be appreciated.
>>>>
>>>> 
>>>> Thank you
>>>>
>>>>
>>>>
>>>
>>>
>


Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Carl Mueller
sorry for the idiot questions...

data was allowed to fully rebalance/repair/drain before the next node was
taken off?

did you take 1 off per rack/AZ?


On Wed, Feb 21, 2018 at 12:29 PM, Fred Habash <fmhab...@gmail.com> wrote:

> One node at a time
>
> On Feb 21, 2018 10:23 AM, "Carl Mueller" <carl.muel...@smartthings.com>
> wrote:
>
>> What is your replication factor?
>> Single datacenter, three availability zones, is that right?
>> You removed one node at a time or three at once?
>>
>> On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash <fmhab...@gmail.com> wrote:
>>
>>> We have had a 15 node cluster across three zones and cluster repairs
>>> using ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk
>>> the cluster to 12. Since then, same repair job has taken up to 12 hours to
>>> finish and most times, it never does.
>>>
>>>
>>>
>>> More importantly, at some point during the repair cycle, we see read
>>> latencies jumping to 1-2 seconds and applications immediately notice the
>>> impact.
>>>
>>>
>>>
>>> stream_throughput_outbound_megabits_per_sec is set at 200 and
>>> compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is
>>> around ~500GB at 44% usage.
>>>
>>>
>>>
>>> When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It
>>> completed successfully with no issues.
>>>
>>>
>>>
>>> What could possibly cause repairs to cause this impact following cluster
>>> downsizing? Taking three nodes out does not seem compatible with such a
>>> drastic effect on repair and read latency.
>>>
>>>
>>>
>>> Any expert insights will be appreciated.
>>>
>>> 
>>> Thank you
>>>
>>>
>>>
>>
>>


Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Fred Habash
RF of 3 with three racks/AZs in a single region.

On Feb 21, 2018 10:23 AM, "Carl Mueller" <carl.muel...@smartthings.com>
wrote:

> What is your replication factor?
> Single datacenter, three availability zones, is that right?
> You removed one node at a time or three at once?
>
> On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash <fmhab...@gmail.com> wrote:
>
>> We have had a 15 node cluster across three zones and cluster repairs
>> using ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk
>> the cluster to 12. Since then, same repair job has taken up to 12 hours to
>> finish and most times, it never does.
>>
>>
>>
>> More importantly, at some point during the repair cycle, we see read
>> latencies jumping to 1-2 seconds and applications immediately notice the
>> impact.
>>
>>
>>
>> stream_throughput_outbound_megabits_per_sec is set at 200 and
>> compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is
>> around ~500GB at 44% usage.
>>
>>
>>
>> When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It
>> completed successfully with no issues.
>>
>>
>>
>> What could possibly cause repairs to cause this impact following cluster
>> downsizing? Taking three nodes out does not seem compatible with such a
>> drastic effect on repair and read latency.
>>
>>
>>
>> Any expert insights will be appreciated.
>>
>> 
>> Thank you
>>
>>
>>
>
>


Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Fred Habash
One node at a time

On Feb 21, 2018 10:23 AM, "Carl Mueller" <carl.muel...@smartthings.com>
wrote:

> What is your replication factor?
> Single datacenter, three availability zones, is that right?
> You removed one node at a time or three at once?
>
> On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash <fmhab...@gmail.com> wrote:
>
>> We have had a 15 node cluster across three zones and cluster repairs
>> using ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk
>> the cluster to 12. Since then, same repair job has taken up to 12 hours to
>> finish and most times, it never does.
>>
>>
>>
>> More importantly, at some point during the repair cycle, we see read
>> latencies jumping to 1-2 seconds and applications immediately notice the
>> impact.
>>
>>
>>
>> stream_throughput_outbound_megabits_per_sec is set at 200 and
>> compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is
>> around ~500GB at 44% usage.
>>
>>
>>
>> When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It
>> completed successfully with no issues.
>>
>>
>>
>> What could possibly cause repairs to cause this impact following cluster
>> downsizing? Taking three nodes out does not seem compatible with such a
>> drastic effect on repair and read latency.
>>
>>
>>
>> Any expert insights will be appreciated.
>>
>> 
>> Thank you
>>
>>
>>
>
>


Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Jeff Jirsa
nodetool cfhistograms, nodetool compactionstats would be helpful

Compaction is probably behind from streaming, and reads are touching many 
sstables.
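
For example (the keyspace and table names below are placeholders), running
something like this on an affected node while latencies are high would show
whether compaction is backed up and how many sstables each read is touching:

# Pending and active compactions -- a large backlog means compaction is behind.
nodetool compactionstats

# Per-table percentiles, including "SSTables per read" and read latency.
nodetool cfhistograms my_keyspace my_table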

-- 
Jeff Jirsa


> On Feb 21, 2018, at 8:20 AM, Fd Habash <fmhab...@gmail.com> wrote:
> 
> We have had a 15 node cluster across three zones and cluster repairs using 
> ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk the 
> cluster to 12. Since then, same repair job has taken up to 12 hours to finish 
> and most times, it never does.
>  
> More importantly, at some point during the repair cycle, we see read 
> latencies jumping to 1-2 seconds and applications immediately notice the 
> impact.
>  
> stream_throughput_outbound_megabits_per_sec is set at 200 and 
> compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is around 
> ~500GB at 44% usage.
>  
> When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It 
> completed successfully with no issues.
>  
> What could possibly cause repairs to cause this impact following cluster 
> downsizing? Taking three nodes out does not seem compatible with such a 
> drastic effect on repair and read latency.
>  
> Any expert insights will be appreciated.
> 
> Thank you
>  


Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Carl Mueller
What is your replication factor?
Single datacenter, three availability zones, is that right?
You removed one node at a time or three at once?

On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash <fmhab...@gmail.com> wrote:

> We have had a 15 node cluster across three zones and cluster repairs using
> ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk the
> cluster to 12. Since then, same repair job has taken up to 12 hours to
> finish and most times, it never does.
>
>
>
> More importantly, at some point during the repair cycle, we see read
> latencies jumping to 1-2 seconds and applications immediately notice the
> impact.
>
>
>
> stream_throughput_outbound_megabits_per_sec is set at 200 and
> compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is
> around ~500GB at 44% usage.
>
>
>
> When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It
> completed successfully with no issues.
>
>
>
> What could possibly cause repairs to cause this impact following cluster
> downsizing? Taking three nodes out does not seem compatible with such a
> drastic effect on repair and read latency.
>
>
>
> Any expert insights will be appreciated.
>
> 
> Thank you
>
>
>


Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Fd Habash
We have had a 15-node cluster across three zones, and cluster repairs using 
‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk the 
cluster to 12. Since then, the same repair job has taken up to 12 hours to 
finish, and most times it never does. 

More importantly, at some point during the repair cycle, we see read latencies 
jumping to 1-2 seconds and applications immediately notice the impact.

stream_throughput_outbound_megabits_per_sec is set at 200 and 
compaction_throughput_mb_per_sec at 64. The /data dir on the nodes is around 
~500GB at 44% usage. 

When shrinking the cluster, the ‘nodetool decommission’ was uneventful. It 
completed successfully with no issues.

What could possibly cause repairs to have this impact following cluster 
downsizing? Taking three nodes out does not seem compatible with such a drastic 
effect on repair and read latency. 

Any expert insights will be appreciated. 

Thank you



Re: Nodetool repair -pr

2017-09-29 Thread Blake Eggleston
It will on 2.2 and higher, yes.

Also, just want to point out that it would be worth it for you to compare how 
long incremental repairs take vs full repairs in your cluster. There are some 
problems (which are fixed in 4.0) that can cause significant overstreaming when 
using incremental repair.
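
As a rough sketch of that comparison (the keyspace name is a placeholder; on
2.2+ repair is incremental unless -full is given), you could time both variants
on one node and compare the durations and the streaming reported in the logs:

# Incremental repair of this node's primary ranges (default behaviour on 2.2+).
time nodetool repair -pr my_keyspace

# Full (non-incremental) repair of the same ranges, for comparison.
time nodetool repair -full -pr my_keyspace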

On September 28, 2017 at 11:46:47 AM, Dmitry Buzolin (dbuz5ga...@gmail.com) 
wrote:

Hi All, 

Can someone confirm if 

"nodetool repair -pr -j2" does run with -inc too? I see the docs mention -inc 
is set by default, but I am not sure if it is enabled when -pr option is used. 

Thanks! 
- 
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
For additional commands, e-mail: user-h...@cassandra.apache.org 



Nodetool repair -pr

2017-09-28 Thread Dmitry Buzolin
Hi All,

Can someone confirm if

"nodetool repair -pr -j2" does run with -inc too? I see the docs mention -inc 
is set by default, but I am not sure if it is enabled when -pr option is used.

Thanks!
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



strange node load decrease after nodetool repair -pr

2016-10-20 Thread Oleg Krayushkin
Hi. After I've run token-ranged repair from node at 12.5.13.125 with

nodetool repair -full -st ${start_tokens[i]} -et ${end_tokens[i]}

on every token range, I got this node load:

--  Address      Load      Tokens  Owns   Rack
UN  12.5.13.141  23.94 GB  256     32.3%  rack1
DN  12.5.13.125  34.71 GB  256     31.8%  rack1
UN  12.5.13.46   29.01 GB  512     58.1%  rack1
UN  12.5.13.228  41.17 GB  512     58.5%  rack1
UN  12.5.13.34   45.93 GB  512     59.8%  rack1
UN  12.5.13.82   42.05 GB  512     59.4%  rack1

Then I've run partitioner-range repair from the same node with

nodetool repair -full -pr

And unexpectedly I got such a different load:

--  Address      Load      Tokens  Owns   Rack
UN  12.5.13.141  22.93 GB  256     32.3%  rack1
UN  12.5.13.125  30.94 GB  256     31.8%  rack1
UN  12.5.13.46   27.38 GB  512     58.1%  rack1
UN  12.5.13.228  39.51 GB  512     58.5%  rack1
UN  12.5.13.34   41.58 GB  512     59.8%  rack1
UN  12.5.13.82   33.9 GB   512     59.4%  rack1

What are possible reasons for such a load decrease after the last repair? Maybe
some compactions that were not done after the token-ranged repairs? But on
12.5.13.82 about 8 GB went away!

Additional info:

   - There were no writes to db during these periods.
   - All repair operations completed without errors, exceptions or fails.
   - Before the first repair I've done sstablescrub on every node -- maybe
   this gives a clue?
   - cassandra version is 3.0.8
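
For reference, the token-ranged repair mentioned at the top of this message was
driven by a loop along these lines (a sketch only; it assumes start_tokens and
end_tokens are parallel bash arrays already populated with this node's ranges):

#!/usr/bin/env bash
# Repair each of this node's token ranges, one full subrange repair at a time.
for i in "${!start_tokens[@]}"; do
    echo "Repairing range ${start_tokens[i]} -> ${end_tokens[i]}"
    nodetool repair -full -st "${start_tokens[i]}" -et "${end_tokens[i]}"
done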

-- 

Oleg Krayushkin


RE: nodetool repair -pr enough in this scenario?

2012-06-05 Thread Viktor Jevdokimov
Understand simple mechanics first, decide how to act later.

Without -PR there's no difference from which host to run repair, it runs for 
the whole 100% range, from start to end, the whole cluster, all nodes, at once.

With -PR it runs only for a primary range of a node you are running a repair.
Let's say you have a simple ring of 3 nodes with RF=2 and ranges (per node) 
N1=C-A, N2=A-B, N3=B-C (node tokens are N1=A, N2=B, N3=C). No rack, no DC aware.
So running repair with -PR on node N2 will only repair the range A-B, for which 
node N2 is the primary and N3 is a backup. N2 and N3 will synchronize the A-B 
range with each other. For other ranges you need to run it on the other nodes.

Without -PR, running on any node will repair all ranges, A-B, B-C, C-A. The node 
you run a repair without -PR on is just a repair coordinator, so it makes no 
difference which one will be next time.
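
Concretely, for the hypothetical 3-node ring above, covering the whole ring with
-PR means one run per node (a sketch, not exact nodetool output):

# Each invocation repairs only that node's primary range:
#   on N1: repairs C-A (held by N1 as primary, N2 as backup)
#   on N2: repairs A-B (held by N2 as primary, N3 as backup)
#   on N3: repairs B-C (held by N3 as primary, N1 as backup)
nodetool repair -pr    # run once on each of N1, N2, N3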




Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider (http://twitter.com/#!/adforminsider)
What is Adform: watch this short video (http://vimeo.com/adform/display)


Disclaimer: The information contained in this message and attachments is 
intended solely for the attention and use of the named addressee and may be 
confidential. If you are not the intended recipient, you are reminded that the 
information remains the property of the sender. You must not use, disclose, 
distribute, copy, print or rely on this e-mail. If you have received this 
message in error, please contact the sender immediately and irrevocably delete 
this message and any copies.

From: David Daeschler [mailto:david.daesch...@gmail.com]
Sent: Tuesday, June 05, 2012 08:59
To: user@cassandra.apache.org
Subject: nodetool repair -pr enough in this scenario?

Hello,

Currently I have a 4 node cassandra cluster on CentOS64. I have been running 
nodetool repair (no -pr option) on a weekly schedule like:

Host1: Tue, Host2: Wed, Host3: Thu, Host4: Fri

In this scenario, if I were to add the -pr option, would this still be 
sufficient to prevent forgotten deletes and properly maintain consistency?

Thank you,
- David

Re: nodetool repair -pr enough in this scenario?

2012-06-05 Thread R. Verlangen
In your case -pr would be just fine (see Viktor's explanation).

2012/6/5 Viktor Jevdokimov viktor.jevdoki...@adform.com

  Understand simple mechanics first, decide how to act later.

 Without –PR there’s no difference from which host to run repair, it runs
 for the whole 100% range, from start to end, the whole cluster, all nodes,
 at once.

 With –PR it runs only for a primary range of a node you are running a
 repair.

 Let say you have simple ring of 3 nodes with RF=2 and ranges (per node)
 N1=C-A, N2=A-B, N3=B-C (node tokens are N1=A, N2=B, N3=C). No rack, no DC
 aware.

 So running repair with –PR on node N2 will only repair a range A-B, for
 which node N2 is a primary and N3 is a backup. N2 and N3 will synchronize
 A-B range one with other. For other ranges you need to run on other nodes.

 Without –PR running on any node will repair all ranges, A-B, B-C, C-A. A
 node you run a repair without –PR is just a repair coordinator, so no
 difference, which one will be next time.



Best regards / Pagarbiai
 *Viktor Jevdokimov*
 Senior Developer

 Email: viktor.jevdoki...@adform.com
 Phone: +370 5 212 3063, Fax +370 5 261 0453
 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
 Follow us on Twitter: @adforminsider http://twitter.com/#!/adforminsider
 What is Adform: watch this short video http://vimeo.com/adform/display

 Disclaimer: The information contained in this message and attachments is
 intended solely for the attention and use of the named addressee and may be
 confidential. If you are not the intended recipient, you are reminded that
 the information remains the property of the sender. You must not use,
 disclose, distribute, copy, print or rely on this e-mail. If you have
 received this message in error, please contact the sender immediately and
 irrevocably delete this message and any copies.

   *From:* David Daeschler [mailto:david.daesch...@gmail.com]
 *Sent:* Tuesday, June 05, 2012 08:59
 *To:* user@cassandra.apache.org
 *Subject:* nodetool repair -pr enough in this scenario?

 Hello,

 Currently I have a 4 node cassandra cluster on CentOS64. I have been
 running nodetool repair (no -pr option) on a weekly schedule like:

 Host1: Tue, Host2: Wed, Host3: Thu, Host4: Fri

 In this scenario, if I were to add the -pr option, would this still be
 sufficient to prevent forgotten deletes and properly maintain consistency?

 Thank you,
 - David 




-- 
With kind regards,

Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E ro...@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

Re: nodetool repair -pr enough in this scenario?

2012-06-05 Thread Sylvain Lebresne
On Tue, Jun 5, 2012 at 8:44 AM, Viktor Jevdokimov 
viktor.jevdoki...@adform.com wrote:

  Understand simple mechanics first, decide how to act later.

 Without –PR there’s no difference from which host to run repair, it runs
 for the whole 100% range, from start to end, the whole cluster, all nodes,
 at once.


That's not exactly true. A repair without -pr will repair all the ranges of
the node on which repair is run. So it will only repair the ranges that the
node is a replica for. It will *not* repair the whole cluster (unless the
replication factor is equal to the number of nodes in the cluster, but
that's a degenerate case). And hence it does matter on which host repair is
run (it always matters, whether you use -pr or not).

In general you want to use repair without -pr in cases where you want to
repair a specific node. Typically, if a node was dead for a reasonably long
time, you may want to run a repair (without -pr) on that specific node to
have it catch up faster (faster than if you were only relying on
read-repair and hinted handoff).

For repairing a whole cluster, as is the case for the weekly scheduled
repairs in the initial question, you want to use -pr. You *do not* want to
use repair without -pr in that case, because for that task using -pr is
more efficient (and to be clear, not using -pr won't cause problems, it is
just less efficient).

--
Sylvain





 With –PR it runs only for a primary range of a node you are running a
 repair.

 Let say you have simple ring of 3 nodes with RF=2 and ranges (per node)
 N1=C-A, N2=A-B, N3=B-C (node tokens are N1=A, N2=B, N3=C). No rack, no DC
 aware.

 So running repair with –PR on node N2 will only repair a range A-B, for
 which node N2 is a primary and N3 is a backup. N2 and N3 will synchronize
 A-B range one with other. For other ranges you need to run on other nodes.
 

 Without –PR running on any node will repair all ranges, A-B, B-C, C-A. A
 node you run a repair without –PR is just a repair coordinator, so no
 difference, which one will be next time.



Best regards / Pagarbiai
 *Viktor Jevdokimov*
 Senior Developer

 Email: viktor.jevdoki...@adform.com
 Phone: +370 5 212 3063, Fax +370 5 261 0453
 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
 Follow us on Twitter: @adforminsider http://twitter.com/#!/adforminsider
 What is Adform: watch this short video http://vimeo.com/adform/display

 Disclaimer: The information contained in this message and attachments is
 intended solely for the attention and use of the named addressee and may be
 confidential. If you are not the intended recipient, you are reminded that
 the information remains the property of the sender. You must not use,
 disclose, distribute, copy, print or rely on this e-mail. If you have
 received this message in error, please contact the sender immediately and
 irrevocably delete this message and any copies.

   *From:* David Daeschler [mailto:david.daesch...@gmail.com]
 *Sent:* Tuesday, June 05, 2012 08:59
 *To:* user@cassandra.apache.org
 *Subject:* nodetool repair -pr enough in this scenario?

 Hello,

 Currently I have a 4 node cassandra cluster on CentOS64. I have been
 running nodetool repair (no -pr option) on a weekly schedule like:

 Host1: Tue, Host2: Wed, Host3: Thu, Host4: Fri

 In this scenario, if I were to add the -pr option, would this still be
 sufficient to prevent forgotten deletes and properly maintain consistency?

 Thank you,
 - David 


RE: nodetool repair -pr enough in this scenario?

2012-06-05 Thread Viktor Jevdokimov
But in any case, repair is a two way process?
I mean that repair without -PR on node N1 will repair N1 and N2 and N3, because 
N2 is a replica of N1 range and N1 is a replica of N3 range?
And if there are more ranges that do not belong to N1, those ranges and nodes 
will not be repaired?


Do I understand correctly that repair, with or without -PR, is not a "repair the 
selected node" process, but a "synchronize data range(s) between replicas" 
process?
Single DC scenario:
With -PR: synchronize data for only the primary data range of the selected node 
between all nodes for that range (max number of nodes for the range = RF).
Without -PR: synchronize data for all data ranges of the selected node (primary 
and replica) between all nodes of those ranges (max number of nodes for the 
ranges = RF*RF). Not efficient, since the ranges overlap and the same ranges 
will be synchronized more than once (max = RF times).
Multiple DC with 100% data range in each DC scenario: the same, only RF = sum 
of RF from all DC's.
Is that correct?

Finally - is this process for SSTables only, excluding memtables and hints?





Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider (http://twitter.com/#!/adforminsider)
What is Adform: watch this short video (http://vimeo.com/adform/display)


Disclaimer: The information contained in this message and attachments is 
intended solely for the attention and use of the named addressee and may be 
confidential. If you are not the intended recipient, you are reminded that the 
information remains the property of the sender. You must not use, disclose, 
distribute, copy, print or rely on this e-mail. If you have received this 
message in error, please contact the sender immediately and irrevocably delete 
this message and any copies.

From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Tuesday, June 05, 2012 11:02
To: user@cassandra.apache.org
Subject: Re: nodetool repair -pr enough in this scenario?

On Tue, Jun 5, 2012 at 8:44 AM, Viktor Jevdokimov 
viktor.jevdoki...@adform.com wrote:
Understand simple mechanics first, decide how to act later.

Without -PR there's no difference from which host to run repair, it runs for 
the whole 100% range, from start to end, the whole cluster, all nodes, at once.

That's not exactly true. A repair without -pr will repair all the ranges of the 
node on which repair is ran. So it will only repair the ranges that the node is 
a replica for. It will *not* repair the whole cluster (unless the replication 
factor is equal to the number of nodes in the cluster but that's a degenerate 
case). And hence it does matter on which host repair is run (it always matter, 
whether you use -pr or not).

In general you want to use repair without -pr in case where you want to repair 
a specific node. Typically, if a node was dead for a reasonably long time, you 
may want to run a repair (without -pr) on that specific node to have him catch 
up faster (faster that if you were only relying on read-repair and 
hinted-handoff).

For repairing a whole cluster, as is the case for the weekly scheduled repairs 
in the initial question, you want to use -pr. You *do not* want to use repair 
without -pr in that case, because for that task using -pr is more efficient 
(and to be clear, not using -pr won't cause problems, it is just less efficient).

--
Sylvain



With -PR it runs only for a primary range of a node you are running a repair.
Let say you have simple ring of 3 nodes with RF=2 and ranges (per node) N1=C-A, 
N2=A-B, N3=B-C (node tokens are N1=A, N2=B, N3=C). No rack, no DC aware.
So running repair with -PR on node N2 will only repair a range A-B, for which 
node N2 is a primary and N3 is a backup. N2 and N3 will synchronize A-B range 
one with other. For other ranges you need to run on other nodes.

Without -PR running on any node will repair all ranges, A-B, B-C, C-A. A node 
you run a repair without -PR is just a repair coordinator, so no difference, 
which one will be next time.



Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider (http://twitter.com/#!/adforminsider)
What is Adform: watch this short video (http://vimeo.com/adform/display)


Disclaimer: The information contained in this message and attachments is 
intended solely for the attention and use of the named addressee and may be 
confidential. If you are not the intended recipient, you are reminded that the 
information remains the property of the sender. You must

Re: nodetool repair -pr enough in this scenario?

2012-06-05 Thread aaron morton
-pr is a new feature added in 1.0. It was added for efficiency, not 
functionality. With -pr, repair does 1/RF of the work it does without it.

 Am I understood correctly, that “repair” with or without –PR is not a “repair 
 selected node” process, but “synchronize data range(s) between replicas” 
 process?
Yes. 
But if you have a node that has been down for a few hours you may want to get 
its primary range repaired quickly. 

Or as sylvain says, if you are running repair on every node in the cluster you 
can use -pr to reduce the duration of the repair operation.  It would have the 
same effect as running repair without -pr on every RF'th node in the cluster. 
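
As a concrete sketch (the times and cron syntax are assumptions; the host/day
mapping is from the original question), the existing weekly schedule could keep
its stagger and simply add -pr to each node's job:

# /etc/cron.d/cassandra-repair, one line installed per host:
0 2 * * 2 cassandra nodetool repair -pr   # Host1, Tuesday
0 2 * * 3 cassandra nodetool repair -pr   # Host2, Wednesday
0 2 * * 4 cassandra nodetool repair -pr   # Host3, Thursday
0 2 * * 5 cassandra nodetool repair -pr   # Host4, Friday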

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/06/2012, at 9:19 PM, Viktor Jevdokimov wrote:

 But in any case, repair is a two way process?
 I mean that repair without –PR on node N1 will repair N1 and N2 and N3, 
 because N2 is a replica of N1 range and N1 is a replica of N3 range?
 And if there’re more ranges, that not belongs to N1, that ranges and nodes 
 will not be repaired?
  
  
 Am I understood correctly, that “repair” with or without –PR is not a “repair 
 selected node” process, but “synchronize data range(s) between replicas” 
 process?
 Single DC scenario:
 With –PR: synchronize data for only primary data range of selected node 
 between all nodes for that range (max number of nodes for the range = RF).
 Without –PR: synchronize data for all data ranges of selected node (primary 
 and replica) between all nodes of that ranges (max number of nodes for the 
 ranges = RF*RF). Not effective since ranges overlaps, the same ranges will be 
 synchronized more than once (max = RF times).
 Multiple DC with 100% data range in each DC scenario: the same, only RF = sum 
 of RF from all DC’s.
 Is that correct?
  
 Finally – is this process for SSTables only, excluding memtables and hints?
  
  
  
 
 
 Best regards / Pagarbiai
 Viktor Jevdokimov
 Senior Developer
 
 Email: viktor.jevdoki...@adform.com
 Phone: +370 5 212 3063, Fax +370 5 261 0453
 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
 Follow us on Twitter: @adforminsider
 What is Adform: watch this short video
 
 Disclaimer: The information contained in this message and attachments is 
 intended solely for the attention and use of the named addressee and may be 
 confidential. If you are not the intended recipient, you are reminded that 
 the information remains the property of the sender. You must not use, 
 disclose, distribute, copy, print or rely on this e-mail. If you have 
 received this message in error, please contact the sender immediately and 
 irrevocably delete this message and any copies.
 
 From: Sylvain Lebresne [mailto:sylv...@datastax.com] 
 Sent: Tuesday, June 05, 2012 11:02
 To: user@cassandra.apache.org
 Subject: Re: nodetool repair -pr enough in this scenario?
  
 On Tue, Jun 5, 2012 at 8:44 AM, Viktor Jevdokimov 
 viktor.jevdoki...@adform.com wrote:
 Understand simple mechanics first, decide how to act later.
  
 Without –PR there’s no difference from which host to run repair, it runs for 
 the whole 100% range, from start to end, the whole cluster, all nodes, at 
 once.
  
 That's not exactly true. A repair without -pr will repair all the ranges of 
 the node on which repair is ran. So it will only repair the ranges that the 
 node is a replica for. It will *not* repair the whole cluster (unless the 
 replication factor is equal to the number of nodes in the cluster but that's 
 a degenerate case). And hence it does matter on which host repair is run (it 
 always matter, whether you use -pr or not).
  
 In general you want to use repair without -pr in case where you want to 
 repair a specific node. Typically, if a node was dead for a reasonably long 
 time, you may want to run a repair (without -pr) on that specific node to 
 have him catch up faster (faster that if you were only relying on read-repair 
 and hinted-handoff).
  
 For repairing a whole cluster, as is the case for the weekly scheduled 
 repairs in the initial question, you want to use -pr. You *do not* want to 
 use repair without -pr in that case, because for that task using -pr is 
 more efficient (and to be clear, not using -pr won't cause problems, it is 
 just less efficient).
  
 --
 Sylvain
  
  
  
 With –PR it runs only for a primary range of a node you are running a repair.
 Let say you have simple ring of 3 nodes with RF=2 and ranges (per node) 
 N1=C-A, N2=A-B, N3=B-C (node tokens are N1=A, N2=B, N3=C). No rack, no DC 
 aware.
 So running repair with –PR on node N2 will only repair a range A-B, for which 
 node N2 is a primary and N3 is a backup. N2 and N3 will synchronize A-B range 
 one with other. For other ranges you need to run on other nodes.
  
 Without –PR running on any node will repair all ranges, A-B, B-C, C-A. A node 
 you run a repair without –PR is just a repair coordinator, so

Re: nodetool repair -pr enough in this scenario?

2012-06-05 Thread David Daeschler
Thank you for all the replies. It has been enlightening to read. I think I
now have a better idea of repair, ranges, replicas and how the data is
distributed. It also seems that using -pr would be the best way to go in my
scenario with 1.x+


Thank you for all the feedback. Glad to see such an active community around
Cassandra.
- David


nodetool repair -pr enough in this scenario?

2012-06-04 Thread David Daeschler
Hello,

Currently I have a 4 node cassandra cluster on CentOS64. I have been
running nodetool repair (no -pr option) on a weekly schedule like:

Host1: Tue, Host2: Wed, Host3: Thu, Host4: Fri

In this scenario, if I were to add the -pr option, would this still be
sufficient to prevent forgotten deletes and properly maintain consistency?

Thank you,
- David