Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-23 Thread Fred Habash
. On Feb 21, 2018 1:29 PM, "Fred Habash" wrote: > One node at a time > > On Feb 21, 2018 10:23 AM, "Carl Mueller" > wrote: > >> What is your replication factor? >> Single datacenter, three availability zones, is that right? >> You removed one

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Carl Mueller
Hm nodetool decommision performs the streamout of the replicated data, and you said that was apparently without error... But if you dropped three nodes in one AZ/rack on a five node with RF3, then we have a missing RF factor unless NetworkTopologyStrategy fails over to another AZ. But that would

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Carl Mueller
sorry for the idiot questions... data was allowed to fully rebalance/repair/drain before the next node was taken off? did you take 1 off per rack/AZ? On Wed, Feb 21, 2018 at 12:29 PM, Fred Habash wrote: > One node at a time > > On Feb 21, 2018 10:23 AM, "Carl Mueller"

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Fred Habash
RF of 3 with three racs AZ's in a single region. On Feb 21, 2018 10:23 AM, "Carl Mueller" wrote: > What is your replication factor? > Single datacenter, three availability zones, is that right? > You removed one node at a time or three at once? > > On Wed, Feb 21,

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Fred Habash
One node at a time On Feb 21, 2018 10:23 AM, "Carl Mueller" wrote: > What is your replication factor? > Single datacenter, three availability zones, is that right? > You removed one node at a time or three at once? > > On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Jeff Jirsa
nodetool cfhistograms, nodetool compactionstats would be helpful Compaction is probably behind from streaming, and reads are touching many sstables. -- Jeff Jirsa > On Feb 21, 2018, at 8:20 AM, Fd Habash wrote: > > We have had a 15 node cluster across three zones and

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Carl Mueller
What is your replication factor? Single datacenter, three availability zones, is that right? You removed one node at a time or three at once? On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash wrote: > We have had a 15 node cluster across three zones and cluster repairs using >

Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-21 Thread Fd Habash
We have had a 15 node cluster across three zones and cluster repairs using ‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk the cluster to 12. Since then, same repair job has taken up to 12 hours to finish and most times, it never does. More importantly, at some point