Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster
Your partition sizes aren't ridiculous... kinda big cells if there are 4 cells and 12 MB partitions, but I still don't think that is ludicrous. Whelp, I'm out of ideas at my "pay grade".

Honestly, with AZs/racks you theoretically might have been able to take the nodes off simultaneously, but (disclaimer) I've never done that.

"Rolling restart"? <-- definitely indicates I have no ideas :-)

On Thu, Feb 22, 2018 at 8:15 AM, Fd Habash wrote:
> [quoted message appears in full below]
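A minimal sketch of how the numbers quoted in this thread can be gathered on each node ("ks" and "tbl" are placeholder keyspace/table names; on Cassandra 3.x the first command is also available as nodetool tablehistograms):

    # Per-table latency percentiles, SSTables read per query, partition sizes
    nodetool cfhistograms ks tbl

    # Per-node data volume, for the /data size comparison below
    df -h /data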
RE: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster
One more observation …

When we compare read latencies between the non-prod cluster (where nodes were removed) and the prod cluster, even though the node load, as measured by the size of the /data dir, is similar, the read latencies are about 5 times higher in the downsized non-prod cluster.

The only difference we see is that, per cfhistograms, prod reads from 4 SSTables whereas non-prod reads from 5.

Non-prod /data size
-------------------
Filesystem    Size  Used  Avail  Use%  Mounted on
/dev/nvme0n1  885G  454G   432G   52%  /data
/dev/nvme0n1  885G  439G   446G   50%  /data
/dev/nvme0n1  885G  368G   518G   42%  /data
/dev/nvme0n1  885G  431G   455G   49%  /data
/dev/nvme0n1  885G  463G   423G   53%  /data
/dev/nvme0n1  885G  406G   479G   46%  /data
/dev/nvme0n1  885G  419G   466G   48%  /data

Prod /data size
---------------
Filesystem    Size  Used  Avail  Use%  Mounted on
/dev/nvme0n1  885G  352G   534G   40%  /data
/dev/nvme0n1  885G  423G   462G   48%  /data
/dev/nvme0n1  885G  431G   454G   49%  /data
/dev/nvme0n1  885G  442G   443G   50%  /data
/dev/nvme0n1  885G  454G   431G   52%  /data

Cfhistograms: comparing prod to non-prod
----------------------------------------

Non-prod (captured 08:21:38)
--
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                      (micros)       (micros)      (bytes)
50%             1.00      24.60         4055.27           11864    4
75%             2.00      35.43        14530.76           17084    4
95%             4.00     126.93        89970.66           35425    4
98%             5.00     219.34       155469.30           73457    4
99%             5.00     219.34       186563.16          105778    4
Min             0.00       5.72           17.09              87    3
Max             7.00   20924.30      1386179.89        14530764    4

Prod (captured 07:41:42)
---
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                      (micros)       (micros)      (bytes)
50%             1.00      24.60         2346.80           11864    4
75%             2.00      29.52         4866.32           17084    4
95%             3.00      73.46        14530.76           29521    4
98%             4.00     182.79        25109.16           61214    4
99%             4.00     182.79        36157.19           88148    4
Min             0.00       9.89           20.50              87    0
Max             5.00     219.34       155469.30        12108970    4

Thank you

From: Fd Habash
Sent: Thursday, February 22, 2018 9:00 AM
To: user@cassandra.apache.org
Subject: RE: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

"data was allowed to fully rebalance/repair/drain before the next node was taken off?"
-- Judging by the messages, the decomm was healthy. As an example:

StorageService.java:3425 - Announcing that I have left the ring for 3ms
…
INFO [RMI TCP Connection(4)-127.0.0.1] 2016-01-07 06:00:52,662 StorageService.java:1191 - DECOMMISSIONED

I do not believe repairs were run after each node removal. I'll double-check. I'm not sure what you mean by 'rebalance'. How do you check whether a node is balanced? Load/size of the data dir? As for the drain, there was no need to drain, and I believe it is not something you do as part of decommissioning a node.

"did you take 1 off per rack/AZ?"
-- We removed 3 nodes, one from each AZ, in sequence. Thes
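A minimal sketch of the verification steps discussed above (the log path assumes a packaged install; adjust to your deployment):

    # Confirm each decommission finished before the next node is removed
    grep -i 'DECOMMISSIONED' /var/log/cassandra/system.log

    # Check that the remaining nodes own roughly even load after the removal
    nodetool status

    # Then run a primary-range repair on every node, one node at a time
    nodetool repair -pr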