Re: High disk io read load

2017-02-24 Thread Benjamin Roth
It was only the schema change. 2017-02-24 19:18 GMT+01:00 kurt greaves : > How many CFs are we talking about here? Also, did the script also kick off > the scrubs or was this purely from changing the schemas? > -- Benjamin Roth Prokurist Jaumo GmbH · www.jaumo.com

Re: High disk io read load

2017-02-24 Thread kurt greaves
How many CFs are we talking about here? Also, did the script also kick off the scrubs or was this purely from changing the schemas?

Re: High disk io read load

2017-02-20 Thread Benjamin Roth
Hah! Found the problem! After setting read_ahead to 0 and the compression chunk size to 4kb on all CFs, the situation was PERFECT (nearly, please see below)! I scrubbed some CFs but not the whole dataset yet. I knew it was not too little RAM. Some stats: - Latency of a quite large CF:
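The fix described above roughly corresponds to the following commands. This is a sketch, not taken verbatim from the thread: the device, keyspace and table names as well as the compressor class are placeholders, and the ALTER TABLE syntax is for Cassandra 3.x.

    # Disable read ahead on the data disk (takes effect immediately, not persistent across reboots)
    sudo blockdev --setra 0 /dev/sda

    # Reduce the compression chunk size of a table from the 64 KiB default to 4 KiB
    cqlsh -e "ALTER TABLE my_keyspace.my_table WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 4};"

    # Rewrite existing SSTables so the new chunk size also applies to data already on disk
    nodetool scrub my_keyspace my_table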

Re: High disk io read load

2017-02-20 Thread Bhuvan Rawal
Hi Benjamin, Yes, a read ahead of 8 would imply a higher IO count from disk, but it should not cause more data to be read off the disk, as is happening in your case. One probable reason for the high disk IO could be that the 512-vnode node has a lower page-cache-to-data ratio of 22% (100G buff / 437G data) as compared to 46%
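The ratio Bhuvan refers to can be estimated per node with standard tools; a sketch (the data directory shown is the common default path and may differ on your install):

    # RAM currently used as page cache (the buff/cache column)
    free -g
    # On-disk size of the Cassandra data this node has to serve
    du -sh /var/lib/cassandra/data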

Re: High disk io read load

2017-02-19 Thread Benjamin Roth
This is the output of sar: https://gist.github.com/anonymous/9545fb69fbb28a20dc99b2ea5e14f4cd It seems to me that there is not enough page cache to
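For completeness, paging and per-device disk statistics of the kind shown in the gist can be collected with sysstat's sar; a sketch with an arbitrary 5-second interval and 12 samples:

    # Paging activity, including page cache reclaim
    sar -B 5 12
    # Per-device disk throughput and utilization, with readable device names
    sar -d -p 5 12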

Re: High disk io read load

2017-02-19 Thread Bhuvan Rawal
Hi Edward, This could have been a valid case here, but if hotspots indeed existed then, along with the really high disk IO, the node should have been doing proportionately high network IO as well - and serving more queries per second as well. But from the output shared by Benjamin that doesn't appear to be the

Re: High disk io read load

2017-02-19 Thread Edward Capriolo
On Sat, Feb 18, 2017 at 3:35 PM, Benjamin Roth wrote: > We are talking about a read IO increase of over 2000% with 512 tokens > compared to 256 tokens. 100% increase would be linear which would be > perfect. 200% would even okay, taking the RAM/Load ratio for caching

Re: High disk io read load

2017-02-18 Thread Bhuvan Rawal
This looks fine, 8k read ahead as you mentioned. It doesn't look like a data model issue either, since the reads in https://cl.ly/2c3Z1u2k0u2I appear balanced. In all probability, this looks like an issue with the new node's configuration to me. The fact that you have very little data going out of

Re: High disk io read load

2017-02-18 Thread Benjamin Roth
Just for the record, that's what dstat looks like while CS is starting:
root@cas10:~# dstat -lrnv 10
---load-avg--- --io/total- -net/total- ---procs--- --memory-usage- ---paging-- -dsk/total- ---system-- total-cpu-usage
1m 5m 15m | read writ| recv send|run blk new| used

Re: High disk io read load

2017-02-18 Thread Benjamin Roth
256 tokens:
root@cas9:/sys/block/dm-0# blockdev --report
RO RA  SSZ BSZ  StartSec Size     Device
rw 256 512 4096 0        67108864 /dev/ram0
rw 256 512 4096 0        67108864 /dev/ram1
rw 256 512 4096 0        67108864 /dev/ram2
rw

Re: High disk io read load

2017-02-18 Thread Bhuvan Rawal
Hi Ben, If it's the same on both machines then something else could be the issue. We faced high disk IO due to a misconfigured read ahead, which resulted in a high amount of disk IO for a comparatively insignificant network transfer. Can you post the output of blockdev --report for a normal node and the 512 token

Re: High disk io read load

2017-02-18 Thread Benjamin Roth
cat /sys/block/sda/queue/read_ahead_kb => 8 On all CS nodes. Is that what you mean? 2017-02-18 21:32 GMT+01:00 Bhuvan Rawal : > Hi Benjamin, > > What is the disk read ahead on both nodes? > > Regards, > Bhuvan > > On Sun, Feb 19, 2017 at 1:58 AM, Benjamin Roth
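For anyone checking their own nodes: the sysfs value is in KiB while blockdev works in 512-byte sectors, so 8 KiB corresponds to an RA of 16. A sketch (the device name is an example):

    # Read ahead in KiB as exposed by sysfs
    cat /sys/block/sda/queue/read_ahead_kb
    # The same setting in 512-byte sectors
    blockdev --getra /dev/sda
    # Overview for all block devices (RA column)
    blockdev --report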

Re: High disk io read load

2017-02-18 Thread Benjamin Roth
We are talking about a read IO increase of over 2000% with 512 tokens compared to 256 tokens. A 100% increase would be linear, which would be perfect. 200% would even be okay, taking the RAM/load ratio for caching into account. But > 20x the read IO is really incredible. The nodes are configured with

Re: High disk io read load

2017-02-18 Thread Bhuvan Rawal
Hi Benjamin, What is the disk read ahead on both nodes? Regards, Bhuvan On Sun, Feb 19, 2017 at 1:58 AM, Benjamin Roth wrote: > This is status of the largest KS of these both nodes: > UN 10.23.71.10 437.91 GiB 512 49.1% >

Re: High disk io read load

2017-02-18 Thread Benjamin Roth
This is the status of the largest KS on both of these nodes:
UN 10.23.71.10 437.91 GiB 512 49.1% 2679c3fa-347e-4845-bfc1-c4d0bc906576 RAC1
UN 10.23.71.9  246.99 GiB 256 28.3% 2804ef8a-26c8-4d21-9e12-01e8b6644c2f RAC1
So roughly as expected. 2017-02-17 23:07 GMT+01:00 kurt

Re: High disk io read load

2017-02-17 Thread kurt greaves
what's the Owns % for the relevant keyspace from nodetool status?
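The per-keyspace ownership kurt asks for appears in the Owns column when a keyspace is passed to nodetool status; a sketch with a placeholder keyspace name:

    # With a keyspace argument, Owns (effective) reflects that keyspace's replication factor
    nodetool status my_keyspace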

Re: High disk io read load

2017-02-17 Thread Benjamin Roth
Hi Nate, See the dstat results here: https://gist.github.com/brstgt/216c662b525a9c5b653bbcd8da5b3fcb Network volume does not correspond to disk IO, not even close. @heterogeneous vnode count: I did this to test how load behaves on a new server class we ordered for CS. The new nodes had much faster

Re: High disk io read load

2017-02-16 Thread Nate McCall
> - Node A has 512 tokens and Node B 256. So it has double the load (data). > - Node A also has 2 SSDs, Node B only 1 SSD (according to load) > I very rarely see heterogeneous vnode counts in the same cluster. I would almost guarantee you are the only one doing this with MVs as well. That said,
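The vnode count under discussion is the num_tokens setting in cassandra.yaml, which is fixed per node when it bootstraps; a sketch (the config path assumes a package install and may differ):

    # Node B (existing nodes) was bootstrapped with:
    #   num_tokens: 256
    # Node A (the new node) with:
    #   num_tokens: 512
    # Check what a node is configured with:
    grep '^num_tokens' /etc/cassandra/cassandra.yaml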

Re: High disk io read load

2017-02-16 Thread Edward Capriolo
On Thu, Feb 16, 2017 at 12:38 AM, Benjamin Roth wrote: > It doesn't really look like that: > https://cl.ly/2c3Z1u2k0u2I > > Thats the ReadLatency.count metric aggregated by host which represents the > actual read operations, correct? > > 2017-02-15 23:01 GMT+01:00 Edward

Re: High disk io read load

2017-02-15 Thread Benjamin Roth
Erm sorry, forgot to mention. In this case "cas10" is Node A with 512 tokens and "cas9" Node B with 256 tokens. 2017-02-16 6:38 GMT+01:00 Benjamin Roth : > It doesn't really look like that: > https://cl.ly/2c3Z1u2k0u2I > > Thats the ReadLatency.count metric aggregated by

Re: High disk io read load

2017-02-15 Thread Benjamin Roth
It doesn't really look like that: https://cl.ly/2c3Z1u2k0u2I That's the ReadLatency.count metric aggregated by host, which represents the actual read operations, correct? 2017-02-15 23:01 GMT+01:00 Edward Capriolo : > I think it has more than double the load. It is double
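The counter being graphed is the per-table ReadLatency metric exposed over JMX; the same figure can be read locally on a node, roughly like this (keyspace and table names are placeholders; the command is tablestats in 3.x, cfstats on older versions):

    # Local read count and latency for one table, as seen by the node itself
    nodetool tablestats my_keyspace.my_table | grep -E 'Local read count|Local read latency'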

Re: High disk io read load

2017-02-15 Thread Edward Capriolo
I think it has more than double the load. It is double the data. More read repair chances. More load can swing its way during node failures, etc. On Wednesday, February 15, 2017, Benjamin Roth wrote: > Hi there, > > Following situation in a cluster with 10 nodes: > Node