Re: underutilized servers

2021-03-06 Thread Attila Wind
Thanks Bowen, * "How do you split?" This is challenging to answer briefly, but let me try: the physical host has cores indexed 0 - 11 (6 physical and 6 virtual, in pairs: 0,6 belong together, then 1,7, then 2,8 and so on). What we do is that in the virt-install command we
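The core pairing described above can be sketched in a few lines of Python. This is only an illustration: it assumes the "and so on" pattern holds for all six pairs (core i paired with core i + 6), and the helper names `sibling_pairs` and `cpuset_for` are hypothetical. How the pairs are actually assigned to VMs is not shown in the excerpt, so the example split is an assumption too. On Linux, the real sibling layout can be read from /sys/devices/system/cpu/cpuN/topology/thread_siblings_list.

```python
def sibling_pairs(logical_cpus: int = 12) -> list[tuple[int, int]]:
    """Return (physical, hyperthread) index pairs, assuming core i pairs with i + half."""
    half = logical_cpus // 2
    return [(core, core + half) for core in range(half)]


def cpuset_for(pairs: list[tuple[int, int]]) -> str:
    """Build a comma-separated CPU list that keeps sibling threads together."""
    return ",".join(f"{a},{b}" for a, b in pairs)


if __name__ == "__main__":
    pairs = sibling_pairs(12)
    print(pairs)
    # Hypothetical split: give one VM the first two sibling pairs.
    print(cpuset_for(pairs[:2]))
```

Keeping both threads of a physical core in the same VM avoids two guests competing for one core's execution resources.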

Re: underutilized servers

2021-03-06 Thread Bowen Song
Hi Attila, Addressing your data modelling issue is definitely important, and this alone may be enough to solve all the issues you have with Cassandra. * "Since these are VMs, is there any chance they are competing for resources on the same physical host?" We are splitting the physical

Re: underutilized servers

2021-03-06 Thread Bowen Song
Hi Erick, Please allow me to disagree on this. A node dropping reads and writes doesn't always mean the disk is the bottleneck. I have seen the same behaviour when a node had excessive STW GCs and a lot of timeouts, and I have also seen writes get dropped because the size of the mutation

Re: underutilized servers

2021-03-05 Thread Erick Ramirez
The tpstats you posted show that the node is dropping reads and writes, which means that your disk can't keep up with the load, i.e. your disk is the bottleneck. If you haven't already, place data and commitlog on separate disks so they're not competing for the same IO bandwidth. Note that it's

Re: underutilized servers

2021-03-05 Thread daemeon reiydelle
You did not specify read and write consistency levels; the default would be to hit two nodes (one for data, one for digest) with every query. A network load of 50% is not too helpful. 1 Gbit? 10 Gbit? 50% in each direction, or an average of both? The iowait is not great for a system of this size: assuming that

Re: underutilized servers

2021-03-05 Thread Attila Wind
Thanks for the answers @Sean and @Bowen !!! First of all, this article describes something very similar to what we experience, let me share it: https://www.senticore.com/overcoming-cassandra-write-performance-problems/ (we are studying it now). Furthermore, * yes, we have some level of unbalanced data which

Re: underutilized servers

2021-03-05 Thread Bowen Song
Based on my personal experience, the combination of slow read queries and low CPU usage is often an indicator of bad table schema design (e.g. large partitions) or a bad query (e.g. one without a partition key). Check the Cassandra logs first: are there any long stop-the-world GCs? Tombstone warnings?

RE: underutilized servers

2021-03-05 Thread Durity, Sean R
Are there specific queries that are slow? Partition-key queries should have read latencies in the single digits of ms (or faster). If that is not what you are seeing, I would first review the data model and queries to make sure that the data is modeled properly for Cassandra. Without metrics, I