You did not specify your read and write consistency levels. The default would be to hit two nodes (one for data, one for digest) with every query. A network load of 50% is not very informative on its own: is that a 1 Gbit or a 10 Gbit link? Is it 50% in each direction, or an average of both?
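To show why those two questions matter, here is a minimal sketch in Python. The helper names are hypothetical, and the replication factor of 3 is an assumption (the thread only says there are 3 nodes). It shows how many replicas have to answer at a few common consistency levels, and what "50% network load" means in absolute terms on a 1 Gbit versus a 10 Gbit link.

```python
# Hypothetical helpers for illustration only; none of these names come from
# the thread or from any Cassandra tooling.

def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a request at this level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,  # majority of replicas
        "ALL": replication_factor,
    }
    return levels[consistency_level.upper()]


def throughput_mb_per_s(link_gbit: float, utilization: float) -> float:
    """Convert link speed (Gbit/s) and a utilization fraction into MB/s."""
    return link_gbit * 1000 / 8 * utilization


if __name__ == "__main__":
    # Assumption: replication factor 3 (not stated in the thread).
    for cl in ("ONE", "QUORUM", "ALL"):
        print(f"{cl} with RF=3: {replicas_required(cl, 3)} replica(s)")
    # The same "50%" differs by a factor of ten depending on the link.
    for link in (1, 10):
        print(f"{link} Gbit/s at 50%: {throughput_mb_per_s(link, 0.5)} MB/s")
```

If the 50% figure is per direction on a 1 Gbit link, the nodes may be much closer to a network bottleneck than the percentage alone suggests.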
Iowait is not great for a system of this size, assuming that you have 3 VMs on THREE SEPARATE physical systems and WITHOUT network-attached storage ...

Daemeon Reiydelle

On Fri, Mar 5, 2021 at 6:48 AM Attila Wind <attilaw@swf.technology> wrote:

> Hi guys,
>
> I have a DevOps-related question - hope someone here can give some
> ideas/pointers...
>
> We are running a 3-node Cassandra cluster.
> Recently we realized we have performance issues. Based on the
> investigation we did, it seems our bottleneck is the Cassandra cluster.
> The application layer is waiting a lot for Cassandra ops. So queries are
> running slow on the Cassandra side, yet according to our monitoring the
> Cassandra servers still have lots of free resources...
>
> The Cassandra machines are virtual machines (we do own the physical hosts
> too) built with kvm - each with 6 CPU cores (3 physical) and 32GB RAM
> dedicated to it.
> We are using the Ubuntu Linux 18.04 distro - the same version everywhere
> (physical and virtual hosts).
> We are running Cassandra 4.0-alpha4.
>
> What we see is
>
> - CPU load is around 20-25% - so we have lots of spare capacity
> - iowait is around 2-5% - so disk bandwidth should be fine
> - network load is around 50% of the full available bandwidth
> - loadavg is max around 4 - 4.5 but typically around 3 (because of the
>   cpu count, 6 should represent 100% load)
>
> and still, query performance is slow ... and we do not understand what
> could hold Cassandra back from fully utilizing the server resources...
>
> We are clearly missing something!
> Anyone any idea / tip?
>
> thanks!
> --
> Attila Wind