Odd CPU utilization spikes on 1 node out of 30 during repair

Oleksandr Shulgin Wed, 26 Sep 2018 00:35:10 -0700

Hello,

On our production cluster of 30 Apache Cassandra 3.0.17 nodes we have
observed that a single node started to show about twice the CPU
utilization of the rest (see screenshot): up to 30% vs. ~15% on
average for the other nodes.


This started more or less immediately after a repair was started (using
Cassandra Reaper, parallel, non-incremental) and lasted until we
restarted this node.  After the restart, its CPU use is back in line with
the rest of the nodes.

All other metrics that we are monitoring for this node were in line with
the rest of the cluster.

The logs on the node don't show anything odd: no extra warn/error/info
messages, and no more minor or major GC runs than on other nodes during
the time we were observing this behavior.

What could be the reason for this behavior?  How should we debug it if it
happens again, instead of just restarting the node?

Cheers,
--
Alex
