Same bug also affects 2.0.16 - https://issues.apache.org/jira/browse/CASSANDRA-9662
From: Jeff Jirsa
Reply-To: <user@cassandra.apache.org>
Date: Friday, December 11, 2015 at 9:12 AM
To: "user@cassandra.apache.org"
Subject: Re: Thousands of pending compactions using STCS

There were a few buggy versions in 2.1 (2.1.7 and 2.1.8, I believe) that showed this behavior. The number of pending compactions was artificially high and not meaningful. As long as the number of *-Data.db sstables remains normal, compaction is keeping up and you're fine.

- Jeff

From: Vasileios Vlachos
Reply-To: "user@cassandra.apache.org"
Date: Friday, December 11, 2015 at 8:28 AM
To: "user@cassandra.apache.org"
Subject: Thousands of pending compactions using STCS

Hello,

We use Nagios and MX4J for the majority of the monitoring we do for Cassandra (version 2.0.16). For compactions we hit the following URL:

http://${cassandra_host}:8081/mbean?objectname=org.apache.cassandra.db%3Atype%3DCompactionManager

and check the value of the PendingTasks counter. We have noticed that occasionally one or more nodes will report thousands of pending compactions. We have 11 keyspaces in the cluster and a total of 109 *-Data.db files under /var/lib/cassandra/data, which gives approximately 10 SSTables per keyspace. Given the number of SSTables we seem to have at any given time in each KS/CF directory, thousands of pending compactions seems unrealistic. The logs show a lot of flush and compaction activity, but we don't think that's unusual. Each CF is configured with min_compaction_threshold = 2 and max_compaction_threshold = 32.

The two screenshots below show a cluster-wide view of pending compactions. Attached you can find the XML files which contain the data from the MX4J console.

And this is from the same graph, but I've selected the time period after 14:00 in order to show what the real compaction activity looks like when not skewed by the incredibly high number of pending compactions shown above.

Has anyone else experienced something similar? Is there anything else we can do to check whether something is wrong with our cluster?

Thanks in advance for any help!

Vasilis
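A sanity check along the lines Jeff suggests could be automated: compare the PendingTasks value that MX4J reports against the actual number of *-Data.db files on disk, and flag nodes where the two are wildly out of proportion. Below is a minimal sketch. The XML shape (one `<Attribute name="..." value="..."/>` element per MBean attribute, obtained via MX4J's `template=identity` raw-XML view), the 8081 port, and the 10x mismatch threshold are all assumptions to adapt to your setup.

```python
# Sketch: cross-check MX4J's reported PendingTasks against on-disk sstables.
# Assumptions (adjust as needed): MX4J is on port 8081, the identity template
# returns XML with <Attribute name="PendingTasks" value="..."/> elements, and
# a pending count more than 10x the sstable count is treated as suspicious.
import pathlib
import urllib.request
import xml.etree.ElementTree as ET

MX4J_URL = ("http://{host}:8081/mbean?objectname="
            "org.apache.cassandra.db%3Atype%3DCompactionManager"
            "&template=identity")


def pending_tasks(xml_text: str) -> int:
    """Extract the PendingTasks attribute from an MX4J MBean XML page."""
    root = ET.fromstring(xml_text)
    for attr in root.iter("Attribute"):
        if attr.get("name") == "PendingTasks":
            return int(attr.get("value"))
    raise KeyError("PendingTasks attribute not found")


def data_files(data_dir: str = "/var/lib/cassandra/data") -> int:
    """Count *-Data.db sstable files under the Cassandra data directory."""
    return sum(1 for _ in pathlib.Path(data_dir).rglob("*-Data.db"))


def check(host: str) -> None:
    """Warn when PendingTasks dwarfs the real sstable count on this node."""
    with urllib.request.urlopen(MX4J_URL.format(host=host)) as resp:
        pending = pending_tasks(resp.read().decode())
    sstables = data_files()
    if pending > 10 * max(sstables, 1):
        print(f"{host}: suspicious PendingTasks={pending} "
              f"vs {sstables} sstables on disk")
```

With ~109 data files cluster-wide, a node reporting thousands of pending tasks would trip this check immediately, which matches the advice above: trust the sstable count on disk over the metric.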