Hi everyone,
TL;DR
Should LCS be changed to always prefer an STCS compaction in L0 when L0
is falling behind (assuming that STCS in L0 is enabled)?
Currently LCS seems to check for a possible L0->L1 compaction before
checking whether L0 is falling behind, and in our case that check used
15-30% of the CPU time in each compaction thread.
TL;DR
So first some background:
We have an Apache Cassandra 2.2 cluster running under high load. In that
cluster there is a table with a moderate number of writes per second
that is using LeveledCompactionStrategy. The test was to run repair on
that table while we monitored the cluster through JMC with Flight
Recordings enabled. This resulted in a large number of sstables for that
table, which I assume others have experienced as well. In this case I
think it was between 15k and 20k.
One thing we saw in the Flight Recording was that 15-30% of the CPU
time in each of the compaction threads was spent in
"getNextBackgroundTask()", which retrieves the next compaction job. On
further investigation, most of that time seems to go to checking for
overlap among L0 sstables before performing an L0->L1 compaction. There
is a JIRA ticket that seems to be related to this,
https://issues.apache.org/jira/browse/CASSANDRA-11571, which we
backported to 2.2 and tested. In our testing it seemed to improve the
situation, but the check was still using a noticeable amount of CPU.
My interpretation of the current LCS logic (if STCS in L0 is enabled) is:
1. Check each level (L1+).
   - If an L1+ compaction is needed, check whether L0 is behind and do
     STCS if that's the case; otherwise do the L1+ compaction.
2. Check for L0 -> L1 compactions and, if none is needed/possible,
   check for STCS in L0.
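As a rough illustration, the order above can be modeled like this in
Java. This is a simplified sketch of my interpretation, not the actual
Cassandra code: the class, enum and parameter names are all made up, and
the two suppliers stand in for the "is L0 far behind" test and the
L0 -> L1 candidate search (which includes the overlap check).

```java
import java.util.function.BooleanSupplier;

// Hypothetical model of the decision order described above; not Cassandra code.
final class CurrentOrderSketch {
    enum Task { HIGHER_LEVEL, L0_TO_L1, STCS_IN_L0, NONE }

    static Task nextTask(boolean higherLevelNeedsCompaction,
                         BooleanSupplier l0FarBehind,
                         BooleanSupplier l0ToL1Possible) {
        // Step 1: L1+ levels are inspected first.
        if (higherLevelNeedsCompaction) {
            // STCS in L0 is preferred over the L1+ compaction when L0 is behind.
            return l0FarBehind.getAsBoolean() ? Task.STCS_IN_L0 : Task.HIGHER_LEVEL;
        }
        // Step 2: the L0 -> L1 candidate search (with its overlap check)
        // runs before L0 is tested for being far behind.
        if (l0ToL1Possible.getAsBoolean()) {
            return Task.L0_TO_L1;
        }
        // Only when no L0 -> L1 compaction is possible is STCS in L0 considered.
        return l0FarBehind.getAsBoolean() ? Task.STCS_IN_L0 : Task.NONE;
    }

    public static void main(String[] args) {
        int[] overlapChecks = {0}; // counts how often the expensive check runs
        Task task = nextTask(false,
                             () -> true,
                             () -> { overlapChecks[0]++; return true; });
        System.out.println(task + ", overlap checks: " + overlapChecks[0]);
    }
}
```

In this model, when no L1+ compaction is needed, the overlap check is
paid for even though L0 is far behind, and the L0 -> L1 compaction is
picked instead of STCS.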
My proposal is to change this behavior to always check first whether L0
is far behind and do an STCS compaction in that case. This would avoid
the overlap check for L0 -> L1 compactions when L0 is behind, and I think
it makes sense since we already prefer STCS to L1+ compactions. This
would not solve the repair situation, but it would lower some of the
impact that repair has on LCS.
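A minimal Java sketch of the proposed order, under the same caveats: it
is an illustration only, not a patch against Cassandra, and every name
in it is hypothetical. The suppliers stand in for the "is L0 far behind"
test and the L0 -> L1 candidate search with its overlap check.

```java
import java.util.function.BooleanSupplier;

// Hypothetical model of the proposed decision order; not Cassandra code.
final class ProposedOrderSketch {
    enum Task { HIGHER_LEVEL, L0_TO_L1, STCS_IN_L0, NONE }

    static Task nextTask(boolean stcsInL0Enabled,
                         boolean higherLevelNeedsCompaction,
                         BooleanSupplier l0FarBehind,
                         BooleanSupplier l0ToL1Possible) {
        // Proposed change: test "L0 is far behind" before anything else.
        if (stcsInL0Enabled && l0FarBehind.getAsBoolean()) {
            return Task.STCS_IN_L0;
        }
        // L0 is not far behind (or STCS in L0 is disabled): L1+ levels next.
        if (higherLevelNeedsCompaction) {
            return Task.HIGHER_LEVEL;
        }
        // The overlap check only runs once L0 is known not to be far behind.
        return l0ToL1Possible.getAsBoolean() ? Task.L0_TO_L1 : Task.NONE;
    }

    public static void main(String[] args) {
        int[] overlapChecks = {0}; // counts how often the expensive check runs
        Task task = nextTask(true, false,
                             () -> true,
                             () -> { overlapChecks[0]++; return true; });
        System.out.println(task + ", overlap checks: " + overlapChecks[0]);
    }
}
```

With L0 far behind, this model returns the STCS task without ever
invoking the overlap check, which is the saving the proposal is after.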
As for which version this could go into, I think trunk would be enough,
since compaction is pluggable.
--
*MARCUS OLSSON *
Software Developer
*Ericsson*
Sweden
marcus.ols...@ericsson.com <mailto:marcus.ols...@ericsson.com>
www.ericsson.com <http://www.ericsson.com>