Ruoran Wang created CASSANDRA-11599:
---------------------------------------
Summary: When there are a large number of small repaired L0
sstables, compaction is very slow
Key: CASSANDRA-11599
URL: https://issues.apache.org/jira/browse/CASSANDRA-11599
Project: Cassandra
Issue Type: Bug
Environment: 2.1.13
Reporter: Ruoran Wang
This is on a 6-node 2.1.13 cluster with LeveledCompactionStrategy.

Initially, I found metrics were missing while heavy compaction was going on,
because WrappingCompactionStrategy was blocked.

Then I saw a case where compaction got stuck (progress moved dramatically
slowly). After an incremental repair there were 29k sstables, and I noticed
tons of them were only 200+ bytes, containing just 1 key. This, too, was
because WrappingCompactionStrategy was blocked.
My guess is that, with 8 compaction executors and tons of small repaired L0
sstables, the first thread is able to grab some (likely 32) sstables to
compact. If that task covers a large range of tokens, the other 7 threads
will iterate through the sstables in the meanwhile trying to find something
they can compact, but fail in the end, since their candidate sstables
intersect with what the first thread is already compacting. From a series of
thread dumps, I noticed the one thread doing actual work was always blocked
by the other 7 threads.
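Below is a hypothetical sketch (my own reconstruction, not Cassandra's actual
code) of the contention pattern I suspect: each executor scans the L0
candidates under the strategy's lock, but rejects any sstable whose token
range intersects an in-flight compaction, so with one wide-range task running
the other threads just iterate and give up, serialized on the lock.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Sketch only: Range and SSTable here are simplified stand-ins, not
// Cassandra's real classes.
final class CandidateScanSketch
{
    static List<SSTable> findCandidates(List<SSTable> level0, List<Range> compactingRanges)
    {
        List<SSTable> candidates = new ArrayList<>();
        for (SSTable sstable : level0)
        {
            boolean overlapsCompacting = false;
            for (Range r : compactingRanges)
            {
                if (r.intersects(sstable.range))
                {
                    overlapsCompacting = true;
                    break;
                }
            }
            if (!overlapsCompacting)
                candidates.add(sstable);
        }
        // Often empty when the running task spans most of the token range,
        // so the thread did all this scanning (under the lock) for nothing.
        return candidates;
    }

    static final class Range
    {
        final long left, right;
        Range(long left, long right) { this.left = left; this.right = right; }
        boolean intersects(Range other) { return left <= other.right && other.left <= right; }
    }

    static final class SSTable
    {
        final Range range;
        SSTable(Range range) { this.range = range; }
    }
}
{code}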
1. I tried splitting an incremental repair into 4 token ranges, which helped
keep the sstable count down. That seems to be working.
2. Another fix I tried is replacing ageSortedSSTables with a new method,
keyCountSortedSSTables, which returns the smallest sstables first (at
org/apache/cassandra/db/compaction/LeveledManifest.java:586). Since the first
32 candidates will then be very small sstables, the condition
'SSTableReader.getTotalBytes(candidates) > maxSSTableSizeInBytes' won't be
met, and the compaction will merge those 32 very small sstables. This helps
prevent the first compaction job from working on a set of sstables that
covers a wide token range; see the sketch after this list.
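Here is a rough sketch of the keyCountSortedSSTables idea (the SSTable type
and its estimatedKeys() accessor are simplified stand-ins, not the real
SSTableReader API):

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch only: sort L0 candidates by estimated key count, ascending, so the
// tiny post-repair sstables are grouped into the first compaction task and
// the total bytes of the chosen candidates stay under maxSSTableSizeInBytes.
final class KeyCountSortSketch
{
    static List<SSTable> keyCountSortedSSTables(Iterable<SSTable> candidates)
    {
        List<SSTable> sorted = new ArrayList<>();
        for (SSTable s : candidates)
            sorted.add(s);
        sorted.sort(Comparator.comparingLong(SSTable::estimatedKeys));
        return sorted;
    }

    static final class SSTable
    {
        private final long keys;
        SSTable(long keys) { this.keys = keys; }
        long estimatedKeys() { return keys; }
    }
}
{code}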
I can provide more info if needed.