[ https://issues.apache.org/jira/browse/CASSANDRA-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435388#comment-15435388 ]
Wei Deng commented on CASSANDRA-7272:
-------------------------------------
I've done some experiments regarding this feature on the latest trunk (which will
become the 3.10 branch), and here are a few observations that were not clear from
reading the comments in this JIRA, so I'm documenting them here in case anybody
else finds them useful later:
1. Once major compaction is triggered, it includes all existing SSTables from
*all* levels regardless of where they are, so you don't have to run the offline
{{sstablelevelreset}} tool first to manually send all SSTables to L0. If you
watch debug.log, you will see a massive log entry like "CompactionTask.java:153 -
Compacting xxxxx" that prints out all SSTables and their respective levels.
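Something like the following should show it (a sketch only; the ks1/t1 names
and the debug.log path are placeholders for whatever your environment uses):
{noformat}
# Trigger the major compaction on the LCS table (names are placeholders)
nodetool compact ks1 t1

# The single "Compacting ..." entry that lists every input SSTable and its level
grep "Compacting" /var/log/cassandra/debug.log | tail -1
{noformat}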
2. This major compaction is carried out by a single CompactionExecutor thread, so
it will likely saturate one CPU core. This means the faster your single-core CPU
performance is, the sooner the major compaction finishes. However, there is no
other parallelism whatsoever, as the same compaction thread has to write out all
SSTables in sequence so that they end up on non-overlapping token ranges.
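A quick way to confirm that only one task is running is to check
"nodetool compactionstats" while it runs (sketch below; nothing table-specific
assumed):
{noformat}
# Should show a single active compaction task while the major compaction runs
nodetool compactionstats
{noformat}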
3. Before you start a major compaction, you normally want to run "nodetool
disableautocompaction <ks> <cf> && nodetool stop COMPACTION" so that no minor
compaction is running for this table. Even with auto compaction disabled on the
table, you can still start a major compaction afterwards.
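Putting it together, the sequence looks roughly like this (a sketch; ks1/t1 are
placeholders):
{noformat}
nodetool disableautocompaction ks1 t1   # stop scheduling new minor compactions for the table
nodetool stop COMPACTION                # abort any minor compaction already in flight
nodetool compact ks1 t1                 # the major compaction can still be started afterwards
{noformat}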
4. After you trigger the major compaction, you can still use "nodetool stop
COMPACTION" to stop the running major compaction. And thanks to CASSANDRA-7066,
it will revert all SSTables back to their previous state, which is nice.
5. Until the major compaction has completely finished, neither "nodetool
cfstats" nor debug.log will properly reflect the intermediate results (i.e. none
of the newly created SSTables will be counted by cfstats or printed in
debug.log), so there can be a major discrepancy in the total SSTable count and
the number of SSTables in each level.
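So if you want to sanity-check the per-level layout, only do it after the major
compaction has fully finished, e.g. (a sketch; the table name is a placeholder):
{noformat}
# Per-level SSTable counts only reflect the new layout once the compaction is done
nodetool cfstats ks1.t1 | grep -E "SSTable count|SSTables in each level"
{noformat}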
6. When the major compaction finishes, you may see a debug.log entry showing
that all newly generated SSTables were compacted to L0. This is misleading,
because the major compaction actually takes an additional step, not reflected in
debug.log, to arrange them into the different L1+ levels. It follows the plan
"write 10 files in L1, 100 files in L2 etc, starting from the lowest token
(meaning L1 will not overlap at all with L2)". Since it's a simple metadata
change (e.g. updating the *-Statistics.db component), it happens very quickly.
Whether this approach will cause major write amplification and performance
problems afterwards, as mentioned by [~JiriHorky], still remains to be seen, and
I believe some more tests will need to be done to settle it.
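To make the leveling plan in point 6 a bit more concrete, here is a rough
back-of-the-envelope sketch, assuming the default 160MB LCS SSTable size and a
level fanout of 10; the 500GB figure is just an example, not from any real test:
{noformat}
# Rough illustration of the "10 files in L1, 100 in L2, ..." plan (assumed numbers)
total_mb=$((500 * 1024))      # pretend the table holds 500GB on this node
files=$((total_mb / 160))     # ~160MB per SSTable by default
level=1
cap=10
while [ "$files" -gt 0 ]; do
  in_level=$(( files < cap ? files : cap ))
  echo "L$level: $in_level files"
  files=$(( files - in_level ))
  level=$(( level + 1 ))
  cap=$(( cap * 10 ))
done
# Prints L1: 10, L2: 100, L3: 1000, L4: 2090 for this example
{noformat}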
> Add "Major" Compaction to LCS
> ------------------------------
>
> Key: CASSANDRA-7272
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7272
> Project: Cassandra
> Issue Type: Improvement
> Reporter: T Jake Luciani
> Assignee: Marcus Eriksson
> Priority: Minor
> Labels: compaction, docs-impacting, lcs
> Fix For: 2.2.0 beta 1
>
>
> LCS has a number of minor issues (maybe major depending on your perspective).
> LCS is primarily used for wide rows so for instance when you repair data in
> LCS you end up with a copy of an entire repaired row in L0. Over time if you
> repair you end up with multiple copies of a row in L0 - L5. This can make
> predicting disk usage confusing.
> Another issue is cleaning up tombstoned data. If a tombstone lives in level
> 1 and data for the cell lives in level 5 the data will not be reclaimed from
> disk until the tombstone reaches level 5.
> I propose we add a "major" compaction for LCS that forces consolidation of
> data to level 5 to address these.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)