[
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152067#comment-15152067
]
Branimir Lambov commented on CASSANDRA-7019:
--------------------------------------------
Uploaded a new version here:
|[code|https://github.com/blambov/cassandra/tree/7019-tryouts-no-deserialization]|[utest|http://cassci.datastax.com/job/blambov-7019-tryouts-no-deserialization-testall/]|[dtest|http://cassci.datastax.com/job/blambov-7019-tryouts-no-deserialization-dtest/]|
Changes:
- The option is now an enum with three values, NONE, ROW and CELL, which
control the level of deletions and overwrites that is examined.
- The ROW option behaves as in the previous version, but now uses an
implementation of the simple sstable iterator that only decodes tombstones and
row deletions, skipping over row content (for 3.0+ tables only). It also skips
sstables that do not have tombstones.
- The CELL option also examines row content to find overwritten or deleted
cells.
- The partition-level deletion is now handled properly (partially undoing
your change): it is _removed_ if superseded by the one from the tombstone
source. The latter is also used to filter the partition content.
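To make the three levels concrete, here is a toy Python model of the behaviour described above. It is only a sketch under stated assumptions, not the patch's code: {{OverlapMode}}, {{Row}}, {{TombstoneSource}} and {{filter_partition}} are hypothetical names, timestamps are plain integers, and "superseded" is assumed to mean "strictly newer".

```python
from dataclasses import dataclass, field
from enum import Enum

NO_DELETION = float("-inf")  # assumption: stands in for "no deletion present"


class OverlapMode(Enum):
    """Hypothetical mirror of the NONE / ROW / CELL option values."""
    NONE = 0
    ROW = 1
    CELL = 2


@dataclass
class Row:
    key: str
    cells: dict  # column name -> write timestamp


@dataclass
class TombstoneSource:
    """Hypothetical summary of what the overlapping sstables contain."""
    partition_deletion: float = NO_DELETION
    row_deletions: dict = field(default_factory=dict)  # row key -> deletion timestamp
    cell_writes: dict = field(default_factory=dict)    # (row key, column) -> newest timestamp


def filter_partition(rows, partition_deletion, source, mode):
    """Return (partition deletion to keep, surviving rows) for one partition."""
    if mode is OverlapMode.NONE:
        return partition_deletion, list(rows)

    # The partition-level deletion is removed if the source's one supersedes it.
    kept_deletion = partition_deletion
    if source.partition_deletion > partition_deletion:
        kept_deletion = NO_DELETION

    survivors = []
    for row in rows:
        # The source's partition and row deletions filter the partition content.
        cutoff = max(source.partition_deletion,
                     source.row_deletions.get(row.key, NO_DELETION))
        cells = {c: ts for c, ts in row.cells.items() if ts > cutoff}
        if mode is OverlapMode.CELL:
            # CELL additionally drops cells overwritten by a newer write elsewhere.
            cells = {c: ts for c, ts in cells.items()
                     if ts > source.cell_writes.get((row.key, c), NO_DELETION)}
        if cells:
            survivors.append(Row(row.key, cells))
    return kept_deletion, survivors
```

In this model ROW needs only the deletion timestamps from the overlapping sstables, while CELL also needs their cell timestamps, which is why CELL has to read row content.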
Minor additional changes:
- Data file references of the tombstone tables are now explicitly opened and
closed, only once.
- Fixes a bug in the {{hashCode}} calculation for {{BTreeRow}}, which always
produced a different value.
- Removes unnecessary sorting when finding the table for tombstone compaction.
- Adds more tests and fixes test failures.
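As an illustration of the property the {{hashCode}} fix restores (equal rows must hash equally), here is a small Python analogue. The comment above does not say what caused the bug, so this is only a sketch of the contract; {{SketchRow}} is a hypothetical stand-in and not {{BTreeRow}}.

```python
class SketchRow:
    """Hypothetical stand-in for a row: a clustering key plus its cells."""

    def __init__(self, clustering, cells):
        self.clustering = clustering
        # Store cells in a canonical, immutable order so that equality and
        # hashing do not depend on insertion order.
        self.cells = tuple(sorted(cells.items()))

    def __eq__(self, other):
        return (isinstance(other, SketchRow)
                and self.clustering == other.clustering
                and self.cells == other.cells)

    def __hash__(self):
        # Derive the hash from exactly the fields used by __eq__, never from
        # object identity or other per-instance state.
        return hash((self.clustering, self.cells))


a = SketchRow("k1", {"x": 1, "y": 2})
b = SketchRow("k1", {"y": 2, "x": 1})
assert a == b and hash(a) == hash(b)
assert len({a, b}) == 1  # equal rows collapse to a single set entry
```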
Performance run results:
{code}
{"provide_overlapping_tombstones":"CELL","class":"org.apache.cassandra.db.compaction.LeveledCompactionStrategy"}
CELL compactions completed in 6.364s
Operations completed in 394.591s, out of which 52.562 for ongoing NONE background compactions
At start: 9 tables 922541625 bytes 876088 rows 423530 deleted rows 42867 tombstone markers
At end: 9 tables 853445991 bytes 810249 rows 407096 deleted rows 41779 tombstone markers

{"provide_overlapping_tombstones":"ROW","class":"org.apache.cassandra.db.compaction.LeveledCompactionStrategy"}
ROW compactions completed in 6.577s
Operations completed in 408.181s, out of which 54.373 for ongoing NONE background compactions
At start: 9 tables 922539568 bytes 876088 rows 423530 deleted rows 42867 tombstone markers
At end: 9 tables 853446320 bytes 810249 rows 407096 deleted rows 41779 tombstone markers

{"provide_overlapping_tombstones":"NONE","class":"org.apache.cassandra.db.compaction.LeveledCompactionStrategy"}
NONE compactions completed in 6.415s
Operations completed in 402.645s, out of which 53.084 for ongoing NONE background compactions
At start: 9 tables 922534683 bytes 876088 rows 423530 deleted rows 42867 tombstone markers
At end: 9 tables 922531607 bytes 876088 rows 423530 deleted rows 42867 tombstone markers

{"max_threshold":"32","min_threshold":"4","provide_overlapping_tombstones":"CELL","class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"}
CELL compactions completed in 10.119s
Operations completed in 527.998s, out of which 18.164 for ongoing NONE background compactions
At start: 12 tables 1627719240 bytes 1549035 rows 551694 deleted rows 68948 tombstone markers
At end: 12 tables 853460582 bytes 835123 rows 407096 deleted rows 51964 tombstone markers

{"max_threshold":"32","min_threshold":"4","provide_overlapping_tombstones":"ROW","class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"}
ROW compactions completed in 8.299s
Operations completed in 519.072s, out of which 18.572 for ongoing NONE background compactions
At start: 12 tables 1627702075 bytes 1549035 rows 551694 deleted rows 68948 tombstone markers
At end: 12 tables 879153760 bytes 835123 rows 407096 deleted rows 51964 tombstone markers

{"max_threshold":"32","min_threshold":"4","provide_overlapping_tombstones":"NONE","class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"}
NONE compactions completed in 9.465s
Operations completed in 509.603s, out of which 18.052 for ongoing NONE background compactions
At start: 12 tables 1627710033 bytes 1549035 rows 551694 deleted rows 68948 tombstone markers
At end: 12 tables 1627706918 bytes 1549035 rows 551694 deleted rows 68948 tombstone markers
{code}
For size-tiered, ROW does most of the work in a much shorter time, but there
are certain to be scenarios where CELL helps more. The run doesn't appear to
be long enough to show the effects for leveled; I'll add validation and start
a longer one this evening.
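The comparison can be checked directly from the byte counts in the run output. The short Python script below is only a convenience for that arithmetic; the helper names are made up for this example, and the figures are copied from the results above.

```python
# (bytes at start, bytes at end) per strategy and option, copied from the runs.
RUNS = {
    ("LCS", "CELL"): (922541625, 853445991),
    ("LCS", "ROW"): (922539568, 853446320),
    ("LCS", "NONE"): (922534683, 922531607),
    ("STCS", "CELL"): (1627719240, 853460582),
    ("STCS", "ROW"): (1627702075, 879153760),
    ("STCS", "NONE"): (1627710033, 1627706918),
}


def reclaimed_fraction(strategy, option):
    """Fraction of the starting bytes that the run removed."""
    start, end = RUNS[(strategy, option)]
    return (start - end) / start


def row_share_of_cell(strategy):
    """Bytes reclaimed with ROW as a fraction of bytes reclaimed with CELL."""
    cell_start, cell_end = RUNS[(strategy, "CELL")]
    row_start, row_end = RUNS[(strategy, "ROW")]
    return (row_start - row_end) / (cell_start - cell_end)


if __name__ == "__main__":
    for (strategy, option) in RUNS:
        pct = 100 * reclaimed_fraction(strategy, option)
        print(f"{strategy:4} {option:4} reclaimed {pct:.2f}% of bytes")
    for strategy in ("LCS", "STCS"):
        share = 100 * row_share_of_cell(strategy)
        print(f"{strategy:4} ROW reclaims {share:.2f}% of what CELL reclaims")
```

On these figures, size-tiered ROW removes roughly 46% of the bytes against roughly 48% for CELL, i.e. about 97% of CELL's savings, while for leveled the two options are practically indistinguishable.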
Some of your points still remain:
- I haven't been able to do a cstar_perf test yet. Working on it.
- Single-table compactions still don't have this turned on by default -- need
to test and choose CELL/ROW, and also figure out whether scrub/upgrade/cleanup
etc. should be doing it.
> Improve tombstone compactions
> -----------------------------
>
> Key: CASSANDRA-7019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Marcus Eriksson
> Assignee: Branimir Lambov
> Labels: compaction
> Fix For: 3.x
>
>
> When there are no other compactions to do, we trigger a single-sstable
> compaction if there is more than X% droppable tombstones in the sstable.
> In this ticket we should try to include overlapping sstables in those
> compactions to be able to actually drop the tombstones. Might only be doable
> with LCS (with STCS we would probably end up including all sstables)