[ https://issues.apache.org/jira/browse/CASSANDRA-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286481#comment-14286481 ]
Tyler Hobbs commented on CASSANDRA-8635:
----------------------------------------
This definitely seems like it will help in many cases, but there are still a
couple of problematic scenarios:
* What if the cold sstables overlap with each other but not with any hot
sstables?
* What if they all overlap by 75% and thus fall below the threshold?
If we go with this strategy, it seems like we still need a safety mechanism to
ensure that the number of sstables never blows up, such as a cap on the maximum
number of sstables.
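To make the two scenarios and the safety cap concrete, here is a rough Python
sketch. It is not the attached patch and does not use Cassandra's real classes:
the SSTable model (a single integer token range plus a hot flag), the
overlap_fraction helper, the 0.8 threshold, and the max_sstables cap are all
assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    # Hypothetical model: one contiguous, inclusive token range per sstable.
    name: str
    first: int
    last: int
    hot: bool


def overlap_fraction(a, b):
    """Fraction of a's token range that is also covered by b."""
    shared = min(a.last, b.last) - max(a.first, b.first) + 1
    width = a.last - a.first + 1
    return max(shared, 0) / width


def compaction_candidates(sstables, overlap_threshold=0.8, max_sstables=32):
    """Pick sstables to consider for compaction (illustrative only).

    Hot sstables are always candidates. A cold sstable is a candidate only
    if it overlaps some hot sstable by at least overlap_threshold. As a
    safety valve, once the total count exceeds max_sstables, coldness is
    ignored and everything becomes a candidate.
    """
    if len(sstables) > max_sstables:
        return list(sstables)
    hot = [s for s in sstables if s.hot]
    cold = [s for s in sstables if not s.hot]
    kept = [
        c for c in cold
        if any(overlap_fraction(c, h) >= overlap_threshold for h in hot)
    ]
    return hot + kept


hot = SSTable("hot", 0, 99, True)
# Scenario 1: cold sstables that overlap each other but no hot sstable.
cold_a = SSTable("cold_a", 200, 299, False)
cold_b = SSTable("cold_b", 200, 299, False)
# Scenario 2: a cold sstable whose 75% overlap falls below the 0.8 threshold.
cold_c = SSTable("cold_c", 25, 124, False)

tables = [hot, cold_a, cold_b, cold_c]
without_cap = compaction_candidates(tables)
with_cap = compaction_candidates(tables, max_sstables=3)
```

Under these assumptions, without_cap contains only the hot sstable, so both
scenarios leave cold data uncompacted, while with_cap contains all four
sstables because the count exceeds the cap.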
After thinking about this for a while, I'm also tempted to scrap the whole
don't-compact-cold-sstables approach in STCS in favor of using DTCS. (Cold
sstable omission was implemented before DTCS was proposed.) Since STCS
somewhat randomly mixes
data within sstables, the only pattern that is likely to generate "cold"
sstables is when old data is read infrequently, and DTCS addresses many of
those patterns more effectively. Are there any opinions on this?
> STCS cold sstable omission does not handle overwrites without reads
> -------------------------------------------------------------------
>
> Key: CASSANDRA-8635
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8635
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Tyler Hobbs
> Assignee: Marcus Eriksson
> Priority: Critical
> Fix For: 2.1.3
>
> Attachments:
> 0001-Include-cold-sstables-in-compactions-if-they-overlap.patch
>
>
> In 2.1, STCS may omit cold SSTables from compaction (CASSANDRA-6109). If
> data is regularly overwritten or deleted (but not enough to trigger a
> single-sstable tombstone purging compaction), data size on disk may
> continuously grow if:
> * The table receives very few reads
> * The reads only touch the newest SSTables
> Basically, if the overwritten data is never read and there aren't many
> tombstones, STCS has no incentive to compact the sstables. We should take
> sstable overlap into consideration as well as coldness to address this case.