[jira] [Comment Edited] (CASSANDRA-9130) reduct default dtcs max_sstable_age

Jeff Jirsa (JIRA) Mon, 06 Jul 2015 13:41:49 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615585#comment-14615585
 ]


Jeff Jirsa edited comment on CASSANDRA-9130 at 7/6/15 8:40 PM:
---------------------------------------------------------------

This is also discussed in CASSANDRA-9644, but quoting [~krummas] from 
CASSANDRA-9666 (which, full disclosure, is my ticket about how I'd like to 
replace DTCS with an alternative that doesn't need max_sstable_age_days at all):

{quote}
- make it possible to put a cap on the size of the windows
{quote}

The problem is that windows grow larger as they get older (increasing the cost 
of re-compaction). If max_sstable_age_days is set low and fragmentation 
happens (and it will happen, sooner or later, to any real production cluster), 
the alternatives are either to raise max_sstable_age_days to allow compaction, 
or to eat the performance/memory hit from having tens of thousands of sstables 
(in 2.1.5/6, this was a near-instant OOM/crash).

When you raise max_sstable_age_days, you'll not only have to re-compact the 
smaller files, but you'll also re-compact neighboring files in the larger 
windows which had been shielded by max_sstable_age_days. The result is 1 to 
min_threshold larger sstables, which probably wasn't the intention of the 
operator who had configured the lower max_sstable_age_days.
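
To make the window growth concrete, here is a rough sketch. It is not Cassandra code. It assumes each window tier is min_threshold times the size of the previous one, starting from base_time_seconds, and it treats max_sstable_age_days as a simple cap on the largest window that can still be compacted into. Real DTCS window placement is more involved, and the function names and the 3600 / 4 / 14 / 365 values are only illustrative.

```python
SECONDS_PER_DAY = 86400


def window_sizes(base_time_seconds, min_threshold, tiers):
    """Window size (seconds) per tier, assuming geometric growth by min_threshold."""
    return [base_time_seconds * min_threshold ** i for i in range(tiers)]


def largest_eligible_window(base_time_seconds, min_threshold, max_sstable_age_days):
    """Largest window tier whose span still fits under the age cutoff.

    Simplification: the cutoff is modeled as a cap on window size,
    not on the age of individual sstables.
    """
    cutoff = max_sstable_age_days * SECONDS_PER_DAY
    size = base_time_seconds
    while size * min_threshold <= cutoff:
        size *= min_threshold
    return size


if __name__ == "__main__":
    low = largest_eligible_window(3600, 4, 14)
    high = largest_eligible_window(3600, 4, 365)
    print("tiers:", window_sizes(3600, 4, 8))
    print("largest window at 14 days: ", low, "seconds")
    print("largest window at 365 days:", high, "seconds")
    print("growth factor:", high // low)
```

Under these assumptions, going from 14 to 365 days exposes windows 16 times larger than anything that was compactable before, which is the re-compaction cost described above.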



was (Author: jjirsa):
This is also discussed in CASSANDRA-9644, but quoting [~krummas] from 
CASSANDRA-9666 (which, full disclosure, is my ticket about how I'd like to 
replace DTCS with an alternative that doesn't need max_sstable_age_days at all):

{quote}
- make it possible to put a cap on the size of the windows
{quote}

The problem is that windows grow larger as they get older (increasing the cost 
of re-compaction). If max_sstable_age_days is set low and fragmentation 
happens (and it will happen, sooner or later, to any real production cluster), 
the alternatives are either to raise max_sstable_age_days to allow compaction, 
or to eat the performance/memory hit from having tens of thousands of sstables 
(in 2.1.5/6, this was a near-instant OOM/crash). When you raise 
max_sstable_age_days, you'll PROBABLY run your server out of disk, because it's 
going to pick the biggest sstables and recompact them in parallel (likely 
using a sub-optimal selection strategy: CASSANDRA-9597, fixable, but as of 
July 2015 still a problem), more than doubling your usage as if you were doing 
an STCS major. Yes, we all know that the recommendation in STCS days was to 
stay below 50% disk capacity, but reaching that limit is about the time you'll 
start streaming, and that's when all hell will break loose with DTCS as it 
exists.

> reduct default dtcs max_sstable_age
> -----------------------------------
>
>                 Key: CASSANDRA-9130
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9130
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Marcus Eriksson
>            Priority: Minor
>             Fix For: 3.x, 2.1.x, 2.0.x
>
>
> Now that CASSANDRA-9056 is fixed it should be safe to reduce the default age 
> and increase performance correspondingly.  [~jshook] suggests that two weeks 
> may be appropriate, or we could make it dynamic based on gcgs (since that's 
> the window past which we should expect repair to not introduce fragmentation 
> anymore).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
