[jira] [Commented] (CASSANDRA-9130) reduct default dtcs max_sstable_age
[ https://issues.apache.org/jira/browse/CASSANDRA-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615546#comment-14615546 ]

Jonathan Shook commented on CASSANDRA-9130:
---

I'm not particularly concerned about the corner cases for lots of sstables, but it does need to be documented better. We do not yet have tools to manage re-compacting DTCS past max_sstable_age_days. Even if we did, it would not be an automatic win in every case. The operational trade-offs that come with different max_sstable_age_days are simply too stark to avoid.

I still believe that 365 is way too high. Studying the total bytes compacted over different DTCS settings and ingest rates can show the IO load. 365 is way beyond the point at which you start paying for more compaction than you need in most systems.

I do agree, though, about the boundary condition. We should have a safety in place to avoid max_sstable_age_days < table TTL until we can verify that a TTL-specific compaction pass will occur as needed. This might be a concern as well for per-write TTLs.

[~jjirsa] Is there a way that you would like to see the interplay between TTLs and max_sstable_age_days handled? Is there a solution which you would consider safe?

reduct default dtcs max_sstable_age
---

Key: CASSANDRA-9130
URL: https://issues.apache.org/jira/browse/CASSANDRA-9130
Project: Cassandra
Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
Fix For: 3.x, 2.1.x, 2.0.x

Now that CASSANDRA-9056 is fixed it should be safe to reduce the default age and increase performance correspondingly. [~jshook] suggests that two weeks may be appropriate, or we could make it dynamic based on gcgs (since that's the window past which we should expect repair to not introduce fragmentation anymore).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
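The boundary condition raised in the comment above (max_sstable_age_days set below the table TTL) could be checked with something like the sketch below. This is only an illustration, not anything Cassandra ships: the function name age_cap_covers_ttl and its parameters are hypothetical, and the check looks only at a table-level TTL in seconds, so it says nothing about per-write TTLs.

```python
SECONDS_PER_DAY = 86_400


def age_cap_covers_ttl(max_sstable_age_days: float, default_ttl_seconds: int) -> bool:
    """Return True when the DTCS age cap is at least as long as the table TTL.

    Hypothetical helper for illustration only. A TTL of 0 is treated as
    "no table-level TTL", in which case this check has nothing to compare
    against and returns True. Per-write TTLs are not visible here.
    """
    if max_sstable_age_days <= 0:
        raise ValueError("max_sstable_age_days must be positive")
    if default_ttl_seconds < 0:
        raise ValueError("default_ttl_seconds must not be negative")
    if default_ttl_seconds == 0:
        return True
    # Compare both values in seconds so fractional days are handled.
    return max_sstable_age_days * SECONDS_PER_DAY >= default_ttl_seconds


if __name__ == "__main__":
    thirty_days = 30 * SECONDS_PER_DAY
    # A two-week cap with a 30-day TTL is the case the comment warns about.
    print(age_cap_covers_ttl(14, thirty_days))
    print(age_cap_covers_ttl(365, thirty_days))
```

A check like this only flags the configuration; it does not address whether a TTL-specific compaction pass would actually run.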
[jira] [Commented] (CASSANDRA-9130) reduct default dtcs max_sstable_age
[ https://issues.apache.org/jira/browse/CASSANDRA-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615585#comment-14615585 ]

Jeff Jirsa commented on CASSANDRA-9130:
---

This is also discussed in CASSANDRA-9644, but quoting [~krummas] from CASSANDRA-9666 (which, full disclosure, is my ticket about how I'd like to replace DTCS with an alternative that doesn't need max_sstable_age_days at all):

{quote}
- make it possible to put a cap on the size of the windows
{quote}

The problem is that windows grow larger as they get older (increasing the cost of re-compaction). If max_sstable_age_days is set low, and fragmentation happens - and it will happen, sooner or later, to any real production cluster - the alternatives are either to raise max_sstable_age_days to allow compaction, or to eat the performance/memory hit from having tens of thousands of sstables (in 2.1.5/6, this was a near-instant OOM/crash).

When you raise max_sstable_age_days, you'll PROBABLY run your server out of disk, because it's going to pick the biggest sstables and recompact them in parallel, doubling your usage as if you were doing an STCS major. Yea, we all know that the recommendation in STCS days was to stay below 50% disk capacity, but reaching that limit is about the time you'll start streaming, and that's when all hell will break loose with DTCS as it exists.
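The disk-space concern in the comment above can be put into rough numbers. The sketch below is a back-of-the-envelope model under stated assumptions, not a description of how Cassandra schedules compactions: the helper names are hypothetical, it assumes the largest sstables are the ones picked, that nothing is purged during the rewrite, and that the inputs stay on disk until the outputs are complete.

```python
from typing import Sequence


def peak_disk_bytes(sstable_bytes: Sequence[int], largest_n: int) -> int:
    """Estimate peak disk usage while the largest_n sstables are recompacted.

    Worst-case assumption (hypothetical model): the rewritten output is as
    large as its inputs, and the inputs are only deleted once the output
    is fully written.
    """
    if largest_n < 0:
        raise ValueError("largest_n must not be negative")
    being_rewritten = sorted(sstable_bytes, reverse=True)[:largest_n]
    # Existing data plus a temporary second copy of whatever is being rewritten.
    return sum(sstable_bytes) + sum(being_rewritten)


def fits_on_disk(sstable_bytes: Sequence[int], largest_n: int, capacity_bytes: int) -> bool:
    """Return True if the estimated peak usage stays within the disk capacity."""
    return peak_disk_bytes(sstable_bytes, largest_n) <= capacity_bytes


if __name__ == "__main__":
    gb = 1024 ** 3
    sizes = [400 * gb, 300 * gb, 200 * gb, 100 * gb]
    # Rewriting everything at once doubles usage, as with an STCS major compaction.
    print(peak_disk_bytes(sizes, 4) // gb)
    print(fits_on_disk(sizes, 2, 1500 * gb))
```

Under this model a node that is already above 50% disk usage cannot recompact all of its data at once, which is the situation the comment describes.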
[jira] [Commented] (CASSANDRA-9130) reduct default dtcs max_sstable_age
[ https://issues.apache.org/jira/browse/CASSANDRA-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615085#comment-14615085 ]

Jeff Jirsa commented on CASSANDRA-9130:
---

I know this sounds obvious, but I'm repeating it for the record, because it's relevant: You will still introduce fragmentation during bootstrap, bulk load, and decom, potentially thousands or tens of thousands of sstables per node for decom. Lowering max_sstable_age_days below the table TTL is not necessarily safe, with or without CASSANDRA-9056.