[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433288#comment-15433288 ] Marcus Eriksson commented on CASSANDRA-6216: nice catch, committed, thanks > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Assignee: Dikang Gu >Priority: Minor > Labels: compaction, lcs > Fix For: 3.0.9, 3.10 > > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433157#comment-15433157 ] Dikang Gu commented on CASSANDRA-6216: -- Thanks [~krummas]! It's probably due to merge, I think you need to remove the calculatedLastCompactedKeys check as well. See my patch: https://github.com/DikangGu/cassandra/commit/2d1f3871b7ac76e0b7d0ff8e47aa69cda013b0dc > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Assignee: Dikang Gu >Priority: Minor > Labels: compaction, lcs > Fix For: 3.0.9, 3.10 > > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432238#comment-15432238 ] Marcus Eriksson commented on CASSANDRA-6216: this lgtm with a small change - we should call super.startup() after calculating the last compacted keys - otherwise (i think) we could start compactions before it has been calculated I did that and rebased on 3.0 [here|https://github.com/krummas/cassandra/commits/dikang/6216] tests: https://cassci.datastax.com/view/Dev/view/krummas/job/krummas-dikang-6216-testall/ https://cassci.datastax.com/view/Dev/view/krummas/job/krummas-dikang-6216-dtest/ will commit when tests have finished successfully > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Assignee: Dikang Gu >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431816#comment-15431816 ] Dikang Gu commented on CASSANDRA-6216: -- [~krummas], thanks for the review! Addressed your comments and here are the latest patch, https://github.com/DikangGu/cassandra/commit/928a9ea5621c1405c5ac07f4dc26c616206277cf. I also manually tested it, seems fine. > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Assignee: Dikang Gu >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430121#comment-15430121 ] Marcus Eriksson commented on CASSANDRA-6216: This looks good to me, just one small comment Instead of calculating it in LeveledManifest#getCompactionCandidates - I think we could override AbstractCompactionStrategy#startup in LeveledCompactionStrategy - at this point we should know all the sstables and can do the calculation > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Assignee: Dikang Gu >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429283#comment-15429283 ] Dikang Gu commented on CASSANDRA-6216: -- [~krummas] cool, I think it's a more easy-to-understand idea, and maybe could save a lot of code. Here is the patch https://github.com/DikangGu/cassandra/commit/a63c8583ada93187b177c68dbc4443c29717e44e, can you please take a look? I'm not sure how to write the unit test at this moment, will try some manual testing. > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429243#comment-15429243 ] Marcus Eriksson commented on CASSANDRA-6216: If we want to start compaction in level n, maybe we could find the newest (by creation or modification time) file in level n+1 and use its last token when finding the start point in level n? > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429086#comment-15429086 ] Dikang Gu commented on CASSANDRA-6216: -- I'd like to work on this, I'll see if I can store the last compacted keys in CFMetaData, sounds a correct direction? > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428769#comment-15428769 ] sankalp kohli commented on CASSANDRA-6216: -- I did not fix it..not sure if someone else has fixed it > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428713#comment-15428713 ] Wei Deng commented on CASSANDRA-6216: - Isn't this already implemented so this should be closed with a proper fixVer number? > Level Compaction should persist last compacted key per level > > > Key: CASSANDRA-6216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 > Project: Cassandra > Issue Type: Sub-task >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Minor > Labels: compaction, lcs > Attachments: JIRA-6216.diff > > > Level compaction does not persist the last compacted key per level. This is > important for higher levels. > The sstables with higher token and in higher levels wont get a chance to > compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850971#comment-13850971 ] Jonathan Ellis commented on CASSANDRA-6216: --- We should be going by size-on-disk, not sstable count. They do tend to correspond but it's common to get leftover sstables smaller than the max, which can be meaningful as max gets larger. We can also get slightly more sophisticated and count bytes that would be written after accounting for expired tombstones. (See {{findDroppableSSTable}} for example of using tombstone stats.) Also, this is actually backwards -- we want the *least* overlapping, since that means we spend less time rewriting data that doesn't change (less write amplification). See improved compaction here: http://hackingdistributed.com/2013/06/17/hyperleveldb/, although I think they did overcomplicate the solution (as I mentioned in my comment on that page). Level Compaction should persist last compacted key per level Key: CASSANDRA-6216 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: sankalp kohli Priority: Minor Attachments: JIRA-6216.diff Level compaction does not persist the last compacted key per level. This is important for higher levels. The sstables with higher token and in higher levels wont get a chance to compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850982#comment-13850982 ] Jonathan Ellis commented on CASSANDRA-6216: --- That is, we should prefer smallest write amplification where wa = (overlapped size-minus-tombstones + candidate size-minus-tombstones) / candidate size-minus-tombstones Level Compaction should persist last compacted key per level Key: CASSANDRA-6216 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: sankalp kohli Priority: Minor Attachments: JIRA-6216.diff Level compaction does not persist the last compacted key per level. This is important for higher levels. The sstables with higher token and in higher levels wont get a chance to compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804863#comment-13804863 ] Jonathan Ellis commented on CASSANDRA-6216: --- bq. should we not pick those sstables which have the largest overlapping with sstables in the higer level. +1. IIRC HyperDex does this in their leveldb fork. bq. this might cause starvation to some sstables as they wont get a chance to compact. Well, if they're worse candidates, I'm okay with starving them. :) (We will force compaction of an sstable that has too many expired tombstones already, so tombstones living forever shouldn't be a problem.) Level Compaction should persist last compacted key per level Key: CASSANDRA-6216 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: sankalp kohli Priority: Minor Level compaction does not persist the last compacted key per level. This is important for higher levels. The sstables with higher token and in higher levels wont get a chance to compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13798675#comment-13798675 ] sankalp kohli commented on CASSANDRA-6216: -- Instead of sorting all sstables in a level by token and iterating over them, should we not pick those sstables which have the largest overlapping with sstables in the higer level. Though this might cause starvation to some sstables as they wont get a chance to compact. May be a mix of both will work better. Level Compaction should persist last compacted key per level Key: CASSANDRA-6216 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Priority: Minor Level compaction does not persist the last compacted key per level. This is important for higher levels. The sstables with higher token and in higher levels wont get a chance to compact as the last compacted key will get reset after a restart. -- This message was sent by Atlassian JIRA (v6.1#6144)