[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850971#comment-13850971
 ] 

Jonathan Ellis commented on CASSANDRA-6216:
-------------------------------------------

We should be going by size-on-disk, not sstable count.  They do tend to 
correspond but it's common to get "leftover" sstables smaller than the max, 
which can be meaningful as max gets larger.  We can also get slightly more 
sophisticated and count "bytes that would be written after accounting for 
expired tombstones."  (See {{findDroppableSSTable}} for example of using 
tombstone stats.)
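
A rough sketch of the two size measures described above, in Java. The {{SSTableStats}} class and its {{droppableTombstoneRatio}} field are hypothetical stand-ins for illustration, not Cassandra's actual sstable metadata API, and the ratio is assumed to be a fraction of on-disk bytes that a compaction could drop.

```java
import java.util.List;

// Hypothetical per-sstable stats holder; not Cassandra's SSTableReader.
final class SSTableStats {
    final long onDiskBytes;
    // Assumption: fraction in [0, 1] of this sstable's bytes that are expired tombstones.
    final double droppableTombstoneRatio;

    SSTableStats(long onDiskBytes, double droppableTombstoneRatio) {
        this.onDiskBytes = onDiskBytes;
        this.droppableTombstoneRatio = droppableTombstoneRatio;
    }
}

final class CompactionSizeEstimate {
    // Plain size-on-disk: what the comment proposes instead of sstable count.
    static long totalOnDiskBytes(List<SSTableStats> sstables) {
        long total = 0;
        for (SSTableStats s : sstables) {
            total += s.onDiskBytes;
        }
        return total;
    }

    // Slightly more sophisticated: bytes expected to be written once
    // expired tombstones are dropped.
    static long estimatedBytesWritten(List<SSTableStats> sstables) {
        long total = 0;
        for (SSTableStats s : sstables) {
            total += Math.round(s.onDiskBytes * (1.0 - s.droppableTombstoneRatio));
        }
        return total;
    }
}
```

Either measure lets a small "leftover" sstable count for less than a full-size one, which a plain sstable count cannot do.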

Also, this is actually backwards -- we want the *least* overlapping, since that 
means we spend less time rewriting data that doesn't change (less write 
amplification).  See "improved compaction" here: 
http://hackingdistributed.com/2013/06/17/hyperleveldb/, although I think they 
did overcomplicate the solution (as I mentioned in my comment on that page).
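
To make "least overlapping" concrete, here is a minimal Java sketch. It assumes tokens can be modeled as {{long}} values and that each sstable is summarized by a hypothetical {{TokenRange}} (first token, last token, size in bytes); none of these names come from the Cassandra code base or the attached patch.

```java
import java.util.List;

// Hypothetical summary of one sstable: inclusive token bounds plus size on disk.
final class TokenRange {
    final long first;
    final long last;
    final long bytes;

    TokenRange(long first, long last, long bytes) {
        this.first = first;
        this.last = last;
        this.bytes = bytes;
    }
}

final class LeastOverlapPicker {
    // Bytes in the next level whose token range intersects the candidate's.
    // These are the bytes that would be rewritten if the candidate is compacted.
    static long overlappingBytes(TokenRange candidate, List<TokenRange> nextLevel) {
        long total = 0;
        for (TokenRange r : nextLevel) {
            if (r.first <= candidate.last && candidate.first <= r.last) {
                total += r.bytes;
            }
        }
        return total;
    }

    // Pick the candidate that overlaps the fewest next-level bytes.
    // Returns null when there are no candidates; ties go to the earliest candidate.
    static TokenRange pick(List<TokenRange> candidates, List<TokenRange> nextLevel) {
        TokenRange best = null;
        long bestOverlap = Long.MAX_VALUE;
        for (TokenRange c : candidates) {
            long overlap = overlappingBytes(c, nextLevel);
            if (best == null || overlap < bestOverlap) {
                best = c;
                bestOverlap = overlap;
            }
        }
        return best;
    }
}
```

Choosing the smallest overlap keeps the amount of unchanged next-level data that gets rewritten low, which is the write-amplification argument made above.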

> Level Compaction should persist last compacted key per level
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-6216
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: sankalp kohli
>            Assignee: sankalp kohli
>            Priority: Minor
>         Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher tokens in higher levels won't get a chance to 
> compact, as the last compacted key is reset after a restart.



