[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226708#comment-14226708
 ] 

Marcus Eriksson commented on CASSANDRA-8301:
--------------------------------------------

Cool, what is your heuristic for finding the level?

I thought a bit about it and figured that we could probably estimate the level 
by ordering sstables by the number of other sstables they overlap, then putting 
the ones that overlap the most in the lowest levels.

I.e., an sstable in L1 is bound to overlap ~10 in L2, ~100 in L3, etc., meaning 
it would overlap ~110 sstables if we only have 3 levels. An sstable in L2 would 
overlap ~10 in L3 and only one in L1, 11 in total, and sstables in the top 
level would only overlap one in L2 and one in L1, 2 in total. This assumes L0 
was empty when bootstrapping, which is most often wrong, and I haven't given 
much thought to how to fix that.
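
A rough sketch of that estimate in Python, in case it helps the discussion. 
Everything here is an assumption for illustration: sstables are modelled as 
inclusive (first_token, last_token) pairs, the fanout is 10, L0 is empty, and 
the function names are made up. Instead of strictly sorting by overlap count, 
it picks the level whose expected overlap count is closest to the observed one, 
which is a simple variant of the same idea. It does not check that the sstables 
within a level end up non-overlapping.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (first_token, last_token) per sstable.
Range = Tuple[int, int]


def expected_overlaps(level: int, num_levels: int, fanout: int = 10) -> int:
    """Rough overlap count for an sstable in `level` (1-based), L0 assumed empty."""
    # ~fanout sstables in the next level, ~fanout^2 in the one after, and so on.
    higher = sum(fanout ** (j - level) for j in range(level + 1, num_levels + 1))
    # About one overlapping sstable in each lower level.
    lower = level - 1
    return higher + lower


def count_overlaps(ranges: List[Range]) -> List[int]:
    """For each sstable, count how many other sstables its range intersects (O(n^2))."""
    counts = []
    for i, (lo, hi) in enumerate(ranges):
        n = sum(
            1
            for j, (other_lo, other_hi) in enumerate(ranges)
            if j != i and lo <= other_hi and other_lo <= hi
        )
        counts.append(n)
    return counts


def estimate_levels(ranges: List[Range], num_levels: int, fanout: int = 10) -> List[int]:
    """Assign each sstable the level whose expected overlap count is nearest."""
    expected = {
        lvl: expected_overlaps(lvl, num_levels, fanout)
        for lvl in range(1, num_levels + 1)
    }
    return [
        min(expected, key=lambda lvl: abs(expected[lvl] - c))
        for c in count_overlaps(ranges)
    ]
```

With three levels, expected_overlaps gives 110, 11 and 2 for L1, L2 and L3, 
matching the numbers above. Sstables that overlap many others land in low 
levels, and ones that overlap few land in high levels.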

> Create a tool that given a bunch of sstables creates a "decent" sstable 
> leveling
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8301
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>
> In old versions of Cassandra (i.e. not trunk/3.0), when bootstrapping a new 
> node, you will end up with a ton of files in L0 and it might be extremely 
> painful to get LCS to compact into a new leveling.
> We could probably exploit the fact that we have many non-overlapping sstables 
> in L0, and offline-bump those sstables into higher levels. It does not need 
> to be perfect, just get the majority of the data into L1+ without creating 
> overlaps.
> So, suggestion is to create an offline tool that looks at the range each 
> sstable covers and tries to bump it as high as possible in the leveling.
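
One possible reading of the suggestion in the description above, as a minimal 
Python sketch. The representation of an sstable as an inclusive 
(first_token, last_token) pair, the max_level default and the function names 
are all assumptions for illustration. It greedily puts each sstable in the 
highest level where it overlaps nothing already placed there and leaves it in 
L0 otherwise. It ignores the per-level size limits that LCS enforces, so it 
only illustrates the "no overlaps within a level" part.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: inclusive (first_token, last_token) per sstable.
Range = Tuple[int, int]


def ranges_overlap(a: Range, b: Range) -> bool:
    """True if the two inclusive token ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def bump_levels(ranges: List[Range], max_level: int = 3) -> Dict[int, List[Range]]:
    """Greedily place each sstable as high as possible without overlaps."""
    levels: Dict[int, List[Range]] = {lvl: [] for lvl in range(max_level + 1)}
    for r in sorted(ranges):
        # Try the highest level first, then work downwards to L1.
        for lvl in range(max_level, 0, -1):
            if not any(ranges_overlap(r, other) for other in levels[lvl]):
                levels[lvl].append(r)
                break
        else:
            # Overlaps something in every level from L1 up: leave it in L0.
            levels[0].append(r)
    return levels
```

Anything that cannot be placed stays in L0 and is left for normal compaction, 
in line with the "does not need to be perfect" goal.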



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
