[jira] [Comment Edited] (CASSANDRA-8301) Create a tool that given a bunch of sstables creates a "decent" sstable leveling

Nikolai Grigoriev (JIRA) Wed, 26 Nov 2014 12:06:14 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226744#comment-14226744
 ]


Nikolai Grigoriev edited comment on CASSANDRA-8301 at 11/26/14 8:04 PM:
------------------------------------------------------------------------

The logic I have built is very simple, and it probably has some fundamental 
flaws :)

First I calculate the target size for each level (in bytes) needed to 
accommodate all my data, i.e. to distribute the total size of all my sstables. 
This also gives me the maximum level to target. Then I take all sstables for 
the given CF and sort them by the beginning (left) of their bounds. I start 
from the highest level (L4 in my example) and iterate over that list. I grab 
the first sstable, remember its bounds, and assign it to the current level. 
Then I skip to the next sstable that does not intersect with these bounds, 
assign it to the current level and update the bounds. I continue until the 
end of the list or until I have used all the size available at that level. 
Then I move to the next lower level and repeat the process on the remaining 
sstables, and so on. The remainder goes to L0, where overlaps are allowed 
(right?).
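
To make the steps above concrete, here is a minimal Python sketch of that 
greedy pass. It is not the actual tool: the SSTable fields, the function 
names, the base level size and the fanout of 10 between levels are all 
assumptions made for illustration, and an sstable that does not fit in the 
remaining budget of a level is simply left for a lower level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    # Hypothetical stand-in for sstable metadata; not a Cassandra class.
    name: str
    first: int  # left bound (token), inclusive
    last: int   # right bound (token), inclusive
    size: int   # size on disk in bytes


def level_targets(total_bytes, base_bytes, fanout=10):
    """Return {level: capacity_bytes} for L1..Lmax so that the summed
    capacity covers total_bytes. base_bytes and fanout are assumptions."""
    targets = {}
    level, capacity, covered = 1, base_bytes, 0
    while True:
        targets[level] = capacity
        covered += capacity
        if covered >= total_bytes:
            return targets
        level += 1
        capacity *= fanout


def assign_levels(sstables, base_bytes, fanout=10):
    """Greedily assign non-overlapping sstables to levels, highest first."""
    targets = level_targets(sum(s.size for s in sstables), base_bytes, fanout)
    remaining = sorted(sstables, key=lambda s: s.first)
    assignment = {}
    for level in sorted(targets, reverse=True):
        budget = targets[level]
        last_end = None  # right bound of the last sstable put in this level
        leftover = []
        for s in remaining:
            disjoint = last_end is None or s.first > last_end
            if disjoint and s.size <= budget:
                assignment[s.name] = level
                budget -= s.size
                last_end = s.last
            else:
                leftover.append(s)  # retry on a lower level
        remaining = leftover
    for s in remaining:
        assignment[s.name] = 0  # overlaps are allowed in L0
    return assignment


tables = [
    SSTable("a", 0, 10, 100),
    SSTable("b", 5, 15, 100),
    SSTable("c", 11, 20, 100),
    SSTable("d", 21, 30, 100),
    SSTable("e", 6, 12, 100),
]
print(assign_levels(tables, base_bytes=200))
```

Because the list is sorted by left bound and a candidate is accepted only 
when its left bound lies past the right bound of the previously accepted 
sstable, no two sstables placed in the same level (L1 and above) overlap.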

I also had to come up with some logic to exclude the sstables that cover a 
large range of tokens. Most likely these are the ones that were recently 
written at L0 on the original node - they cover whatever was recently written 
into them, right? I exclude those from my logic and leave them in L0.
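
One possible way to express that exclusion, as a sketch only: the comment 
does not say how "large" is decided, so the max_fraction threshold and the 
idea of measuring each sstable against the overall token span seen in the 
input are assumptions, as are the function and parameter names.

```python
def split_wide(sstables, max_fraction=0.5):
    """Split (name, first_token, last_token) tuples into (narrow, wide).

    An sstable counts as "wide" when its range exceeds max_fraction of the
    span between the smallest and largest token seen. The threshold is a
    hypothetical knob, not something taken from the original comment.
    """
    if not sstables:
        return [], []
    lo = min(first for _, first, _ in sstables)
    hi = max(last for _, _, last in sstables)
    span = max(hi - lo, 1)  # avoid division by zero for a single token
    narrow, wide = [], []
    for entry in sstables:
        _, first, last = entry
        if (last - first) / span > max_fraction:
            wide.append(entry)    # leave in L0
        else:
            narrow.append(entry)  # candidate for L1 and above
    return narrow, wide


narrow, wide = split_wide([("a", 0, 10), ("b", 11, 20), ("c", 0, 100)])
print(narrow, wide)
```

Only the narrow list would then be fed to the leveling pass; the wide 
sstables stay in L0.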

Or did I get it completely wrong?



> Create a tool that given a bunch of sstables creates a "decent" sstable 
> leveling
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8301
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>
> In old versions of Cassandra (i.e. not trunk/3.0), when bootstrapping a new 
> node, you will end up with a ton of files in L0 and it might be extremely 
> painful to get LCS to compact into a new leveling.
> We could probably exploit the fact that we have many non-overlapping sstables 
> in L0, and offline-bump those sstables into higher levels. It does not need 
> to be perfect, just get the majority of the data into L1+ without creating 
> overlaps.
> So, the suggestion is to create an offline tool that looks at the range each 
> sstable covers and tries to bump it as high as possible in the leveling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
