[ 
https://issues.apache.org/jira/browse/CASSANDRA-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226614#comment-14226614
 ] 

Nikolai Grigoriev commented on CASSANDRA-8301:
----------------------------------------------

I have written a simple (very ugly :) ) prototype of such a tool. I am very 
interested in it because I suffer from exactly this problem: without such a 
tool I simply cannot bootstrap a node. I have tried, and the node *never* 
recovers. 

So, anyway, I have tried my prototype on a freshly bootstrapped node and it 
seems to be working. Instead of the initial 7.5K pending compactions I got 
only about 600; a few hours later that is down to ~450 and still dropping. 
cfstats also looks quite good (to me ;) ):

{code}
SSTable count: 6311
SSTables in each level: [571/4, 10, 80, 1411/1000, 4239, 0, 0, 0, 0]
{code}
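
To make that line easier to read, here is a small sketch (not part of the prototype) that parses it. It assumes, as I understand the cfstats output, that an entry written as "count/max" marks a level holding more sstables than its target. The function name parse_levels is made up for illustration.

```python
def parse_levels(line):
    """Parse a cfstats 'SSTables in each level' line into (count, limit) pairs.

    limit is None when the entry has no '/max' suffix. Reading '/max' as
    "this level is over its target" is an assumption, not confirmed here.
    """
    body = line.split("[", 1)[1].rsplit("]", 1)[0]
    levels = []
    for entry in body.split(","):
        entry = entry.strip()
        if "/" in entry:
            count, limit = entry.split("/")
            levels.append((int(count), int(limit)))
        else:
            levels.append((int(entry), None))
    return levels


line = "SSTables in each level: [571/4, 10, 80, 1411/1000, 4239, 0, 0, 0, 0]"
levels = parse_levels(line)
total = sum(count for count, _ in levels)
over_target = [i for i, (_, limit) in enumerate(levels) if limit is not None]
print(total, over_target)
```

The per-level counts add up to the reported SSTable count of 6311, and the entries carrying a "/max" suffix are L0 and L3.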

I do have some sstables at L0 because the node is taking normal (heavy) traffic 
at the same time. But this number is already down from the original ~700.

I could try to make the prototype tool less ugly and submit it here, if you 
do not mind.

> Create a tool that given a bunch of sstables creates a "decent" sstable 
> leveling
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8301
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8301
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>
> In old versions of Cassandra (i.e. not trunk/3.0), when bootstrapping a new 
> node, you will end up with a ton of files in L0, and it might be extremely 
> painful to get LCS to compact into a new leveling.
> We could probably exploit the fact that we have many non-overlapping sstables 
> in L0, and offline-bump those sstables into higher levels. It does not need 
> to be perfect, just get the majority of the data into L1+ without creating 
> overlaps.
> So, the suggestion is to create an offline tool that looks at the range each 
> sstable covers and tries to bump it as high as possible in the leveling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
