[ 
https://issues.apache.org/jira/browse/CASSANDRA-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003149#comment-14003149
 ] 

Marcus Eriksson commented on CASSANDRA-6851:
--------------------------------------------

The idea is that we can anticompact several SSTables at the same time. If a 
key X is in all included files, it has been repaired in all of those files and 
can go into the same repaired output file. (Re-reading my description, I see 
that it was not very clear; sorry about that.)

I think what is needed is to pass in more sstables when creating the compaction 
scanners instead of doing Arrays.asList(sstable);
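
As a rough illustration of the idea (not Cassandra's actual code), the sketch 
below models each sstable as a sorted map from a long token to a value and 
anticompacts a whole group at once into one repaired and one unrepaired 
output. The class AnticompactionSketch, the method anticompact, the 
start-exclusive/end-inclusive repaired range, and the "later input wins" 
merge are all assumptions made for the example; real scanners and cell 
reconciliation are far more involved.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model: an "sstable" is just a sorted map of token -> value.
class AnticompactionSketch {

    /** Outputs of one anticompaction pass over a group of sstables. */
    static final class Result {
        final SortedMap<Long, String> repaired = new TreeMap<>();
        final SortedMap<Long, String> unrepaired = new TreeMap<>();
    }

    /**
     * Anticompacts all given sstables together. Tokens in (left, right] are
     * treated as repaired; everything else is unrepaired. No matter how many
     * inputs are passed in, at most two outputs are produced.
     */
    static Result anticompact(List<SortedMap<Long, String>> sstables,
                              long left, long right) {
        Result result = new Result();
        for (SortedMap<Long, String> sstable : sstables) {
            for (Map.Entry<Long, String> entry : sstable.entrySet()) {
                long token = entry.getKey();
                boolean isRepaired = token > left && token <= right;
                SortedMap<Long, String> target =
                        isRepaired ? result.repaired : result.unrepaired;
                // Stand-in for real reconciliation: a later input overwrites
                // an earlier one when both contain the same token.
                target.put(token, entry.getValue());
            }
        }
        return result;
    }
}
```

Passing a single-element list reproduces the one-sstable-at-a-time behaviour; 
passing two or more is what keeps the output count from growing.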

Perhaps a first step could be to always give it two sstables, so that we at 
least don't increase the number of sstables by doing anticompaction? I guess we 
could experiment a bit here and find a heuristic that is good for most cases.
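
A minimal sketch of the "always give it two sstables" first step, assuming 
the candidates are simply taken in list order. The class 
AnticompactionGrouping and the method groupsOf are hypothetical names, and 
the group size is a parameter so that other heuristics can be tried.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the sstables to anticompact into fixed-size groups.
class AnticompactionGrouping {

    /**
     * Returns consecutive groups of at most groupSize elements. The last
     * group may be smaller when the input size is not a multiple of groupSize.
     */
    static <T> List<List<T>> groupsOf(List<T> sstables, int groupSize) {
        if (groupSize < 1) {
            throw new IllegalArgumentException("groupSize must be >= 1");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int i = 0; i < sstables.size(); i += groupSize) {
            int end = Math.min(i + groupSize, sstables.size());
            // Copy the view so each group is independent of the input list.
            groups.add(new ArrayList<>(sstables.subList(i, end)));
        }
        return groups;
    }
}
```

With a group size of two, each group of two inputs yields at most two 
outputs (one repaired, one unrepaired), so the total only grows when an odd 
sstable is left over in a group of its own.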

> Improve anticompaction after incremental repair
> -----------------------------------------------
>
>                 Key: CASSANDRA-6851
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6851
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Marcus Eriksson
>            Assignee: Russell Alexander Spitzer
>            Priority: Minor
>              Labels: compaction, lhf
>             Fix For: 2.1.1
>
>
> After an incremental repair we iterate over all sstables and split each of 
> them into two parts, one containing the repaired data and one the unrepaired. 
> We could in theory double the number of sstables on a node.
> To avoid this we could make anticompaction also do a compaction. For example, 
> if we are to anticompact 10 sstables, we could anticompact those to 2.
> Note that we need to avoid creating sstables that are too big, though: if we 
> anticompact all sstables on a node, it would essentially be a major compaction.
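
One way to respect the size concern in the description above is to cap the 
total input size of each group. The following is only a sketch under that 
assumption: the class SizeCappedGrouping, the method group, and the 
maxGroupBytes threshold are hypothetical, and sstables are represented by 
their on-disk sizes only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: greedy grouping of sstable sizes under a byte cap.
class SizeCappedGrouping {

    /**
     * Walks the sizes in order and starts a new group whenever adding the
     * next sstable would push the current group over maxGroupBytes. An
     * sstable larger than the cap ends up in a group of its own.
     */
    static List<List<Long>> group(List<Long> sizes, long maxGroupBytes) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : sizes) {
            if (!current.isEmpty() && currentBytes + size > maxGroupBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

A small cap keeps outputs small but leaves more groups (and so more 
sstables); a cap larger than the node's total data collapses everything into 
one group, which is the major-compaction case the description warns about.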



--
This message was sent by Atlassian JIRA
(v6.2#6252)
