[ 
https://issues.apache.org/jira/browse/CASSANDRA-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405773#comment-15405773
 ] 

Marcus Eriksson commented on CASSANDRA-10643:
---------------------------------------------

* Should we cancel ongoing compactions before running this? Otherwise it might 
be hard to find the sstables in a non-compacting state. We could probably 
select and mark the sstables inside 
{{ColumnFamilyStore#runWithCompactionsDisabled}}.
* In {{CompactionManager}} - {{submitOnSSTables}} and {{submitTask}} could be 
private, and we should probably reuse these methods in other places in that 
class (if not, the new methods should probably be folded into a single 
method)
* In {{ColumnFamilyStore#sstablesInBounds}}:
** use a {{Set<SSTR>}} instead of a {{List<SSTR>}} for the sstables to return - 
otherwise we can get the same sstable multiple times.
** use {{View.sstablesInBounds}} instead of {{data.getView().sstablesInBounds}}
** {{sstablesInBounds(tokenRange.left.minKeyBound(), 
tokenRange.right.maxKeyBound(), tree)}} - we should not pass a {{Range<>}} to 
this method if we want to look up sstables between minKeyBound and 
maxKeyBound, since {{Range}} is (left, right]. We should probably use 
{{Bounds}}, which is [left, right], instead.
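
To make the two {{sstablesInBounds}} points concrete, here is a minimal sketch of the difference between a (left, right] interval and a [left, right] interval, and of why a Set avoids returning the same sstable twice. The class and method names ({{IntervalSketch}}, {{inRange}}, {{inBounds}}, {{dedupe}}) are hypothetical stand-ins using plain long tokens and string names; they are not Cassandra's {{Range}}, {{Bounds}}, or {{SSTableReader}}, and wrap-around ranges are ignored.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Cassandra's Range/Bounds classes.
final class IntervalSketch {
    /** Half-open (left, right]: the left endpoint itself is excluded. */
    static boolean inRange(long left, long right, long point) {
        return point > left && point <= right;
    }

    /** Closed [left, right]: both endpoints are included. */
    static boolean inBounds(long left, long right, long point) {
        return point >= left && point <= right;
    }

    /** Merge the hits of several lookups; a Set drops repeats a List would keep. */
    static Set<String> dedupe(List<List<String>> lookups) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> hits : lookups) {
            result.addAll(hits);
        }
        return result;
    }

    public static void main(String[] args) {
        long left = 3948291562518219268L;
        long right = 3948291562518219269L;
        System.out.println(inRange(left, right, left));  // false: left is excluded
        System.out.println(inBounds(left, right, left)); // true: left is included
    }
}
```

With the half-open form, an sstable whose only overlap is at the left token would be missed, which is the concern behind suggesting {{Bounds}}.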

> Implement compaction for a specific token range
> -----------------------------------------------
>
>                 Key: CASSANDRA-10643
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10643
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Vishy Kasar
>            Assignee: Vishy Kasar
>              Labels: lcs
>         Attachments: 10643-trunk-REV01.txt, 10643-trunk-REV02.txt
>
>
> We see repeated cases in production (using LCS) where a small number of users 
> generate a large number of repeated updates or tombstones. Reading the data of 
> such users brings large amounts of data into the Java process. Apart from the 
> read itself being slow for the user, the excessive GC affects other users as 
> well. Our solution so far is to move from LCS to STCS and back. This takes a 
> long time and is overkill if the number of outliers is small. For such cases, 
> we can implement point compaction of a token range: nodetool compact takes a 
> starting and ending token and compacts all the SSTables that fall within that 
> range. We can refuse to compact if the number of sstables is beyond a 
> max_limit.
> Example: 
> nodetool -st 3948291562518219268 -et 3948291562518219269 compact keyspace 
> table
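
The selection step described in the quoted issue could look roughly like the sketch below: pick every sstable whose token span overlaps the requested start and end tokens, and refuse when the count exceeds a limit. Everything here is an assumption made for illustration ({{TokenRangeSelection}}, {{Table}}, {{select}}, and the {{maxLimit}} parameter are hypothetical and are not taken from the attached patches or from Cassandra's code), and the overlap test treats both ends as inclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the code in the attached patches.
final class TokenRangeSelection {
    /** Assumed summary of one sstable: a name plus the first and last token it holds. */
    static final class Table {
        final String name;
        final long first;
        final long last;

        Table(String name, long first, long last) {
            this.name = name;
            this.first = first;
            this.last = last;
        }
    }

    /**
     * Returns the names of all tables overlapping [start, end] (inclusive),
     * or throws if more than maxLimit tables would be compacted.
     */
    static List<String> select(List<Table> tables, long start, long end, int maxLimit) {
        List<String> picked = new ArrayList<>();
        for (Table t : tables) {
            // Two closed intervals overlap when each starts before the other ends.
            if (t.first <= end && t.last >= start) {
                picked.add(t.name);
            }
        }
        if (picked.size() > maxLimit) {
            throw new IllegalStateException(
                "refusing to compact " + picked.size() + " sstables, limit is " + maxLimit);
        }
        return picked;
    }
}
```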



