[ 
https://issues.apache.org/jira/browse/CASSANDRA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-607:
-------------------------------

    Attachment: 607-compaction-iterator-predicate.diff

As I mentioned on IRC, rather than using read repair for smaller differences 
between nodes, I wanted to make anti-compaction more efficient for the use case 
of small ranges, and continue to optimize that repair path.

The attached patch adds a Predicate that CompactionIterator will apply to the 
key of each input IteratingRow. Rows that aren't interesting are not added to 
the set to be reduced. When an IteratingRow is not read, the backing Scanner 
will seek to the end of the row, so we save a ton of deserialization overhead.
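
To illustrate the idea, here is a minimal sketch of key-based filtering, not the code from the attached diff. LazyRow and filterRows are hypothetical stand-ins for IteratingRow and the filtering step in CompactionIterator, and java.util.function.Predicate is assumed in place of whatever Predicate type the patch actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

public class PredicateFilterSketch {
    /** Hypothetical stand-in for a row whose columns are deserialized only on demand. */
    static final class LazyRow {
        final String key;
        boolean deserialized = false;

        LazyRow(String key) { this.key = key; }

        /** Simulates the expensive step: deserializing the row's columns. */
        String read() {
            deserialized = true;
            return "columns-of-" + key;
        }

        /** A real scanner would seek past the row's bytes here; the sketch does nothing. */
        void skip() { }
    }

    /** Reads only the rows whose key satisfies the predicate and skips the rest. */
    static List<String> filterRows(Iterator<LazyRow> rows, Predicate<String> interesting) {
        List<String> kept = new ArrayList<>();
        while (rows.hasNext()) {
            LazyRow row = rows.next();
            if (interesting.test(row.key)) {
                kept.add(row.read());   // pay the deserialization cost
            } else {
                row.skip();             // never deserialized
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<LazyRow> rows = Arrays.asList(new LazyRow("a"), new LazyRow("m"), new LazyRow("z"));
        // Hypothetical range predicate: keep keys in ["k", "p").
        Predicate<String> inRange = k -> k.compareTo("k") >= 0 && k.compareTo("p") < 0;
        System.out.println(filterRows(rows.iterator(), inRange)); // prints [columns-of-m]
    }
}
```

The point of the sketch is that the predicate only ever sees the key, so a rejected row costs a seek rather than a full deserialization.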

In very quick tests, this patch reduced the time to anti-compact out 0.3% of 
data from 96 seconds to 30 seconds. A next step might be to modify 
CompactionIterator to apply the Predicate to the SSTable index, and only scan 
certain ranges.
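
A rough sketch of what that next step could look like, under the assumption that the index can be treated as a sorted map from row key to file offset. The real SSTable index format is not modeled here, and IndexRangeSketch and offsetsInRange are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class IndexRangeSketch {
    /** Returns the file offsets of rows whose keys fall in [start, end). */
    static List<Long> offsetsInRange(NavigableMap<String, Long> index, String start, String end) {
        // subMap is a view over the sorted index, so rows outside the range are never visited.
        return new ArrayList<>(index.subMap(start, true, end, false).values());
    }

    public static void main(String[] args) {
        // Hypothetical index: row key -> offset of the row in the data file.
        NavigableMap<String, Long> index = new TreeMap<>();
        index.put("a", 0L);
        index.put("m", 4096L);
        index.put("z", 8192L);
        System.out.println(offsetsInRange(index, "k", "p")); // prints [4096]
    }
}
```

With the offsets in hand, a scanner would only need to seek to the matching rows instead of walking every row and skipping most of them.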

> implement repair-via-rangecommand
> ---------------------------------
>
>                 Key: CASSANDRA-607
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-607
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
>
>         Attachments: 607-compaction-iterator-predicate.diff
>
>
> this is more lightweight than repair-via-streaming and can be used for 
> smaller repairs

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
