[ https://issues.apache.org/jira/browse/CASSANDRA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stu Hood updated CASSANDRA-607:
-------------------------------
Attachment: 607-compaction-iterator-predicate.diff
As I mentioned on IRC, rather than using read repair for smaller differences
between nodes, I wanted to make anti-compaction more efficient for the use case
of small ranges, and to continue optimizing that repair path.
The attached patch adds a Predicate that CompactionIterator applies to the
key of each input IteratingRow. Rows whose keys do not satisfy the Predicate
are not added to the set to be reduced. When an IteratingRow is not read, the
backing Scanner seeks to the end of the row, so we skip deserializing it and
save a ton of overhead.
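To illustrate the idea only, here is a minimal, self-contained sketch of key-based filtering ahead of deserialization. It is not the code in the attached diff: LazyRow, reduceInteresting and the wasRead flag are hypothetical stand-ins for IteratingRow and the reduce step, and java.util.function.Predicate is assumed in place of whatever Predicate type the patch uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration; not Cassandra's actual classes.
class PredicateFilterSketch {
    /** A row whose key is cheap to inspect but whose body is only decoded on demand. */
    static final class LazyRow {
        final String key;
        private final String serialized;
        boolean wasRead = false; // records whether the body was ever deserialized

        LazyRow(String key, String serialized) {
            this.key = key;
            this.serialized = serialized;
        }

        /** Simulates the expensive deserialization of the row body. */
        String read() {
            wasRead = true;
            return serialized;
        }
    }

    /** Applies the predicate to each key; only matching rows are read and collected. */
    static List<String> reduceInteresting(List<LazyRow> rows, Predicate<String> keyPredicate) {
        List<String> reduced = new ArrayList<>();
        for (LazyRow row : rows) {
            if (!keyPredicate.test(row.key)) {
                continue; // skipped: the body is never deserialized
            }
            reduced.add(row.key + "=" + row.read());
        }
        return reduced;
    }

    public static void main(String[] args) {
        List<LazyRow> rows = Arrays.asList(
                new LazyRow("apple", "1"),
                new LazyRow("kiwi", "2"),
                new LazyRow("zebra", "3"));
        // Assumed example range: keys in ["h", "m").
        Predicate<String> inRange = k -> k.compareTo("h") >= 0 && k.compareTo("m") < 0;
        System.out.println(reduceInteresting(rows, inRange));
    }
}
```

In this sketch the saving comes from never calling read() on rows outside the range; in the patch the equivalent saving comes from the Scanner seeking past the unread row.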
In very quick tests, this patch reduced the time to anti-compact out 0.3% of
the data from 96 seconds to 30 seconds. A next step might be to modify
CompactionIterator to apply the Predicate to the SSTable index, and only scan
certain ranges.
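As a rough sketch of what that next step could look like, the example below consults a sorted key-to-offset index and returns only the offsets worth scanning. Everything here is an assumption for illustration: IndexRangeSketch and offsetsToScan are hypothetical names, the index is modeled as a TreeMap ordered by plain string keys, and the real SSTable index layout and ordering may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of an index lookup; not Cassandra's SSTable index format.
class IndexRangeSketch {
    /** Returns the data-file offsets of rows whose keys fall in [start, end). */
    static List<Long> offsetsToScan(NavigableMap<String, Long> index, String start, String end) {
        // subMap is a view over the sorted index, so rows outside the range are never visited.
        return new ArrayList<>(index.subMap(start, true, end, false).values());
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> index = new TreeMap<>();
        index.put("apple", 0L);
        index.put("kiwi", 128L);
        index.put("lemon", 256L);
        index.put("zebra", 512L);
        System.out.println(offsetsToScan(index, "h", "m"));
    }
}
```

Compared with filtering row by row, a range lookup like this would avoid even touching the rows outside the range, at the cost of restricting the Predicate to something expressible as key ranges.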
> implement repair-via-rangecommand
> ---------------------------------
>
> Key: CASSANDRA-607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-607
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Stu Hood
> Fix For: 0.5
>
> Attachments: 607-compaction-iterator-predicate.diff
>
>
> this is more lightweight than repair-via-streaming and can be used for
> smaller repairs