[ https://issues.apache.org/jira/browse/CASSANDRA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-6933:
--------------------------------------
Attachment: 6933-v3.txt
I agree that in the best case this is a good optimization; I'm just not
convinced that real-world use cases will much resemble the best case.
In particular, in CollationController the container will be guaranteed to only
have columns the filter is looking for, so we expect to have a lot of
sequential "runs" of matches when compaction is working well. On the other
hand, once we've found "most" matches and are looking for the last handful,
there's no particular reason to expect that these last ones will be evenly
distributed across the container space. (Sure, they will be "on average," but
the variance is high enough to make that useless as a guideline.)
v3 removes the range heuristic and fixes incrementing i on a hit.
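
For readers following along, here is a minimal sketch of the general idea
without a range heuristic: keep the position reached by the previous lookup
and binary-search only the remaining tail, stepping past the match on a hit.
This is an illustration only, not the code in 6933-v3.txt. The class name
ResumingSearch, the method next() and the List-backed container are all
hypothetical, and the sketch assumes keys are requested in ascending order.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the patch attached to this ticket.
final class ResumingSearch<T>
{
    private final List<T> sorted;            // container contents, ascending
    private final Comparator<? super T> cmp;
    private int i = 0;                       // first index not yet ruled out

    ResumingSearch(List<T> sorted, Comparator<? super T> cmp)
    {
        this.sorted = sorted;
        this.cmp = cmp;
    }

    // Keys must be requested in ascending order.
    // Returns the matching element, or null if the key is absent.
    T next(T key)
    {
        int lo = i, hi = sorted.size() - 1;
        while (lo <= hi)
        {
            int mid = (lo + hi) >>> 1;
            int c = cmp.compare(sorted.get(mid), key);
            if (c < 0)
                lo = mid + 1;
            else if (c > 0)
                hi = mid - 1;
            else
            {
                i = mid + 1;                 // on a hit, step past the match
                return sorted.get(mid);
            }
        }
        i = lo;                              // on a miss, resume at the insertion point
        return null;
    }
}
```

Each call searches only the elements after the previous result, so a run of
adjacent matches costs a short binary search over a shrinking tail rather
than a search of the whole container.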
> Optimise Read Comparison Costs in collectTimeOrderedData
> --------------------------------------------------------
>
> Key: CASSANDRA-6933
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6933
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Priority: Minor
> Labels: performance
> Fix For: 2.1
>
> Attachments: 6933-v3.txt
>
>
> Introduce a new SearchIterator construct, which can be obtained from a
> ColumnFamily, which permits efficiently iterating a subset of the cells in
> ascending order. Essentially, it saves the previously visited position and
> searches from there, but also tries to avoid searching the whole remaining
> space if possible.