[jira] [Updated] (CASSANDRA-6933) Optimise Read Comparison Costs in collectTimeOrderedData

Jonathan Ellis (JIRA) Wed, 02 Apr 2014 14:20:18 -0700

     [ https://issues.apache.org/jira/browse/CASSANDRA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jonathan Ellis updated CASSANDRA-6933:
--------------------------------------

    Attachment: 6933-v3.txt

I agree that in the best case this is a good optimization; I'm just not 
convinced that real-world use cases will much resemble the best case.  
In particular, in CollationController the container is guaranteed to hold only 
the columns the filter is looking for, so we expect a lot of sequential "runs" 
of matches when compaction is working well.  On the other hand, once we've 
found "most" matches and are looking for the last handful, there's no 
particular reason to expect those last ones to be evenly distributed across 
the container space.  (Sure, they will be "on average," but the variance is 
high enough to make that useless as a guideline.)

v3 removes the range heuristic and fixes incrementing i on a hit.
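
For illustration only, here is a minimal sketch of a search that remembers its 
position, uses a plain binary search over the remainder (no range heuristic), 
and steps past a match on a hit.  It is not the code in 6933-v3.txt: the class 
name ResumingSearch, the String[] container, and the method next are all 
assumptions made up for this example.

```java
import java.util.Arrays;

/** Hypothetical sketch; not taken from the attached patch. */
final class ResumingSearch {
    private final String[] sorted; // container contents, in ascending order
    private int i = 0;             // first index not yet ruled out

    ResumingSearch(String[] sorted) {
        this.sorted = sorted;
    }

    /**
     * Returns the index of name, or -1 if it is absent.
     * Callers must request names in ascending order.
     */
    int next(String name) {
        // Search only the part of the array at or after the saved position.
        int pos = Arrays.binarySearch(sorted, i, sorted.length, name);
        if (pos >= 0) {
            i = pos + 1;  // hit: the next search starts just past the match
            return pos;
        }
        i = -pos - 1;     // miss: resume from the insertion point
        return -1;
    }
}
```

With this shape a sequential "run" of matches costs one binary search per 
lookup over an ever-shrinking suffix, and nothing is assumed about how the 
remaining matches are distributed.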

> Optimise Read Comparison Costs in collectTimeOrderedData
> --------------------------------------------------------
>
>                 Key: CASSANDRA-6933
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6933
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1
>
>         Attachments: 6933-v3.txt
>
>
> Introduce a new SearchIterator construct, obtainable from a ColumnFamily, 
> that permits efficient iteration over a subset of the cells in ascending 
> order. Essentially, it saves the previously visited position and searches 
> from there, but also tries to avoid searching the whole remaining space if 
> possible.
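
The description does not say how the search avoids covering the whole 
remaining space.  One common technique with that property is galloping 
(exponential) search, sketched below under stated assumptions: the class name 
GallopingSearch, the int[] container, and the method next are hypothetical 
and are not claimed to match the SearchIterator in the patch.

```java
import java.util.Arrays;

/** Hypothetical sketch of a galloping search that resumes from its last position. */
final class GallopingSearch {
    private final int[] sorted; // container contents, in ascending order
    private int from = 0;       // saved position: everything before it is ruled out

    GallopingSearch(int[] sorted) {
        this.sorted = sorted;
    }

    /** Returns the index of key, or -1. Keys must be requested in ascending order. */
    int next(int key) {
        int lo = from;
        int hi = from;
        int step = 1;
        // Probe at doubling distances until an element >= key is found,
        // so a nearby match is located without touching the far end.
        while (hi < sorted.length && sorted[hi] < key) {
            lo = hi + 1;   // everything up to hi is smaller than key
            hi += step;
            step <<= 1;
        }
        int end = Math.min(hi + 1, sorted.length);
        // Finish with a binary search inside the bracketed window [lo, end).
        int pos = Arrays.binarySearch(sorted, lo, end, key);
        if (pos >= 0) {
            from = pos + 1;
            return pos;
        }
        from = -pos - 1;
        return -1;
    }
}
```

The cost of each lookup then grows with the distance to the next match rather 
than with the size of the remaining container, which helps most when matches 
are close together.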



--
This message was sent by Atlassian JIRA
(v6.2#6252)
