[
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116055#comment-16116055
]
Jay Zhuang commented on CASSANDRA-9988:
---------------------------------------
Here are the test results to search target on the different index with the
following search algorithms:
1. binary search
2. exponential search
3. binary search optimized for next adjacent value
!9988-3tests.png|tests!
Here are my takes from the results:
h4. 1. binary search:
Pros:
* best on average and evenly distributed;
* Simple code, use JAVA native API.
Cons:
* not optimized to search for the adjacent value.
h4. 2. exponential search:
Pros:
* optimized to search the first and second value for {{Dir.ASC}}.
Cons:
* very bad average performance: compare to binarySearch it has only half the
throughput. {{Dir.DESC}} will have the same performance degradation and we do
use DESC for some places;
* complicate code, hard to maintain: For example, there's a bug in above
code, it should be {{while (prob_ub <= ub && (c =
comparator.compare(keys\[probe_ub], key)) < 0)}} it only impacts the tree size
{{2^n - 1}} and search for the last element, it's tricky to find out.
I think we could make the algorithm work for {{Dir.DESC}} too, but it will make
the code even more complicated.
h4. 3. binary search optimized for next adjacent value
Pros:
* Optimized to search next adjacent value, works for both {{Dir.ASC}} and
{{Dir.DESC}}
Cons:
* Average performance is a litter bit worse than binarySearch
I'm not familiar with the whole cassandra code base, so not sure how often we
search for the adjacent value. Personally, I would still prefer binarySearch:
* consistency with other cassandra code, plus it's the current behave right now
for btree search;
* smaller mean deviations, could be good for tp99;
* clean code.
But I'm fine with any solutions:
| 1. binary search |
[9988-trunk|https://github.com/cooldoger/cassandra/tree/9988-trunk] |
| 2. exponential search |
[9988-trunk-exp|https://github.com/cooldoger/cassandra/tree/9988-trunk-exp] |
| 3. binary search optimized|
[9988-trunk-x|https://github.com/cooldoger/cassandra/tree/9988-trunk-x] |
> Introduce leaf-only iterator
> ----------------------------
>
> Key: CASSANDRA-9988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Benedict
> Assignee: Jay Zhuang
> Priority: Minor
> Labels: patch
> Fix For: 4.0
>
> Attachments: 9988-3tests.png, 9988-data.png, 9988-result2.png,
> 9988-result3.png, 9988-result.png, 9988-test-result3.png,
> 9988-test-result-expsearch.xlsx, 9988-test-result-raw.png,
> 9988-test-result.xlsx, 9988-trunk-new.txt, 9988-trunk-new-update.txt,
> trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf
> page. In this case it _may_ be more efficient to specialise our iterator.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]