[
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106073#comment-16106073
]
Benedict commented on CASSANDRA-9988:
-------------------------------------
Assuming your implementation is correct, here is a modified version that works
{code}
int lb = forwards ? nextPos : lowerBound; // inclusive
int ub = forwards ? upperBound : nextPos; // inclusive
int bound = 0;
int c = -1;
while (lb + bound < ub && (c = comparator.compare(keys[lb + bound],
key)) < 0)
bound = Math.max(bound * 2, 1);
if (c == 0) { return lb + bound; } // note, by checking == 0, we do
not need to include it in bsearch, so we do not need to increment ub
return Arrays.binarySearch(keys, lb + (bound/2), Math.min(lb + bound,
ub), key, comparator);
{code}
This implementation as a whole could be cleaned up a little, and there are a
variety of ways to bit-twiddle to get {{bound == 0}} on first iteration, but
this one is at least very clear. A "faster" variant would be:
{code}
while (lb + (bound / 2) < ub && (c = comparator.compare(keys[lb +
(bound / 2)], key)) < 0)
bound *= 2;
{code}
A faster-still variant would just offset lb by -1 at the start, so there are no
recurring costs, and restore it after the loop. But better than this would be
something like:
{code}
int lb = forwards ? nextPos : lowerBound; // inclusive
int ub = forwards ? upperBound : nextPos; // inclusive
int c = -1;
int probe_ub = lb;
int probe_incr = 1;
while (probe_ub < ub && (c = comparator.compare(keys[probe_ub], key))
< 0) {
lb = probe_ub;
probe_ub += probe_incr;
probe_incr *= 2;
}
if (c == 0) { return probe_ub; }
return Arrays.binarySearch(keys, lb, probe_ub, key, comparator);
{code}
better still
I haven't looked at the codebase in a long time, and don't have it checked out,
so haven't consulted the branch iterator version, and don't actually recall
precisely what algorithm we used. I don't recall it being a particularly
elegant approach, though. In fact, it's possible it only compares the first
element and then resorts to binary search; I do not recall for sure if we
carried exponential search over from the ArrayBackedColumns (or whatever it
used to be called).
I will note the exponential search as it stands here probably does not behave
very well for reverse iteration.
Also, apologies for against-style-guide variable names. But this comment box
is too poor an editor to revise it any further.
> Introduce leaf-only iterator
> ----------------------------
>
> Key: CASSANDRA-9988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Benedict
> Assignee: Jay Zhuang
> Priority: Minor
> Labels: patch
> Fix For: 4.0
>
> Attachments: 9988-data.png, 9988-result2.png, 9988-result3.png,
> 9988-result.png, 9988-test-result3.png, 9988-test-result-expsearch.xlsx,
> 9988-test-result-raw.png, 9988-test-result.xlsx, 9988-trunk-new.txt,
> 9988-trunk-new-update.txt, trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf
> page. In this case it _may_ be more efficient to specialise our iterator.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]