[ https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106073#comment-16106073 ]
Benedict commented on CASSANDRA-9988: ------------------------------------- Assuming your implementation is correct, here is a modified version that works {code} int lb = forwards ? nextPos : lowerBound; // inclusive int ub = forwards ? upperBound : nextPos; // inclusive int bound = 0; int c = -1; while (lb + bound < ub && (c = comparator.compare(keys[lb + bound], key)) < 0) bound = Math.max(bound * 2, 1); if (c == 0) { return lb + bound; } // note, by checking == 0, we do not need to include it in bsearch, so we do not need to increment ub return Arrays.binarySearch(keys, lb + (bound/2), Math.min(lb + bound, ub), key, comparator); {code} This implementation as a whole could be cleaned up a little, and there are a variety of ways to bit-twiddle to get {{bound == 0}} on first iteration, but this one is at least very clear. A "faster" variant would be: {code} while (lb + (bound / 2) < ub && (c = comparator.compare(keys[lb + (bound / 2)], key)) < 0) bound *= 2; {code} A faster-still variant would just offset lb by -1 at the start, so there are no recurring costs, and restore it after the loop. But better than this would be something like: {code} int lb = forwards ? nextPos : lowerBound; // inclusive int ub = forwards ? upperBound : nextPos; // inclusive int c = -1; int probe_ub = lb; int probe_incr = 1; while (probe_ub < ub && (c = comparator.compare(keys[probe_ub], key)) < 0) { lb = probe_ub; probe_ub += probe_incr; probe_incr *= 2; } if (c == 0) { return probe_ub; } return Arrays.binarySearch(keys, lb, probe_ub, key, comparator); {code} better still I haven't looked at the codebase in a long time, and don't have it checked out, so haven't consulted the branch iterator version, and don't actually recall precisely what algorithm we used. I don't recall it being a particularly elegant approach, though. In fact, it's possible it only compares the first element and then resorts to binary search; I do not recall for sure if we carried exponential search over from the ArrayBackedColumns (or whatever it used to be called). I will note the exponential search as it stands here probably does not behave very well for reverse iteration. Also, apologies for against-style-guide variable names. But this comment box is too poor an editor to revise it any further. > Introduce leaf-only iterator > ---------------------------- > > Key: CASSANDRA-9988 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9988 > Project: Cassandra > Issue Type: Sub-task > Reporter: Benedict > Assignee: Jay Zhuang > Priority: Minor > Labels: patch > Fix For: 4.0 > > Attachments: 9988-data.png, 9988-result2.png, 9988-result3.png, > 9988-result.png, 9988-test-result3.png, 9988-test-result-expsearch.xlsx, > 9988-test-result-raw.png, 9988-test-result.xlsx, 9988-trunk-new.txt, > 9988-trunk-new-update.txt, trunk-9988.txt > > > In many cases we have small btrees, small enough to fit in a single leaf > page. In this case it _may_ be more efficient to specialise our iterator. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org