[ https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118100#comment-16118100 ]
Benedict edited comment on CASSANDRA-9988 at 8/8/17 12:27 PM: -------------------------------------------------------------- bq. The original implementation is valid. I think that's another example shows the code is not that easy to understand. Mea culpa, although I think that's more an indictment of my hubris and our mutual disinterest. If fifteen distracted minutes in a JIRA comment window procrastinating from real work yields a minor bug, it's probably not *that* complicated, and an IDE, test suite and a couple of hours would probably more than conquer it. bq. The graph is the search performance (throughput) for searching target on the different positions. Except that this isn't the same, because the search domain is different with different starting positions. This is also over-counting unrelated test setup costs. One would expect that, with a leaf size of 32, iterating the whole contents would be _approx._ 4x faster with exp. search than binary search (and 5x faster for a single search from the beginning); the fact it is only 3x is probably down to those extra costs. The worst inversion of performance would be around 2x (i.e. exp. search being 2x slower than b.search at its worst), with logarithmically spaced items - i.e. at position 16, 8, 4, 2, 1. However, in the slowest case for exp. search, we are doing much less work than the slowest for binary search. Of course, with a limit to 32 items the exp. search benefits are much less apparent. If we had a leaf size of 64 it would become much more stark. With a leaf size of 16, it would also become completely irrelevant. was (Author: benedict): bq. The original implementation is valid. I think that's another example shows the code is not that easy to understand. Mea culpa, although I think that's more an indictment of my hubris and our mutual disinterest. If fifteen distracted minutes in a JIRA comment window procrastinating from real work yields a minor bug, it's probably not *that* complicated, and an IDE, test suite and a couple of hours would probably more than conquer it. bq. The graph is the search performance (throughput) for searching target on the different positions. Except that this isn't the same, because the search domain is different with different starting positions. This is also over-counting unrelated test setup costs. One would expect that, with a leaf size of 32, iterating the whole contents would be _approx._ 4x faster with exp. search than binary search; the fact it is only 3x is probably down to those extra costs. The worst inversion of performance would be around 2x (i.e. exp. search being 2x slower than b.search at its worst), with logarithmically spaced items - i.e. at position 16, 8, 4, 2, 1. However, in the slowest case for exp. search, we are doing much less work than the slowest for binary search. Of course, with a limit to 32 items the exp. search benefits are much less apparent. If we had a leaf size of 64 it would become much more stark. With a leaf size of 16, it would also become completely irrelevant. > Introduce leaf-only iterator > ---------------------------- > > Key: CASSANDRA-9988 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9988 > Project: Cassandra > Issue Type: Sub-task > Reporter: Benedict > Assignee: Jay Zhuang > Priority: Minor > Labels: patch > Fix For: 4.0 > > Attachments: 9988-3tests.png, 9988-data.png, 9988-result2.png, > 9988-result3.png, 9988-result.png, 9988-test-result3.png, > 9988-test-result-expsearch.xlsx, 9988-test-result-raw.png, > 9988-test-result.xlsx, 9988-trunk-new.txt, 9988-trunk-new-update.txt, > trunk-9988.txt > > > In many cases we have small btrees, small enough to fit in a single leaf > page. In this case it _may_ be more efficient to specialise our iterator. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org