[jira] [Comment Edited] (CASSANDRA-9988) Introduce leaf-only iterator

Jason Brown (JIRA) Wed, 08 Nov 2017 16:16:52 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244848#comment-16244848
 ]


Jason Brown edited comment on CASSANDRA-9988 at 11/9/17 12:15 AM:
------------------------------------------------------------------

Overall, I think the patch is pretty solid. A couple of comments and nits:

- is {{BTreeSearchIterator}} necessary anymore? It's just an empty interface now
- just to confirm, {{FullBTreeSearchIterator}} is just a rename of the previous 
{{BTreeSearchIterator}} class? My diff'ing seems to confirm that, but I'd like 
the extra level of sanity checking.

nits:
- {{BTreeSet}} switch imports back to original format
- in {{LeafBTreeSearchIterator}}, move {{#compareToFirst}} and {{#searchNext}} 
closer to the methods from where they are called. Or better yet, since each is 
only called from one place, just inline the functionality into those calling 
methods.

My biggest problem is with the microbench. I know this ticket has gone through 
a bunch on contortions over it's life, so maybe a little confusion is 
reasonable. However, what I was really hoping to see was a comparison of the 
leaf vs full (tree) iterators, as that's what this whole ticket is about. I 
took liberty of reworking the mircobench to explicitly test the differences 
bewtween leaf and tree iterators, as well as reduce of benchmark executions 
(due to every value between 0 and 31, inclusive, being a param to the 
{{targetIdx}} field. (I simply removed about 2/3 of those values, as I think we 
only really care about testing reading from the beginning, middle, and end of 
the btree entries; anything else just bloats total test execution time). 
Further, as {{#iteratorTree()}} and {{#multiSearchFound}} don't use the 
{{targetIdx}} param, I think they can be moved to another class as they would 
be executed once for each of the entries in {{targetIdx}} (I'm not awatre of a 
way to ignore that in JMH).

My version of your branch with the updated microbench is 
[here|https://github.com/jasobrown/cassandra/tree/9988]

EDIT: forgot to mention that with the updated microbench, I was able to measure 
that the leaf iterator was anywhere from 3-20% faster, depending on the test. 
So, indeed this is a good patch


was (Author: jasobrown):
Overall, I think the patch is pretty solid. A couple of comments and nits:

- is {{BTreeSearchIterator}} necessary anymore? It's just an empty interface now
- just to confirm, {{FullBTreeSearchIterator}} is just a rename of the previous 
{{BTreeSearchIterator}} class? My diff'ing seems to confirm that, but I'd like 
the extra level of sanity checking.

nits:
- {{BTreeSet}} switch imports back to original format
- in {{LeafBTreeSearchIterator}}, move {{#compareToFirst}} and {{#searchNext}} 
closer to the methods from where they are called. Or better yet, since each is 
only called from one place, just inline the functionality into those calling 
methods.

My biggest problem is with the microbench. I know this ticket has gone through 
a bunch on contortions over it's life, so maybe a little confusion is 
reasonable. However, what I was really hoping to see was a comparison of the 
leaf vs full (tree) iterators, as that's what this whole ticket is about. I 
took liberty of reworking the mircobench to explicitly test the differences 
bewtween leaf and tree iterators, as well as reduce of benchmark executions 
(due to every value between 0 and 31, inclusive, being a param to the 
{{targetIdx}} field. (I simply removed about 2/3 of those values, as I think we 
only really care about testing reading from the beginning, middle, and end of 
the btree entries; anything else just bloats total test execution time). 
Further, as {{#iteratorTree()}} and {{#multiSearchFound}} don't use the 
{{targetIdx}} param, I think they can be moved to another class as they would 
be executed once for each of the entries in {{targetIdx}} (I'm not awatre of a 
way to ignore that in JMH).

My version of your branch with the updated microbench is 
[here|https://github.com/jasobrown/cassandra/tree/9988]

EDIT: forgot to mention that with the updated microbench, the leaf iterator was 
anywhere from 3-20% faster, depending on the test.

> Introduce leaf-only iterator
> ----------------------------
>
>                 Key: CASSANDRA-9988
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Benedict
>            Assignee: Jay Zhuang
>            Priority: Minor
>              Labels: patch
>             Fix For: 4.0
>
>         Attachments: 9988-3tests.png, 9988-data.png, 9988-result.png, 
> 9988-result2.png, 9988-result3.png, 9988-test-result-expsearch.xlsx, 
> 9988-test-result-raw.png, 9988-test-result.xlsx, 9988-test-result3.png, 
> 9988-trunk-new-update.txt, 9988-trunk-new.txt, trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf 
> page. In this case it _may_ be more efficient to specialise our iterator.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Comment Edited] (CASSANDRA-9988) Introduce leaf-only iterator

Reply via email to