[ 
https://issues.apache.org/jira/browse/ACCUMULO-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510679#comment-15510679
 ] 

Will Murnane commented on ACCUMULO-4468:
----------------------------------------

bq. Do you have other examples of where this might be used in a tight loop?
I think there are lots of other examples in Accumulo itself; for example, in 
iterators.system.DeletingIterator and iterators.user.TransformingIterator, this 
method is called in a loop. In our use case, we're doing a similar thing: 
building a larger object out of multiple rows, by finding groups of rows which 
are equal under ROW_COLFAM. When each object is built from only a few rows, the 
CF equality comparison returns false pretty often (which is to be expected), 
but only after comparing row IDs, which are always the same in practice.

bq. How did you test this? What types of numbers did you see?
I haven't been able to install it on a cluster to test. The test suite does 
pass with this patch applied. I think it's a minor change; in the "rows are 
equal" case the same amount of work is done as with the existing code, although 
the parts are accessed in the opposite order. They're still compared 
mostly-in-order, as isEqual does, but the comment in that function was 
inspiration to try reversing the comparison order.

Aside from performance, the code seems cleaner to me: there's no more 
repetition of e.g. the check of row equality. The bytecode (with Oracle javac 
1.8.0_92) is substantially smaller: 389 bytes versus 167.

bq. Why not consolidate this to:
That's fine with me. I just wrote all the cases to look the same, instead of 
having a "special case" for the last comparison made. If some special work were 
required in the compare-equal case, it could go before the return statement.

> accumulo.core.data.Key.equals(Key, PartialKey) improvement
> ----------------------------------------------------------
>
>                 Key: ACCUMULO-4468
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4468
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.8.0
>            Reporter: Will Murnane
>            Priority: Trivial
>              Labels: newbie, performance
>         Attachments: key_comparison.patch
>
>
> In the Key.equals(Key, PartialKey) overload, the current method compares 
> starting at the beginning of the key, and works its way toward the end. This 
> functions correctly, of course, but one of the typical uses of this method is 
> to compare adjacent rows to break them into larger chunks. For example, 
> accumulo.core.iterators.Combiner repeatedly calls this method with subsequent 
> pairs of keys.
> I have a patch which reverses the comparison order. That is, if the method is 
> called with ROW_COLFAM_COLQUAL_COLVIS, it will compare visibility, cq, cf, 
> and finally row. This (marginally) improves the speed of comparisons in the 
> relatively common case where only the last part is changing, with less 
> complex code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to