[
https://issues.apache.org/jira/browse/ACCUMULO-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511418#comment-15511418
]
Keith Turner commented on ACCUMULO-4468:
----------------------------------------
[~wmurnane] I like the change. I wrote the comment and made the changes in Key
that you referenced. I remember doing the performance testing for that. When
I first experimented with the change, I tried comparing the byte arrays in
reveres order. That was much slower than comparing forward. So I avoiding that
and found another strategy that worked well. I think the concept is sound, but
the performance testing is definitely needed to make sure it works as intended.
I also like the switch statement fall through, I think its slick. If it
doesn't exists, It would be nice to add a unit test that checks for correctness
for keys that only differ by one field. Basically ensure that each of the key
field comparisons is tested.
> accumulo.core.data.Key.equals(Key, PartialKey) improvement
> ----------------------------------------------------------
>
> Key: ACCUMULO-4468
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4468
> Project: Accumulo
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.8.0
> Reporter: Will Murnane
> Priority: Trivial
> Labels: newbie, performance
> Attachments: benchmark.tar.gz, key_comparison.patch
>
>
> In the Key.equals(Key, PartialKey) overload, the current method compares
> starting at the beginning of the key, and works its way toward the end. This
> functions correctly, of course, but one of the typical uses of this method is
> to compare adjacent rows to break them into larger chunks. For example,
> accumulo.core.iterators.Combiner repeatedly calls this method with subsequent
> pairs of keys.
> I have a patch which reverses the comparison order. That is, if the method is
> called with ROW_COLFAM_COLQUAL_COLVIS, it will compare visibility, cq, cf,
> and finally row. This (marginally) improves the speed of comparisons in the
> relatively common case where only the last part is changing, with less
> complex code.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)