[
https://issues.apache.org/jira/browse/HBASE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935878#comment-13935878
]
Manukranth Kolloju commented on HBASE-4012:
-------------------------------------------
I know this Jira was closed long ago. I stumbled upon it and realized that
we don't have it in 0.89-fb. Is it possible that it causes a regression in
some cases? I benchmarked some scenarios where the comparisons are done on
byte arrays with common prefixes. I found that the gain shrinks as the common
prefix gets shorter, and turns negative once the prefix is 10 or less:
Prefix : 50, gain : 30.949085592296317
Prefix : 45, gain : 29.72626255629741
Prefix : 40, gain : 26.434751984314396
Prefix : 35, gain : 24.259677810557402
Prefix : 30, gain : 22.717778496708313
Prefix : 25, gain : 14.16316408302645
Prefix : 20, gain : 8.768630290777425
Prefix : 15, gain : 1.2726417069570561
Prefix : 10, gain : -17.46894837698785
Prefix : 5, gain : -34.9383386755005
Prefix : 0, gain : -117.8493223109523
In most cases we will have a considerable amount of data in each row key, and
KVComparator.compareRows will mostly return 0, so this should help the
cause. But did anyone see a regression after switching to this?
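
A minimal sketch of how such a comparison could be set up. This is not the
benchmark behind the numbers above, and it does not use sun.misc.Unsafe: as a
stand-in it reads eight bytes at a time through a big-endian ByteBuffer, which
gives the same unsigned lexicographic ordering. The class name, the helper
keysWithCommonPrefix, the 64-byte key length and the iteration count are all
assumptions made for illustration. The timing loop is deliberately crude; a
harness such as JMH would be needed for trustworthy numbers.

```java
import java.nio.ByteBuffer;

public class PrefixCompareSketch {

    // Naive unsigned lexicographic comparison, one byte at a time.
    static int compareNaive(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Same ordering, eight bytes per step. ByteBuffer is big-endian by
    // default, so unsigned long order matches lexicographic byte order.
    static int compareWordwise(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        ByteBuffer ba = ByteBuffer.wrap(a);
        ByteBuffer bb = ByteBuffer.wrap(b);
        int i = 0;
        for (; i + 8 <= n; i += 8) {
            long x = ba.getLong(i);
            long y = bb.getLong(i);
            if (x != y) {
                return Long.compareUnsigned(x, y);
            }
        }
        // Fewer than eight bytes left: finish byte by byte.
        for (; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Hypothetical helper: two keys of the given length that share their
    // first `prefix` bytes and differ at the next byte (if there is one).
    static byte[][] keysWithCommonPrefix(int length, int prefix) {
        byte[] a = new byte[length];
        byte[] b = new byte[length];
        for (int i = 0; i < length; i++) {
            a[i] = (byte) i;
            b[i] = (byte) i;
        }
        if (prefix < length) {
            b[prefix] = (byte) (a[prefix] + 1);
        }
        return new byte[][] {a, b};
    }

    public static void main(String[] args) {
        int length = 64;        // assumed key length
        int rounds = 5_000_000; // assumed iteration count
        long sink = 0;          // keeps the JIT from discarding the calls
        for (int prefix = 50; prefix >= 0; prefix -= 5) {
            byte[][] k = keysWithCommonPrefix(length, prefix);
            long t0 = System.nanoTime();
            for (int r = 0; r < rounds; r++) {
                sink += compareNaive(k[0], k[1]);
            }
            long t1 = System.nanoTime();
            for (int r = 0; r < rounds; r++) {
                sink += compareWordwise(k[0], k[1]);
            }
            long t2 = System.nanoTime();
            System.out.println("Prefix : " + prefix
                + ", naive ns : " + (t1 - t0)
                + ", wordwise ns : " + (t2 - t1));
        }
        System.out.println("(ignore) " + sink);
    }
}
```

With a short common prefix the byte-by-byte loop exits after a few iterations,
so any fixed setup cost in the wider comparison has little work to amortize
against. That is one plausible reason for a negative gain at small prefixes,
though the sketch alone does not establish it.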
> Further optimize byte comparison methods
> ----------------------------------------
>
> Key: HBASE-4012
> URL: https://issues.apache.org/jira/browse/HBASE-4012
> Project: HBase
> Issue Type: Improvement
> Components: util
> Affects Versions: 0.92.0
> Reporter: Todd Lipcon
> Assignee: Ted Yu
> Priority: Minor
> Labels: noob
> Fix For: 0.92.0
>
> Attachments: 4012-v2.txt, 4012.txt
>
>
> Guava uses some clever tricks with sun.misc.Unsafe to compare byte arrays
> about 100% faster than the naive byte-by-byte implementation:
> http://guava-libraries.googlecode.com/svn/trunk/guava/src/com/google/common/primitives/UnsignedBytes.java
> We should borrow this [Apache 2 licensed] code.