[ 
https://issues.apache.org/jira/browse/HBASE-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964156#comment-15964156
 ] 

Vikas Vishwakarma commented on HBASE-17877:
-------------------------------------------

I completed a few more iterations today and also added a benchmark for the 
latest Guava version, with the following changes:
1. Moved all the comparable benchmarks to JMH, as suggested by [~stack]
2. Added Blackhole.consume for all the benchmark results, as suggested by 
[~Apache9] (thanks again!)
3. Used a slightly optimized random byte generation to reduce its impact on 
the benchmarks: a smaller byte array is used for random byte selection and 
replacement in the input arrays
4. Added the Guava benchmarks (master branch), as suggested by [~larsh] above
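
As a rough illustration of change 3, the sketch below pre-generates a small 
pool of random bytes once and then overwrites a single position in the input 
arrays before each measurement. This is not the actual benchmark code: the 
class name, the pool size, and the choice to write the same byte into both 
arrays (so that they stay equal and the comparator has to scan the full 
length) are all assumptions made for the example.

```java
import java.util.Random;

// Hypothetical helper, not taken from the HBASE-17877 patch.
final class InputPerturbationSketch {
    private final byte[] pool;   // small pool of random bytes, filled once
    private final Random random;
    private int cursor;          // next pool entry to hand out

    InputPerturbationSketch(int poolSize, long seed) {
        this.random = new Random(seed);
        this.pool = new byte[poolSize];
        this.random.nextBytes(pool);
    }

    /**
     * Overwrites one randomly chosen position in both arrays with the next
     * byte from the pool. Both arrays must be non-empty. Writing the same
     * value to both keeps equal inputs equal (an assumption of this sketch).
     */
    void perturb(byte[] left, byte[] right) {
        int position = random.nextInt(Math.min(left.length, right.length));
        byte value = pool[cursor];
        cursor = (cursor + 1) % pool.length;
        left[position] = value;
        right[position] = value;
    }
}
```

The per-call cost here is one bounded random index and two array writes, 
rather than refilling a whole input array, which is the kind of saving the 
change is aiming for.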

Observations
While the Hadoop version was giving better performance, its performance was 
~10% lower when byteArraySize%8 is not 0, most likely because of the last loop 
where it iterates over each of the leftover bytes. This could also be due to 
some compiler optimization. If I switch the leftover byte handling in the 
Hadoop comparator to be similar to HBase's, I get exactly the inverse result, 
i.e. worse when byteArraySize%8 = 0 and better when byteArraySize%8 != 0.
However, with the Guava version I see better overall performance compared to 
both HBase and Hadoop. I had tried the Guava version earlier as well, using 
manual before/after timestamp measurements, and in that case the Hadoop 
comparator was giving better results in some cases. That could again be 
statistical variation or related to compiler optimizations, etc. With the JMH 
framework, after fixing all the initial issues related to compiler/JIT 
optimization (input byte array randomization, adding Blackhole, etc.), I am 
seeing consistently better benchmarks for the Guava version for all array 
sizes, whether byteArraySize%8 is zero or non-zero.
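
To make the leftover-byte discussion concrete, here is a minimal sketch of a 
comparator that compares 8 bytes at a time and then falls back to a 
byte-by-byte loop for the remaining byteArraySize%8 bytes. It is not the 
HBase, Hadoop, or Guava code: those read longs through Unsafe, while this 
sketch uses ByteBuffer to stay in plain Java, and the class and method names 
are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only; the real comparators use sun.misc.Unsafe instead.
final class WordWiseCompareSketch {

    /** Unsigned lexicographic comparison; only the sign of the result matters. */
    static int compare(byte[] left, byte[] right) {
        int minLength = Math.min(left.length, right.length);
        int wordBytes = minLength & ~7; // largest multiple of 8 <= minLength

        // Big-endian order makes the first array byte the most significant
        // byte of each long, so an unsigned long comparison matches the
        // lexicographic order of the underlying bytes.
        ByteBuffer l = ByteBuffer.wrap(left).order(ByteOrder.BIG_ENDIAN);
        ByteBuffer r = ByteBuffer.wrap(right).order(ByteOrder.BIG_ENDIAN);

        for (int i = 0; i < wordBytes; i += Long.BYTES) {
            long lw = l.getLong(i);
            long rw = r.getLong(i);
            if (lw != rw) {
                return Long.compareUnsigned(lw, rw);
            }
        }

        // Leftover loop: runs only when minLength % 8 != 0.
        for (int i = wordBytes; i < minLength; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }

        // Equal over the common prefix: the shorter array sorts first.
        return left.length - right.length;
    }
}
```

For a 20-byte input, two iterations of the word loop cover bytes 0-15 and the 
leftover loop handles bytes 16-19; for a 16-byte input the leftover loop never 
runs. How that tail is handled is the part that varied between the 
implementations compared here.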

It looks like the master branch Guava version performs best, and replacing 
the HBase comparator with it should give the maximum gains (pending review)
https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362

Results:
|---|HBase|---|---|hadoop|---|---|hadoop %diff|---|---|guava|---|---|guava %diff|---|---|
|byte array size|min|mean|max|min|mean|max|min|mean|max|min|mean|max|min|mean|max|
|4|19814.642|20217.647|20250.91|19838.782|20072.437|20090.503|0|-1|-1|24026.12|24284.021|24300.338|21|20|20|
|8|19846.598|19874.477|19881.019|22012.932|22044.713|22051.793|11|11|11|22199.453|22253.173|22261.712|12|12|12|
|16|19400.623|19430.837|19438.378|19606.912|19616.322|19649.318|1|1|1|21995.475|22113.443|22120.836|13|14|14|
|20|18456.241|18493.416|18500.289|16482.859|16705.744|16776.35|-11|-10|-9|18625.111|18660.355|18704.285|1|1|1|
|32|18953.196|18984.412|18992.993|19307.22|19345.122|19352.411|2|2|2|21309.337|21359.051|21377.868|12|13|13|
|50|17444.431|17506.91|17518.791|15864.759|15941.543|15953.412|-9|-9|-9|18468.621|18613.202|18749.651|6|6|7|
|64|17390.097|18046.898|18143.835|20152.624|20379.32|20397.359|16|13|12|21065.113|21116.799|21128.523|21|17|16|
|100|14844.718|14866.353|14889.49|13293.668|13385.7|13403.439|-10|-10|-10|15594.286|15690.369|15796.081|5|6|6|
|128|14183.991|14329.948|14351.016|17016.59|17260.48|17278.799|20|20|20|17668.509|19205.199|19333.922|25|34|35|
|200|11665.597|11732.09|11748.27|11599.469|11733.228|11755.622|-1|0|0|14540.79|14648.077|14728.363|25|25|25|
|256|10404.438|10438.019|10444.734|13205.591|13315.903|13326.772|27|28|28|14448.858|14933.008|15064.242|39|43|44|
|512|6405.106|6592.613|6604.371|9031.652|9142.564|9149.54|41|39|39|10236.501|10376.17|10389.971|60|57|57|
|1024|3812.341|3832.237|3840.291|3863.105|3864.757|3871.94|1|1|1|6911.951|7002.067|7009.792|81|83|83|
|2048|2052.148|2060.585|2061.935|2129.32|2151.807|2155.381|4|4|5|4072.481|4085.278|4089.185|98|98|98|
|4096|1073.263|1089.947|1091.566|1069.962|1076.303|1076.993|0|-1|-1|2319.74|2326.514|2328.69|116|113|113|
|8192|544.723|547.063|547.449|863.716|866.808|867.296|59|58|58|931.945|1131.406|1136.288|71|107|108|
|16384|275.155|275.724|275.909|432.556|434.158|434.698|57|57|58|582.37|584.294|584.852|112|112|112|
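
The %diff columns appear to be the relative change against the HBase value in 
the same row, rounded to the nearest whole percent. The small sketch below 
reproduces a few cells from the table under that assumption; the class and 
method names are hypothetical.

```java
// Assumed derivation of the "%diff" columns: (candidate - baseline) / baseline.
final class PercentDiffSketch {

    /** Relative change of candidate versus baseline, in whole percent. */
    static long percentDiff(double baseline, double candidate) {
        return Math.round((candidate - baseline) / baseline * 100.0);
    }
}
```

For example, `percentDiff(19846.598, 22012.932)` gives 11, matching the hadoop 
min %diff for size 8, and `percentDiff(18456.241, 16482.859)` gives -11, 
matching the hadoop min %diff for size 20.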

Apologies for the multiple iterations; I am still figuring out a lot myself 
while doing these microbenchmark iterations, and there are multiple dimensions 
to track in the test at different levels.

> Replace/improve HBase's byte[] comparator
> -----------------------------------------
>
>                 Key: HBASE-17877
>                 URL: https://issues.apache.org/jira/browse/HBASE-17877
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Vikas Vishwakarma
>         Attachments: 17877-1.2.patch, 17877-v2-1.3.patch, 
> ByteComparatorJiraHBASE-17877.pdf
>
>
> [~vik.karma] did some extensive tests and found that Hadoop's version is 
> faster - dramatically faster in some cases.
> Patch forthcoming.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
