[ https://issues.apache.org/jira/browse/HBASE-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964156#comment-15964156 ]
Vikas Vishwakarma commented on HBASE-17877:
-------------------------------------------

I completed a few iterations today and also added a benchmark for the latest guava version, with the following changes:

1. Moved all the comparable benchmarks to JMH, as suggested by [~stack]
2. Added Blackhole.consume for all the benchmark results, as suggested by [~Apache9] (thanks again!)
3. Used a slightly optimized random byte generation to reduce its impact on the benchmarks, by using a smaller byte array for random byte selection and replacement in the input arrays
4. Added the guava benchmarks (master branch), as suggested by [~larsh] above

Observations

While the hadoop version was giving better performance overall, its performance was ~10% lower than HBase's when byteArrayLength%8 is not 0, most likely because of the last loop, where it iterates over each of the leftover bytes. This could also be due to some compiler optimization. If I switch the leftover byte handling in the hadoop comparator to be similar to HBase's, I get exactly the inverse result, i.e. worse when byteArraySize%8 = 0 and better when byteArraySize%8 != 0.

With the guava version, however, I am able to see overall better performance compared to both HBase and Hadoop. I had tried the guava version earlier as well, using native timestamp before/after measurements, and in that case the hadoop comparator was giving better results in some cases. That could again be statistical variation or related to compiler optimizations, etc.

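To make the leftover-byte observation concrete, here is a minimal sketch of a comparator that compares 8 bytes at a time and then falls back to a per-byte loop for the remaining length%8 bytes. It is an illustration only: the class and method names are hypothetical, it uses ByteBuffer rather than the approach taken by the actual HBase, Hadoop or guava comparators, and it is not the code that was benchmarked.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not taken from HBase, Hadoop or guava.
public class StrideCompareSketch {

    // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
    static int compare(byte[] a, byte[] b) {
        int minLength = Math.min(a.length, b.length);
        // Largest multiple of 8 that is <= minLength.
        int strideLimit = minLength & ~7;

        // ByteBuffer is big-endian by default, so an unsigned comparison of
        // two longs matches byte-by-byte lexicographic order.
        ByteBuffer bufA = ByteBuffer.wrap(a);
        ByteBuffer bufB = ByteBuffer.wrap(b);

        int i = 0;
        for (; i < strideLimit; i += 8) {
            long wordA = bufA.getLong(i);
            long wordB = bufB.getLong(i);
            if (wordA != wordB) {
                return Long.compareUnsigned(wordA, wordB);
            }
        }

        // Leftover bytes: only reached when minLength % 8 != 0
        // (or when all full 8-byte words were equal).
        for (; i < minLength; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }

        // Common prefix is equal; the shorter array sorts first.
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] x = new byte[20];
        byte[] y = new byte[20];
        y[17] = 1; // difference falls in the leftover region (20 % 8 = 4)
        System.out.println(Integer.signum(compare(x, y)));
    }
}
```

For a 20-byte input the first loop covers bytes 0 to 15 and the second loop handles bytes 16 to 19 one at a time, which is the part suspected above of costing the extra ~10%.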
With the JMH framework, after fixing all the initial issues related to compiler/JIT optimization (input byte array randomization, adding Blackhole, etc.), I am seeing consistently better benchmarks for the guava version for all array sizes, both where byteArraySize%8 is zero and where it is non-zero.

It looks like the master branch guava version is the best performing, and replacing the HBase comparator with it should give the maximum gains (pending review):
https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362

Results (the %diff columns are relative to the HBase numbers):

|byte array size|HBase min|HBase mean|HBase max|hadoop min|hadoop mean|hadoop max|hadoop %diff min|hadoop %diff mean|hadoop %diff max|guava min|guava mean|guava max|guava %diff min|guava %diff mean|guava %diff max|
|4|19814.642|20217.647|20250.91|19838.782|20072.437|20090.503|0|-1|-1|24026.12|24284.021|24300.338|21|20|20|
|8|19846.598|19874.477|19881.019|22012.932|22044.713|22051.793|11|11|11|22199.453|22253.173|22261.712|12|12|12|
|16|19400.623|19430.837|19438.378|19606.912|19616.322|19649.318|1|1|1|21995.475|22113.443|22120.836|13|14|14|
|20|18456.241|18493.416|18500.289|16482.859|16705.744|16776.35|-11|-10|-9|18625.111|18660.355|18704.285|1|1|1|
|32|18953.196|18984.412|18992.993|19307.22|19345.122|19352.411|2|2|2|21309.337|21359.051|21377.868|12|13|13|
|50|17444.431|17506.91|17518.791|15864.759|15941.543|15953.412|-9|-9|-9|18468.621|18613.202|18749.651|6|6|7|
|64|17390.097|18046.898|18143.835|20152.624|20379.32|20397.359|16|13|12|21065.113|21116.799|21128.523|21|17|16|
|100|14844.718|14866.353|14889.49|13293.668|13385.7|13403.439|-10|-10|-10|15594.286|15690.369|15796.081|5|6|6|
|128|14183.991|14329.948|14351.016|17016.59|17260.48|17278.799|20|20|20|17668.509|19205.199|19333.922|25|34|35|
|200|11665.597|11732.09|11748.27|11599.469|11733.228|11755.622|-1|0|0|14540.79|14648.077|14728.363|25|25|25|
|256|10404.438|10438.019|10444.734|13205.591|13315.903|13326.772|27|28|28|14448.858|14933.008|15064.242|39|43|44|
|512|6405.106|6592.613|6604.371|9031.652|9142.564|9149.54|41|39|39|10236.501|10376.17|10389.971|60|57|57|
|1024|3812.341|3832.237|3840.291|3863.105|3864.757|3871.94|1|1|1|6911.951|7002.067|7009.792|81|83|83|
|2048|2052.148|2060.585|2061.935|2129.32|2151.807|2155.381|4|4|5|4072.481|4085.278|4089.185|98|98|98|
|4096|1073.263|1089.947|1091.566|1069.962|1076.303|1076.993|0|-1|-1|2319.74|2326.514|2328.69|116|113|113|
|8192|544.723|547.063|547.449|863.716|866.808|867.296|59|58|58|931.945|1131.406|1136.288|71|107|108|
|16384|275.155|275.724|275.909|432.556|434.158|434.698|57|57|58|582.37|584.294|584.852|112|112|112|

Apologies for the multiple iterations. I am still figuring out a lot myself while doing these microbenchmark iterations, and there are multiple dimensions to track in the test at different levels.

> Replace/improve HBase's byte[] comparator
> -----------------------------------------
>
> Key: HBASE-17877
> URL: https://issues.apache.org/jira/browse/HBASE-17877
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Vikas Vishwakarma
> Attachments: 17877-1.2.patch, 17877-v2-1.3.patch, ByteComparatorJiraHBASE-17877.pdf
>
>
> [~vik.karma] did some extensive tests and found that Hadoop's version is faster - dramatically faster in some cases.
> Patch forthcoming.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)