jbewing opened a new pull request, #5354:
URL: https://github.com/apache/hbase/pull/5354

   ### What
   
   This PR updates `ByteBufferUtils#findCommonPrefix` and 
`Bytes#findCommonPrefix` to compare 8 bytes at a time from the input 
buffers/arrays when `Unsafe` access is available. On platforms where `Unsafe` 
is unavailable, we keep using the current implementations. This is similar to 
the optimization already done in `ByteBufferUtils#compareToUnsafe`.
   
   ### Implementation Notes
   There was a `Bytes#findCommonPrefix` method and a 
`ByteBufferUtils#findCommonPrefix` method that both accepted `byte[]` args. 
I've updated that `ByteBufferUtils#findCommonPrefix` method to delegate to 
`Bytes#findCommonPrefix` and applied the 8-bytes-at-a-time comparison 
optimization to the `Bytes` class. 
   
   Overall, the implementation draws a ton of inspiration from 
`ByteBufferUtils#compareToUnsafe`. The only large change I made is in how we 
handle mismatches in the big-endian case. There, the first byte in memory is 
the most significant byte of the loaded `long`, so I used the number of leading 
zeros intrinsic instead of the number of trailing zeros intrinsic to find which 
byte mismatched. 
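   
   To illustrate the mismatch handling described above, here is a minimal 
sketch. It is not the code in this PR: it assumes `ByteBuffer#getLong` in place 
of `Unsafe` reads so that it runs anywhere, and the names `CommonPrefixSketch`, 
`findCommonPrefixSimple`, and `findCommonPrefixWide` are hypothetical. 

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the HBase implementation.
final class CommonPrefixSketch {
  private CommonPrefixSketch() {
  }

  /** Byte-at-a-time baseline: length of the common prefix of the two ranges. */
  static int findCommonPrefixSimple(byte[] left, int leftOffset, int leftLength,
      byte[] right, int rightOffset, int rightLength) {
    int limit = Math.min(leftLength, rightLength);
    int i = 0;
    while (i < limit && left[leftOffset + i] == right[rightOffset + i]) {
      i++;
    }
    return i;
  }

  /**
   * Compares 8 bytes per step. The order parameter stands in for the platform
   * byte order that an Unsafe-based read would observe (an assumption of this
   * sketch).
   */
  static int findCommonPrefixWide(byte[] left, int leftOffset, int leftLength,
      byte[] right, int rightOffset, int rightLength, ByteOrder order) {
    int limit = Math.min(leftLength, rightLength);
    ByteBuffer l = ByteBuffer.wrap(left).order(order);
    ByteBuffer r = ByteBuffer.wrap(right).order(order);
    int i = 0;
    for (; i + Long.BYTES <= limit; i += Long.BYTES) {
      // XOR leaves a non-zero bit wherever the two words differ.
      long diff = l.getLong(leftOffset + i) ^ r.getLong(rightOffset + i);
      if (diff != 0) {
        // Big endian: the first byte in memory is the most significant byte,
        // so count leading zeros. Little endian: count trailing zeros.
        int matchingBits = order == ByteOrder.BIG_ENDIAN
            ? Long.numberOfLeadingZeros(diff)
            : Long.numberOfTrailingZeros(diff);
        return i + (matchingBits >>> 3); // bits -> whole matching bytes
      }
    }
    // Fewer than 8 bytes remain: finish one byte at a time.
    while (i < limit && left[leftOffset + i] == right[rightOffset + i]) {
      i++;
    }
    return i;
  }
}
```

   Passing `ByteOrder.BIG_ENDIAN` or `ByteOrder.LITTLE_ENDIAN` exercises both 
branches on any machine, and both should agree with the byte-at-a-time 
baseline. 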
   
   ### Testing
   I've added unit tests that cover both paths: with `Unsafe` enabled and with 
it disabled.
   
   ### Benchmarking
   I haven't done any micro-benchmarking of the new "faster" implementations 
vs. the current implementations yet. I'll update the JIRA with a link to the 
benchmarks when I get a chance to write them. For now, I'm assuming that this 
method of finding common prefixes is faster than the current one, based on the 
previous micro-benchmarking results for `compareTo` (as this is very similar 
code).
   
   [HBASE-28025](https://issues.apache.org/jira/browse/HBASE-28025)

