JinHyuk Kim created HBASE-29889:
-----------------------------------

             Summary: Add XXH3 Hash Support to Bloom Filter
                 Key: HBASE-29889
                 URL: https://issues.apache.org/jira/browse/HBASE-29889
             Project: HBase
          Issue Type: New Feature
          Components: regionserver
            Reporter: JinHyuk Kim
            Assignee: JinHyuk Kim


h2. Summary

Added *XXH3* as a new hashing option for the HBase Bloom Filter.
h2. Background

The existing hash functions used in HBase Bloom Filters (Jenkins, Murmur, and 
Murmur3) were designed years ago and do not fully leverage modern CPU 
architectures.

[*XXH3*|https://github.com/Cyan4973/xxHash], on the other hand, is optimized 
for today’s CPUs with wide execution units and fast unaligned memory access, 
resulting in significantly faster hashing performance.
h2. What Was Done
 * Implemented XXH3 hashing and integrated it as an available hash type for 
Bloom Filters.
 * Conducted benchmark tests comparing XXH3 with existing hash algorithms.
 ** Benchmark test code is available in 
[jinhyukify/xxh3-benchmark|https://github.com/jinhyukify/xxh3-benchmark].
 * *Benchmark Results:*
 ** [https://docs.google.com/document/d/1LycZZMKFrrxYytEnzVj-EjQB4PbmmTgprhMOpDRPqYM/edit?usp=sharing]
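
For readers unfamiliar with how a hash type plugs into a Bloom filter, the following is a minimal, generic sketch and not the code from this change. The class name PluggableBloomSketch and its methods are hypothetical, and FNV-1a stands in for the hash function only so that the example has no dependencies; it is not XXH3. The point is that the filter logic only needs a 64-bit hash of the key, so swapping in a faster hash changes nothing else.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: a Bloom filter whose 64-bit hash function is pluggable. */
final class PluggableBloomSketch {
    private final BitSet bits;
    private final int bitCount;
    private final int hashCount;
    private final ToLongFunction<byte[]> hash64; // assumption: any 64-bit hash, e.g. an XXH3 binding

    PluggableBloomSketch(int bitCount, int hashCount, ToLongFunction<byte[]> hash64) {
        this.bits = new BitSet(bitCount);
        this.bitCount = bitCount;
        this.hashCount = hashCount;
        this.hash64 = hash64;
    }

    // Split one 64-bit hash into two 32-bit halves and combine them
    // (double hashing) to derive the i-th bit position.
    private int position(long h, int i) {
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        long combined = (long) h1 + (long) i * h2;
        return (int) Math.floorMod(combined, (long) bitCount);
    }

    void add(byte[] key) {
        long h = hash64.applyAsLong(key); // the key is hashed once per operation
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(h, i));
        }
    }

    boolean mightContain(byte[] key) {
        long h = hash64.applyAsLong(key);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(h, i))) {
                return false; // definitely absent
            }
        }
        return true; // present, or a false positive
    }

    // Stand-in 64-bit hash (FNV-1a), used here only for illustration. NOT XXH3.
    static long fnv1a64(byte[] data) {
        long h = 0xcbf29ce484222325L;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    public static void main(String[] args) {
        PluggableBloomSketch filter =
            new PluggableBloomSketch(1 << 16, 4, PluggableBloomSketch::fnv1a64);
        byte[] row = "row-1".getBytes(StandardCharsets.UTF_8);
        filter.add(row);
        System.out.println(filter.mightContain(row)); // true: added keys are never reported absent
    }
}
```

In this sketch the hash is computed once per add or lookup, so its cost is paid on every Bloom check, which is where a faster hash function would be expected to help.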

h2. Expected Impact
 * *Faster Bloom filter lookups* across all Bloom types on read paths that 
serve client requests.

 * *Slight improvement in Bloom filter write performance* during HFile creation 
and compaction, thanks to the lower hashing overhead of XXH3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
