[
https://issues.apache.org/jira/browse/HBASE-29889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HBASE-29889:
-----------------------------------
Labels: pull-request-available (was: )
> Add XXH3 Hash Support to Bloom Filter
> -------------------------------------
>
> Key: HBASE-29889
> URL: https://issues.apache.org/jira/browse/HBASE-29889
> Project: HBase
> Issue Type: New Feature
> Components: regionserver
> Reporter: JinHyuk Kim
> Assignee: JinHyuk Kim
> Priority: Major
> Labels: pull-request-available
>
> h2. Summary
> Added *XXH3* as a new hashing option for the HBase Bloom Filter.
> h2. Background
> Existing hash functions used in HBase Bloom Filters (Jenkins, Murmur, and
> Murmur3) were designed years ago and do not fully leverage modern CPU
> architectures.
> [*XXH3*|https://github.com/Cyan4973/xxHash], on the other hand, is optimized
> for today’s CPUs with wide execution units and fast unaligned memory access,
> resulting in significantly faster hashing performance.
> h2. What Was Done
> * Implemented XXH3 Hashing and integrated it as an available hash type for
> Bloom Filters.
> * Conducted benchmark tests comparing XXH3 with existing hash algorithms.
> ** Benchmark test code is available in
> [jinhyukify/xxh3-benchmark|https://github.com/jinhyukify/xxh3-benchmark].
> * *Benchmark Results:*
> ** [https://docs.google.com/document/d/1LycZZMKFrrxYytEnzVj-EjQB4PbmmTgprhMOpDRPqYM/edit?usp=sharing]
> h2. Expected Impact
> * *Faster Bloom filter lookups* across all Bloom types during client-side
> read paths.
> * *Slight improvement in Bloom filter write performance* during HFile
> creation and compaction, thanks to the lower hashing overhead of XXH3.
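
To illustrate why hashing cost shows up on both the lookup and the write path, here is a minimal sketch of a Bloom filter that derives all of its bit positions from a single 64-bit hash of the key. It is an illustration under stated assumptions, not the code from this issue: `BloomSketch`, `hash64`, and `position` are hypothetical names, the 64-bit FNV-1a function is only a stand-in for XXH3 (which is too long to reproduce here), and the double-hashing scheme is a common technique that may differ from what HBase does internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Illustrative sketch only: not HBase's Bloom filter and not a real XXH3. */
final class BloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Placeholder 64-bit hash (FNV-1a). An XXH3 implementation would slot in here. */
    static long hash64(byte[] key) {
        long h = 0xcbf29ce484222325L;      // FNV-1a 64-bit offset basis
        for (byte b : key) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;           // FNV-1a 64-bit prime
        }
        return h;
    }

    /** Double hashing: split the 64-bit hash into two halves and combine them. */
    private int position(long hash, int i) {
        int h1 = (int) hash;               // low 32 bits
        int h2 = (int) (hash >>> 32);      // high 32 bits
        long combined = (long) h1 + (long) i * h2;
        return (int) Math.floorMod(combined, (long) numBits);
    }

    /** Write path: the key is hashed once, then numHashes bits are set. */
    void add(byte[] key) {
        long hash = hash64(key);
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(hash, i));
        }
    }

    /** Read path: the key is hashed once, then the same bits are checked. */
    boolean mightContain(byte[] key) {
        long hash = hash64(key);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(hash, i))) {
                return false;              // definitely absent
            }
        }
        return true;                       // present, or a false positive
    }

    public static void main(String[] args) {
        BloomSketch filter = new BloomSketch(1024, 4);
        byte[] row = "row-1".getBytes(StandardCharsets.UTF_8);
        filter.add(row);
        System.out.println(filter.mightContain(row)); // prints "true"
    }
}
```

In this sketch every `add` and every `mightContain` call hashes the key exactly once, so a cheaper 64-bit hash reduces the per-key cost of both operations without changing the filter's structure.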
--
This message was sent by Atlassian Jira
(v8.20.10#820010)