[ https://issues.apache.org/jira/browse/HBASE-29842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jaehui Lee updated HBASE-29842:
-------------------------------
Description:
This patch implements the Ribbon Filter proposed in HBASE-27266.
h2. Summary
Ribbon Filter is a space-efficient alternative to Bloom Filter, achieving
approximately 30% space savings while maintaining comparable query performance.
- Bloom Filter requires ~9.6 bits/key for 1% FPR (44% overhead vs theoretical
minimum)
- Ribbon Filter achieves ~7.3 bits/key for 1% FPR (~10% overhead)
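The per-key figures above can be sanity-checked with a small calculation. The sketch below is illustrative only and is not part of the patch: the class and method names are hypothetical, it assumes the textbook sizing formulas (lower bound of log2(1/FPR) bits per key, and an optimally tuned standard Bloom filter at log2(1/FPR) / ln 2 bits per key), and it takes the 7.3 bits/key Ribbon figure from the list above as an input rather than deriving it.

```java
public final class FilterSpaceEstimate {
    private FilterSpaceEstimate() {}

    /** Information-theoretic lower bound in bits per key: log2(1 / fpr). */
    public static double theoreticalMinBitsPerKey(double fpr) {
        return Math.log(1.0 / fpr) / Math.log(2.0);
    }

    /** Textbook Bloom filter with an optimal hash count: log2(1 / fpr) / ln 2. */
    public static double bloomBitsPerKey(double fpr) {
        return theoreticalMinBitsPerKey(fpr) / Math.log(2.0);
    }

    /** Relative overhead of a bits-per-key figure over the lower bound (0.10 means 10%). */
    public static double overhead(double bitsPerKey, double fpr) {
        return bitsPerKey / theoreticalMinBitsPerKey(fpr) - 1.0;
    }

    public static void main(String[] args) {
        double fpr = 0.01;
        // Assumption: Ribbon bits/key is the figure quoted in the description, not computed here.
        double ribbonBitsPerKey = 7.3;
        double bloom = bloomBitsPerKey(fpr);

        System.out.printf("lower bound: %.2f bits/key%n", theoreticalMinBitsPerKey(fpr));
        System.out.printf("bloom:       %.2f bits/key, overhead %.1f%%%n",
            bloom, 100.0 * overhead(bloom, fpr));
        System.out.printf("ribbon:      %.2f bits/key, overhead %.1f%%%n",
            ribbonBitsPerKey, 100.0 * overhead(ribbonBitsPerKey, fpr));
    }
}
```

Under these assumptions the lower bound at 1% FPR comes out near 6.64 bits/key, which is where the roughly 44% and 10% overhead figures come from. Actual HBase filter sizes may differ from the textbook values because of rounding and block-level layout.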
h2. New BloomType Options
- {{RIBBON_ROW}}: Row-based Ribbon filter (alternative to {{ROW}})
- {{RIBBON_ROWCOL}}: Row+Column-based Ribbon filter (alternative to {{ROWCOL}})
h3. Usage Example
*HBase Shell:*
{code:java}
create 'mytable', {NAME => 'cf', BLOOMFILTER => 'RIBBON_ROW'}
alter 'mytable', {NAME => 'cf', BLOOMFILTER => 'RIBBON_ROWCOL'}
{code}
*Java API:*
{code:java}
ColumnFamilyDescriptor cfd = ColumnFamilyDescriptorBuilder
    .newBuilder(Bytes.toBytes("cf"))
    .setBloomFilterType(BloomType.RIBBON_ROW)
    .build();
{code}
h2. References
- [Ribbon filter: practically smaller than Bloom and Xor|https://arxiv.org/abs/2103.02515]
- [Design Docs|https://github.com/apache/hbase/blob/5298fd5951448b8b88ad29cb20819f34c19830e1/dev-support/design-docs/HBASE-29842%20Ribbon%20Filter%20Design.md]
was:
This patch implements the Ribbon Filter proposed in HBASE-27266.
h2. Summary
Ribbon Filter is a space-efficient alternative to Bloom Filter, achieving
approximately 30% space savings while maintaining comparable query performance.
- Bloom Filter requires ~9.6 bits/key for 1% FPR (44% overhead vs theoretical
minimum)
- Ribbon Filter achieves ~7.3 bits/key for 1% FPR (~10% overhead)
h2. New BloomType Options
- {{RIBBON_ROW}}: Row-based Ribbon filter (alternative to {{ROW}})
- {{RIBBON_ROWCOL}}: Row+Column-based Ribbon filter (alternative to {{ROWCOL}})
h3. Usage Example
*HBase Shell:*
{code:java}
create 'mytable', {NAME => 'cf', BLOOMFILTER => 'RIBBON_ROW'}
alter 'mytable', {NAME => 'cf', BLOOMFILTER => 'RIBBON_ROWCOL'}
{code}
*Java API:*
{code:java}
ColumnFamilyDescriptor cfd = ColumnFamilyDescriptorBuilder
    .newBuilder(Bytes.toBytes("cf"))
    .setBloomFilterType(BloomType.RIBBON_ROW)
    .build();
{code}
h2. References
- Paper: [Ribbon filter: practically smaller than Bloom and Xor|https://arxiv.org/abs/2103.02515]
- Design document:
> Add Ribbon Filter as an alternative to Bloom Filter
> ---------------------------------------------------
>
> Key: HBASE-29842
> URL: https://issues.apache.org/jira/browse/HBASE-29842
> Project: HBase
> Issue Type: New Feature
> Reporter: Jaehui Lee
> Assignee: Jaehui Lee
> Priority: Major
> Labels: pull-request-available
>
> This patch implements the Ribbon Filter proposed in HBASE-27266.
> h2. Summary
> Ribbon Filter is a space-efficient alternative to Bloom Filter, achieving
> approximately 30% space savings while maintaining comparable query
> performance.
> - Bloom Filter requires ~9.6 bits/key for 1% FPR (44% overhead vs
> theoretical minimum)
> - Ribbon Filter achieves ~7.3 bits/key for 1% FPR (~10% overhead)
> h2. New BloomType Options
> - {{RIBBON_ROW}}: Row-based Ribbon filter (alternative to {{ROW}})
> - {{RIBBON_ROWCOL}}: Row+Column-based Ribbon filter (alternative to {{ROWCOL}})
> h3. Usage Example
> *HBase Shell:*
> {code:java}
> create 'mytable', {NAME => 'cf', BLOOMFILTER => 'RIBBON_ROW'}
> alter 'mytable', {NAME => 'cf', BLOOMFILTER => 'RIBBON_ROWCOL'}
> {code}
> *Java API:*
> {code:java}
> ColumnFamilyDescriptor cfd = ColumnFamilyDescriptorBuilder
>     .newBuilder(Bytes.toBytes("cf"))
>     .setBloomFilterType(BloomType.RIBBON_ROW)
>     .build();
> {code}
> h2. References
> - [Ribbon filter: practically smaller than Bloom and Xor|https://arxiv.org/abs/2103.02515]
> - [Design Docs|https://github.com/apache/hbase/blob/5298fd5951448b8b88ad29cb20819f34c19830e1/dev-support/design-docs/HBASE-29842%20Ribbon%20Filter%20Design.md]
