Shubham Roy created HBASE-30150:
-----------------------------------
Summary: [HBASE-29974] Propagate filter hints through composite filters
Key: HBASE-30150
URL: https://issues.apache.org/jira/browse/HBASE-30150
Project: HBase
Issue Type: Improvement
Components: Filters, Scanners
Reporter: Shubham Roy
Assignee: Shubham Roy
h3. Context
HBASE-29974 introduced two new Filter API methods —
{{getHintForRejectedRow(Cell)}} and {{getSkipHint(Cell)}} — that allow filters
to provide seek hints when rows are rejected by {{filterRowKey}} or when cells
are structurally
skipped before {{filterCell}} is reached (time-range gates, column-set
exclusion, version-limit exhaustion). These methods are correctly delegated
through {{FilterWrapper}}, but the composite filter wrappers do not propagate
them.
h3. Problem
{{FilterListWithAND}}, {{FilterListWithOR}}, {{SkipFilter}}, and
{{WhileMatchFilter}} do not override or delegate {{getHintForRejectedRow}} or
{{getSkipHint}}. They inherit the no-op default from {{FilterBase}} which
returns {{null}}. This means:
* A filter graph like {{FilterList(AND, MultiRowRangeFilter,
ColumnPrefixFilter)}} will silently ignore any hint provided by sub-filters.
* Almost all real-world HBase filter configurations use {{FilterList}} to
compose filters. Until this JIRA is resolved, the hint optimization from
HBASE-29974 only benefits standalone (non-composed) filter usage.
* For CDC/replication use cases that combine filters with AND (e.g., a
skip-scan filter combined with a time-range-aware filter), the seek hint path is
effectively dead code.
This was explicitly documented as a limitation in the Javadoc of both new
methods in HBASE-29974 and deferred to this follow-up JIRA.
h3. Scope
The following classes need to override {{getHintForRejectedRow}} and
{{getSkipHint}} with appropriate composition semantics:
* *{{FilterListWithAND}}* — a row is rejected as soon as any one sub-filter
rejects it, so each sub-filter's hint is individually safe to follow. When
multiple sub-filters provide hints, the composed hint should be the furthest
one in scan direction (furthest forward for forward scans, furthest backward
for reversed scans): every row it skips is rejected by at least one
sub-filter, and therefore by the AND list as a whole.
* *{{FilterListWithOR}}* — a row is rejected only when every sub-filter
rejects it, so a composed hint is usable only if all sub-filters provide one.
The composed hint must be the least aggressive (closest to the current
position), since other sub-filters may still accept intermediate rows.
* *{{SkipFilter}}* — should delegate to the wrapped filter if the wrapped
filter provides a hint.
* *{{WhileMatchFilter}}* — should delegate to the wrapped filter if the
wrapped filter provides a hint.
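The delegation described for {{SkipFilter}} and {{WhileMatchFilter}} could look roughly like the sketch below. It is illustrative only: {{HintSource}} and {{DelegatingWrapper}} are hypothetical stand-ins, cells are modelled as plain {{byte[]}} row keys rather than {{Cell}}, and the real method signatures from HBASE-29974 may differ (for example, by declaring {{IOException}}).

```java
import java.util.Objects;

/**
 * Hypothetical stand-in for the two hint methods added in HBASE-29974.
 * The defaults mirror the no-op behaviour described above: no hint (null).
 */
interface HintSource {
  default byte[] getHintForRejectedRow(byte[] firstRowCell) {
    return null;
  }

  default byte[] getSkipHint(byte[] skippedCell) {
    return null;
  }
}

/**
 * Hypothetical single-filter wrapper in the style of SkipFilter or
 * WhileMatchFilter. It forwards both hint calls to the wrapped filter and
 * keeps no state of its own, so a null hint from the wrapped filter stays null.
 */
final class DelegatingWrapper implements HintSource {
  private final HintSource wrapped;

  DelegatingWrapper(HintSource wrapped) {
    this.wrapped = Objects.requireNonNull(wrapped, "wrapped");
  }

  @Override
  public byte[] getHintForRejectedRow(byte[] firstRowCell) {
    return wrapped.getHintForRejectedRow(firstRowCell);
  }

  @Override
  public byte[] getSkipHint(byte[] skippedCell) {
    return wrapped.getSkipHint(skippedCell);
  }
}
```

Because the wrapper only forwards the call, it adds no side effects of its own; whether the wrapped filter's hint is always valid under the wrapper's own skip or early-termination semantics is a question this sketch does not address.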
h3. Key Design Considerations
* *Hint composition for AND semantics:* when sub-filter A hints to row-X and
sub-filter B hints to row-Y, the AND-list should use {{max(row-X, row-Y)}}
for forward scans and {{min(row-X, row-Y)}} for reversed scans — the furthest
hint is safe because ALL filters must accept.
* *Hint composition for OR semantics:* the OR-list should use {{min(row-X,
row-Y)}} for forward scans and {{max(row-X, row-Y)}} for reversed scans — the
closest hint is required because ANY filter accepting means the row should
not be skipped.
* *Null handling:* if any sub-filter returns {{null}} (no hint), the composed
result depends on the operator. For AND, null from one filter means "no
opinion" — the other hint can still be used. For OR, null from one filter
means "no shortcut available" — the entire composition must fall back to
{{null}}.
* *{{getSkipHint}} statelessness contract:* the composition must respect the
contract that {{getSkipHint}} implementations must not modify filter state.
The composite override should call sub-filters' {{getSkipHint}} and compose
results without side effects.
* *Reversed scan direction:* hint composition must be direction-aware,
consistent with the contracts documented in HBASE-29974.
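The composition rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: {{HintComposition}} and its methods are hypothetical names, hints are modelled as plain {{byte[]}} row keys instead of {{Cell}}, and ordering uses {{java.util.Arrays.compareUnsigned}} (Java 9+) as a stand-in for HBase's own cell comparator.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: composes sub-filter hints for AND / OR filter lists. */
final class HintComposition {

  /** True if candidate lies further along the scan direction than current. */
  static boolean isFurther(byte[] candidate, byte[] current, boolean reversed) {
    // Unsigned lexicographic byte order, used here as an assumption for row-key order.
    int cmp = Arrays.compareUnsigned(candidate, current);
    return reversed ? cmp < 0 : cmp > 0;
  }

  /** AND: a null hint means "no opinion"; return the furthest non-null hint, or null. */
  static byte[] composeAnd(List<byte[]> hints, boolean reversed) {
    byte[] best = null;
    for (byte[] hint : hints) {
      if (hint == null) {
        continue; // this sub-filter has no opinion; others may still hint
      }
      if (best == null || isFurther(hint, best, reversed)) {
        best = hint;
      }
    }
    return best;
  }

  /** OR: any null hint means no shortcut; otherwise return the closest hint. */
  static byte[] composeOr(List<byte[]> hints, boolean reversed) {
    byte[] best = null;
    for (byte[] hint : hints) {
      if (hint == null) {
        return null; // one sub-filter cannot rule out intermediate rows
      }
      if (best == null || isFurther(best, hint, reversed)) {
        best = hint; // keep the hint nearest to the current position
      }
    }
    return best;
  }

  public static void main(String[] args) {
    byte[] rowX = "row-03".getBytes(StandardCharsets.UTF_8);
    byte[] rowY = "row-07".getBytes(StandardCharsets.UTF_8);
    List<byte[]> both = Arrays.asList(rowX, rowY);
    // Forward scan: AND takes max(row-X, row-Y), OR takes min(row-X, row-Y).
    System.out.println(new String(composeAnd(both, false), StandardCharsets.UTF_8));
    System.out.println(new String(composeOr(both, false), StandardCharsets.UTF_8));
    // Mixed null/non-null: AND keeps the remaining hint, OR falls back to null.
    System.out.println(composeOr(Arrays.asList(rowX, null), false) == null);
  }
}
```

Both methods only read their inputs, which matches the statelessness requirement for {{getSkipHint}}. Passing the scan direction as a boolean is a simplification made for this sketch.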
h3. Test Plan
* Unit tests for {{FilterListWithAND}} and {{FilterListWithOR}} hint
composition — single hint provider, multiple hint providers, mixed
null/non-null, forward and reversed scans
* Unit tests for {{SkipFilter}} and {{WhileMatchFilter}} delegation
* Integration tests with composed filter graphs exercising the hint path
end-to-end (e.g., {{FilterList(AND, hintFilter, noHintFilter)}})
* Regression tests ensuring existing {{FilterList}} behavior is unchanged
when no sub-filter overrides the new methods
h3. References
* Parent JIRA: [HBASE-29974|https://issues.apache.org/jira/browse/HBASE-29974]
* Master PR: [apache/hbase#7882|https://github.com/apache/hbase/pull/7882]