[ 
https://issues.apache.org/jira/browse/HBASE-30150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-30150.
----------------------------------
    Fix Version/s: 4.0.0-alpha-1
                   2.7.0
                   3.0.0-beta-2
                   2.6.6
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Propagate filter hints through composite filters
> ------------------------------------------------
>
>                 Key: HBASE-30150
>                 URL: https://issues.apache.org/jira/browse/HBASE-30150
>             Project: HBase
>          Issue Type: Improvement
>          Components: Filters, Scanners
>            Reporter: Shubham Roy
>            Assignee: Shubham Roy
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.6
>
>
> h3. Context
>   HBASE-29974 introduced two new Filter API methods, 
> {{getHintForRejectedRow(Cell)}} and {{getSkipHint(Cell)}}, that allow 
> filters to provide seek hints when rows are rejected by {{filterRowKey}} or 
> when cells are structurally skipped before {{filterCell}} is reached 
> (time-range gates, column-set exclusion, version-limit exhaustion). These 
> methods are correctly delegated through {{FilterWrapper}}, but the composite 
> filter wrappers do not propagate them.
>   h3. Problem
>   {{FilterListWithAND}}, {{FilterListWithOR}}, {{SkipFilter}}, and 
> {{WhileMatchFilter}} do not override or delegate {{getHintForRejectedRow}} or
>   {{getSkipHint}}. They inherit the no-op default from {{FilterBase}} which 
> returns {{null}}. This means:
>   * A filter graph like {{FilterList(AND, MultiRowRangeFilter, 
> ColumnPrefixFilter)}} will silently ignore any hint provided by sub-filters.
>   * Almost all real-world HBase filter configurations use {{FilterList}} to 
> compose filters. Until this JIRA is resolved, the hint optimization from
>   HBASE-29974 only benefits standalone (non-composed) filter usage.
>   * For CDC/replication use cases that combine filters with AND (e.g., a 
> skip-scan filter combined with a time-range-aware filter), the seek hint 
> path is effectively dead code.
>   This was explicitly documented as a limitation in the Javadoc of both new 
> methods in HBASE-29974 and deferred to this follow-up JIRA.
>   h3. Scope
>   The following classes need to override {{getHintForRejectedRow}} and 
> {{getSkipHint}} with appropriate composition semantics:
>   * *{{FilterListWithAND}}*: a row is rejected as soon as any sub-filter 
> rejects it, so the rows skipped by any one sub-filter's hint would not have 
> passed the AND-list anyway. When multiple sub-filters provide hints, the 
> composed hint should be the furthest one (furthest forward for forward 
> scans, furthest backward for reversed scans).
>   * *{{FilterListWithOR}}*: any sub-filter rejecting a row may provide a 
> hint, but the composed hint must be the least aggressive (closest to the 
> current position) since other sub-filters may still accept intermediate 
> rows.
>   * *{{SkipFilter}}*: should delegate to the wrapped filter if the wrapped 
> filter provides a hint.
>   * *{{WhileMatchFilter}}*: should delegate to the wrapped filter if the 
> wrapped filter provides a hint.
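
The delegation described for {{SkipFilter}} and {{WhileMatchFilter}} could look roughly like the sketch below. This is not HBase code: {{HintingFilter}} and {{DelegatingWrapper}} are hypothetical stand-ins, the type parameter {{C}} plays the role of {{Cell}}, and the exact signatures of the two hint methods in HBASE-29974 are assumed rather than copied.

```java
public class DelegationSketch {

  /**
   * Hypothetical stand-in for the two hint methods named in this issue.
   * The defaults mirror the no-op behavior described for FilterBase.
   */
  public interface HintingFilter<C> {
    default C getHintForRejectedRow(C firstRowCell) {
      return null; // no hint by default
    }

    default C getSkipHint(C skippedCell) {
      return null; // no hint by default
    }
  }

  /** Wrapper that forwards both hint calls to the filter it wraps. */
  public static final class DelegatingWrapper<C> implements HintingFilter<C> {
    private final HintingFilter<C> wrapped;

    public DelegatingWrapper(HintingFilter<C> wrapped) {
      this.wrapped = wrapped;
    }

    @Override
    public C getHintForRejectedRow(C firstRowCell) {
      // Pure pass-through: returns null when the wrapped filter has no hint.
      return wrapped.getHintForRejectedRow(firstRowCell);
    }

    @Override
    public C getSkipHint(C skippedCell) {
      // No wrapper state is touched, in line with the statelessness contract.
      return wrapped.getSkipHint(skippedCell);
    }
  }
}
```

Because the wrapper only forwards the call, a wrapped filter that does not override the hint methods still yields {{null}}, so existing behavior would stay unchanged.
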
>   h3. Key Design Considerations
>   * *Hint composition for AND semantics:* when sub-filter A hints to row-X 
> and sub-filter B hints to row-Y, the AND-list should use {{max(row-X, row-Y)}}
>   for forward scans and {{min(row-X, row-Y)}} for reversed scans — the 
> furthest hint is safe because ALL filters must accept.
>   * *Hint composition for OR semantics:* the OR-list should use {{min(row-X, 
> row-Y)}} for forward scans and {{max(row-X, row-Y)}} for reversed scans — the
>   closest hint is required because ANY filter accepting means the row should 
> not be skipped.
>   * *Null handling:* if any sub-filter returns {{null}} (no hint), the 
> composed result depends on the operator. For AND, null from one filter 
> means "no opinion", so the other hint can still be used. For OR, null from 
> one filter means "no shortcut available", so the entire composition must 
> fall back to {{null}}.
>   * *{{getSkipHint}} statelessness contract:* {{getSkipHint}} 
> implementations must not modify filter state, and the composition must 
> respect that contract. The composite override should call the sub-filters' 
> {{getSkipHint}} and compose the results without side effects.
>   * *Reversed scan direction:* hint composition must be direction-aware, 
> consistent with the contracts documented in HBASE-29974.
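
The AND/OR composition rules above, including null handling and scan direction, can be sketched as follows. This is a minimal illustration, not the code from the PR: it works on raw {{byte[]}} row keys instead of {{Cell}}, the class and method names ({{HintComposition}}, {{composeAnd}}, {{composeOr}}) are hypothetical, and it assumes Java 9+ for {{Arrays.compareUnsigned}}.

```java
import java.util.Arrays;
import java.util.List;

public class HintComposition {

  /** Unsigned lexicographic comparison, the ordering HBase uses for row keys. */
  static int compareRows(byte[] a, byte[] b) {
    return Arrays.compareUnsigned(a, b);
  }

  /** True when candidate lies further along the scan direction than current. */
  static boolean isFurther(byte[] candidate, byte[] current, boolean reversed) {
    int cmp = compareRows(candidate, current);
    return reversed ? cmp < 0 : cmp > 0;
  }

  /**
   * AND: null means "no opinion" and is ignored; the furthest remaining hint wins.
   * Returns null only when no sub-filter offered a hint.
   */
  public static byte[] composeAnd(List<byte[]> hints, boolean reversed) {
    byte[] best = null;
    for (byte[] hint : hints) {
      if (hint == null) {
        continue; // this sub-filter has no opinion
      }
      if (best == null || isFurther(hint, best, reversed)) {
        best = hint;
      }
    }
    return best;
  }

  /**
   * OR: a single null means "no shortcut available", so the whole result is null;
   * otherwise the hint closest to the current position wins.
   */
  public static byte[] composeOr(List<byte[]> hints, boolean reversed) {
    byte[] best = null;
    for (byte[] hint : hints) {
      if (hint == null) {
        return null; // one sub-filter cannot rule out the intermediate rows
      }
      if (best == null || isFurther(best, hint, reversed)) {
        best = hint;
      }
    }
    return best;
  }
}
```

Neither method keeps state between calls, which matches the requirement that composing {{getSkipHint}} results has no side effects. Pass the hints with {{Arrays.asList(...)}} rather than {{List.of(...)}}, since the latter rejects null elements.
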
>   h3. Test Plan
>   * Unit tests for {{FilterListWithAND}} and {{FilterListWithOR}} hint 
> composition: single hint provider, multiple hint providers, mixed 
> null/non-null, forward and reversed scans
>   * Unit tests for {{SkipFilter}} and {{WhileMatchFilter}} delegation
>   * Integration tests with composed filter graphs exercising the hint path 
> end-to-end (e.g., {{FilterList(AND, hintFilter, noHintFilter)}})
>   * Regression tests ensuring existing {{FilterList}} behavior is unchanged 
> when no sub-filter overrides the new methods
>   h3. References
>   * Parent JIRA: 
> [HBASE-29974|https://issues.apache.org/jira/browse/HBASE-29974]
>   * Master PR: [apache/hbase#7882|https://github.com/apache/hbase/pull/7882]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
