Ilya Soin created FLINK-39637:
---------------------------------
Summary: Filter push-down for the savepoint table connector
Key: FLINK-39637
URL: https://issues.apache.org/jira/browse/FLINK-39637
Project: Flink
Issue Type: Improvement
Components: API / State Processor
Reporter: Ilya Soin
h3. Problem
The savepoint table connector reads keyed state from Flink savepoints via SQL.
Currently, every query - even one with a _WHERE_ clause on the primary key -
must restore the keyed state backend and iterate over every key in every key
group. For large savepoints with millions of keys, this makes point lookups and
small-range queries unnecessarily expensive.
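For illustration, a point lookup like the one below still scans the whole savepoint today. This is only a sketch: the table name {_}orders_state{_}, the key column {_}k{_}, and the prior registration of the savepoint table are assumed, and the registration DDL is omitted.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class PointLookupExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());

        // "orders_state" is assumed to already be registered against a savepoint via the
        // savepoint table connector (DDL omitted). With the current connector, this point
        // lookup still restores and iterates every key in every key group.
        tEnv.executeSql("SELECT * FROM orders_state WHERE k = 42").print();
    }
}
{code}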
h3. Proposed solution
Implement _SupportsFilterPushDown_ on the savepoint table source, enabling the
Flink SQL planner to push key predicates ({_}=, IN, BETWEEN, <, <=, >, >={_},
and combinations via {_}AND/OR{_}) directly into the savepoint scan.
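A minimal sketch of what the push-down hook could look like is shown below. Only {_}applyFilters{_} is shown; the class name and the {_}isSupportedKeyPredicate{_} helper are illustrative, and in the real connector this would be mixed into the savepoint _DynamicTableSource_ rather than stand alone.
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.table.connector.source.abilities.SupportsFilterPushDown;
import org.apache.flink.table.expressions.ResolvedExpression;

// Sketch only: not actual connector code.
public class SavepointFilterPushDown implements SupportsFilterPushDown {

    private final List<ResolvedExpression> pushedKeyFilters = new ArrayList<>();

    @Override
    public Result applyFilters(List<ResolvedExpression> filters) {
        List<ResolvedExpression> accepted = new ArrayList<>();
        List<ResolvedExpression> remaining = new ArrayList<>();
        for (ResolvedExpression filter : filters) {
            // Hypothetical helper: accepts =, IN, BETWEEN, <, <=, >, >= (and AND/OR
            // combinations of those) that reference only the state key column.
            if (isSupportedKeyPredicate(filter)) {
                accepted.add(filter);
            } else {
                remaining.add(filter);
            }
        }
        pushedKeyFilters.addAll(accepted);
        // Accepted filters are applied inside the savepoint scan; everything else stays
        // with the planner. A filter may be returned in both lists if the source can
        // only apply it on a best-effort basis.
        return Result.of(accepted, remaining);
    }

    private boolean isSupportedKeyPredicate(ResolvedExpression expr) {
        // Placeholder: a real implementation would inspect CallExpression /
        // FieldReferenceExpression trees here.
        return false;
    }
}
{code}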
Filtering would be applied at two levels:
# *Split pruning* — For equality filters ({_}WHERE k = 42 or WHERE k IN (1, 2,
3){_}), the connector would determine which key groups can contain the target keys
and create input splits only for those key groups, so the state backend is never
restored for irrelevant portions of the savepoint (see the sketch after this list).
# *Key iteration pruning* — Within each input split, the key iterator would be
wrapped with a filter that skips non-matching keys before they reach the state
reader function. This applies to all supported filter types, including range
predicates, for which split-level pruning is not possible (also sketched below).
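A rough sketch of both pruning levels follows. The class and method names are illustrative, and the max parallelism would be taken from the savepoint metadata; reusing _KeyGroupRangeAssignment.assignToKeyGroup_ keeps the pruning consistent with the key-group assignment Flink used when the state was written.
{code:java}
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.function.Predicate;

import org.apache.flink.runtime.state.KeyGroupRangeAssignment;

// Illustrative only: these are not actual connector classes.
public final class KeyPruningSketch {

    // Split pruning: for WHERE k = 42 or WHERE k IN (1, 2, 3), map each literal key to
    // the key group it was written to, using the same assignment Flink applies to keyed
    // state. Input splits would then be created only for the returned key groups.
    public static Set<Integer> keyGroupsForLiterals(Iterable<?> keyLiterals, int maxParallelism) {
        Set<Integer> keyGroups = new HashSet<>();
        for (Object key : keyLiterals) {
            keyGroups.add(KeyGroupRangeAssignment.assignToKeyGroup(key, maxParallelism));
        }
        return keyGroups;
    }

    // Key iteration pruning: wrap the per-split key iterator so that keys rejected by the
    // pushed predicate never reach the state reader function. This also covers range
    // predicates, where split pruning alone cannot help.
    public static <K> Iterator<K> filteredKeys(Iterator<K> keys, Predicate<K> pushedKeyPredicate) {
        return new Iterator<K>() {
            private K next = advance();

            private K advance() {
                while (keys.hasNext()) {
                    K candidate = keys.next();
                    if (pushedKeyPredicate.test(candidate)) {
                        return candidate;
                    }
                }
                return null;
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public K next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                K result = next;
                next = advance();
                return result;
            }
        };
    }
}
{code}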
h3. Scope
* Filter push-down applies only to state keys, not to state values;
* Initially, push-down supports only primitive key types (e.g. {_}INT,
STRING{_}); composite key support can be added in a follow-up ticket.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)