[ https://issues.apache.org/jira/browse/FLINK-39988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18091449#comment-18091449 ]
Yibo Cao commented on FLINK-39988:
----------------------------------
Hi, I would like to take a look at this issue.
I have studied FLINK-39637 and understand that the current savepoint filter
push-down path supports primitive key predicates such as `WHERE k = 42`.
Since this would be my first contribution to Flink, I would like to start with
a limited and incremental scope: supporting equality push-down for ROW
composite keys only, for example `WHERE k = ROW('tenant-a', 42)`.
I understand this may not cover the full scope of FLINK-39988, as it would not
include POJO, Java record, Tuple, Avro, nested composite types, or range
predicates in the first patch. My intention is to provide a smaller first step
that can be reviewed and validated, and then continue expanding the support if
this direction is accepted.
Does this limited scope sound reasonable as an initial patch for this issue?
> Support composite types in filter push-down for savepoint Table API connector
> -----------------------------------------------------------------------------
>
> Key: FLINK-39988
> URL: https://issues.apache.org/jira/browse/FLINK-39988
> Project: Flink
> Issue Type: Improvement
> Components: API / State Processor
> Affects Versions: 2.4.0
> Reporter: Ilya Soin
> Priority: Minor
>
> FLINK-39637 added predicate push-down when querying state using the
> _savepoint_ connector. However, it only works with primitive types, such as
> _string_, _long_, etc. It can be improved by adding support for
> composite types:
> * POJOs
> * Java records
> * Flink Tuple
> * Flink Row
> * Avro records
> The general idea is the following:
> * The user describes the key column as _ROW(field1, field2, ..., fieldN)_ in
> the DDL and provides _value-class_, if needed.
> * _SavepointFilterTranslator_ knows the state key type and the filter the
> user is applying, e.g. _WHERE k = ROW('Bob', 20)_. It can construct an
> object _obj_ of the key type, filling its fields with the values the user
> supplied in _ROW('Bob', 20)_. It can then take _obj.hashCode()_ and derive
> the exact _InputSplit_ of the key (the same approach used for simple
> types), so that only the relevant splits are scanned.
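
As a rough illustration of the idea quoted above, here is a minimal, dependency-free sketch of how a composite key's hashCode() could be mapped to a key group and then to a subtask index. The arithmetic is modelled on what Flink's KeyGroupRangeAssignment and MathUtils.murmurHash are understood to do and should be verified against the Flink source. The class name, the PersonKey record, and the helper names are hypothetical. The mapping from a key group to a concrete _InputSplit_ in the savepoint connector is not shown.

```java
/** Sketch only: key-group arithmetic as Flink is understood to apply it to keyed state. */
final class CompositeKeySplitSketch {

    /** Hypothetical composite key, standing in for the real state key class. */
    record PersonKey(String name, int age) {}

    /** Bit-mixing finalizer in the style of 32-bit MurmurHash3. */
    static int bitMix(int in) {
        in ^= in >>> 16;
        in *= 0x85ebca6b;
        in ^= in >>> 13;
        in *= 0xc2b2ae35;
        in ^= in >>> 16;
        return in;
    }

    /** Scrambles a hash code and folds the result into a non-negative int. */
    static int murmurHash(int code) {
        code *= 0xcc9e2d51;
        code = Integer.rotateLeft(code, 15);
        code *= 0x1b873593;
        code = Integer.rotateLeft(code, 13);
        code = code * 5 + 0xe6546b64;
        code ^= 4;
        code = bitMix(code);
        if (code >= 0) {
            return code;
        }
        // Integer.MIN_VALUE has no positive counterpart, so map it to 0.
        return code != Integer.MIN_VALUE ? -code : 0;
    }

    /** Key group of a key: scrambled hashCode() modulo the max parallelism. */
    static int keyGroupFor(Object key, int maxParallelism) {
        return murmurHash(key.hashCode()) % maxParallelism;
    }

    /** Index of the subtask that owns the given key group. */
    static int subtaskFor(int keyGroup, int parallelism, int maxParallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        // Key object built from the literals in WHERE k = ROW('Bob', 20).
        PersonKey key = new PersonKey("Bob", 20);
        int maxParallelism = 128; // assumed value, normally read from the savepoint
        int parallelism = 4;      // assumed value
        int keyGroup = keyGroupFor(key, maxParallelism);
        int subtask = subtaskFor(keyGroup, parallelism, maxParallelism);
        System.out.println("key group = " + keyGroup + ", subtask = " + subtask);
    }
}
```

The result is only meaningful if the constructed object's hashCode() matches that of the key the job originally wrote, which is why the translator would need the real key class (or an equivalent) rather than a look-alike such as the record used here.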
--
This message was sent by Atlassian Jira
(v8.20.10#820010)