Francesco Guardiani created FLINK-25918:
-------------------------------------------
Summary: Use FileEnumerator to implement filter pushdown of
filepath metadata
Key: FLINK-25918
URL: https://issues.apache.org/jira/browse/FLINK-25918
Project: Flink
Issue Type: Improvement
Components: Connectors / FileSystem
Reporter: Francesco Guardiani
Right now, unless you configure partition keys, the table file source
ingests all the files in the provided {{path}}.
This means that a query like:
{code:sql}
SELECT * FROM MyFileTable WHERE filepath LIKE '%.csv'
{code}
will ingest all the files. Only after the records are loaded into Flink does
the filter run and discard every record that did not come from a file matching
the pattern {{%.csv}}.
Using the filter push-down feature provided by the DynamicTableSource stack, we
could instead pass the {{FileSourceBuilder}} a {{FileEnumerator}} that filters
the input files, so that files that cannot match the predicate are never read.
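
To illustrate the filtering step, here is a minimal, dependency-free sketch of how a pushed-down {{LIKE}} pattern could be turned into a path predicate. It is not the Flink implementation: the class {{LikePathFilter}} and its methods are hypothetical names, paths are plain strings rather than Flink {{Path}} objects, and {{ESCAPE}} clauses are not handled. The assumption is that a {{FileEnumerator}} would apply such a predicate while listing files.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Flink: turns a SQL LIKE pattern into a
// predicate over file paths so that non-matching files can be skipped.
public class LikePathFilter {

    /** Translates a LIKE pattern ('%' = any run of characters, '_' = one character). */
    public static Predicate<String> fromLike(String like) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%' || c == '_') {
                // Flush pending literal text, quoted so regex metacharacters stay literal.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        Pattern pattern = Pattern.compile(regex.toString(), Pattern.DOTALL);
        // LIKE must match the whole value, hence matches() rather than find().
        return path -> pattern.matcher(path).matches();
    }

    /** Keeps only the candidate paths accepted by the filter. */
    public static List<String> selectFiles(List<String> candidates, Predicate<String> filter) {
        return candidates.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files = List.of("/data/a.csv", "/data/b.json", "/data/sub/c.csv");
        // prints [/data/a.csv, /data/sub/c.csv]
        System.out.println(selectFiles(files, fromLike("%.csv")));
    }
}
```

With a predicate like this applied during enumeration, {{/data/b.json}} would never be opened, instead of being read in full and having its records discarded afterwards.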