Francesco Guardiani created FLINK-25918:
-------------------------------------------

             Summary: Use FileEnumerator to implement filter pushdown of 
filepath metadata 
                 Key: FLINK-25918
                 URL: https://issues.apache.org/jira/browse/FLINK-25918
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / FileSystem
            Reporter: Francesco Guardiani


Right now, unless you configure partition keys, the table file source will 
ingest all the files in the provided {{path}}.

For example, a query like:

{code:sql}
SELECT * FROM MyFileTable WHERE filepath LIKE '%.csv'
{code}

will ingest all the files. Only after the records are loaded into Flink does the 
filtering happen, discarding every record that does not come from a file 
matching the pattern '%.csv'.

Using the filter push-down feature provided by the DynamicTableSource stack, we 
could instead pass the {{FileSourceBuilder}} a {{FileEnumerator}} that filters 
the input files up front, so we can skip reading the non-matching files entirely.
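
As a rough illustration of the file-level filtering such an enumerator could apply, here is a minimal, Flink-independent sketch. The class {{LikePathFilter}} and its methods are hypothetical names, not existing Flink API. It assumes the pushed-down predicate is a plain LIKE pattern without an ESCAPE clause, and that paths are compared as strings.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Flink: turns a SQL LIKE pattern into a
// predicate that an enumerator could use to decide which files to list.
final class LikePathFilter {
    private LikePathFilter() {}

    /** Translates a LIKE pattern ('%' = any run of chars, '_' = one char) into a path predicate. */
    static Predicate<String> fromLike(String likePattern) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%' || c == '_') {
                // Flush the pending literal text, quoted so regex metacharacters stay literal.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        Pattern compiled = Pattern.compile(regex.toString(), Pattern.DOTALL);
        // LIKE must match the whole value, hence matches() rather than find().
        return path -> compiled.matcher(path).matches();
    }

    /** Keeps only the candidate paths that satisfy the LIKE pattern. */
    static List<String> prune(List<String> candidatePaths, String likePattern) {
        return candidatePaths.stream()
                .filter(fromLike(likePattern))
                .collect(Collectors.toList());
    }
}
```

With this sketch, {{LikePathFilter.prune(paths, "%.csv")}} would keep {{/data/a.csv}} and drop {{/data/b.json}} before any split is created. How the predicate is extracted from the pushed-down filter expression and wired into a {{FileEnumerator}} is left open here.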



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
