Ying Lin created HUDI-4314:
------------------------------
Summary: Improve the performance of reading from the specified
instant when the Flink streaming read application starts
Key: HUDI-4314
URL: https://issues.apache.org/jira/browse/HUDI-4314
Project: Apache Hudi
Issue Type: Improvement
Components: flink
Reporter: Ying Lin
When a Flink streaming read application starts, it begins reading from the
specified instant (or resumes from the instant at which it was stopped).
We need to filter out the file paths that do not exist, because some files may
have been removed by the cleaner.
The current implementation performs an _exists_ operation on every file, so an
optimization is to perform the _exists_ operation only on the latest-version files.
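
A rough sketch of the idea, not the actual Hudi code: the class, field, and
method names below are all hypothetical, and the existence check is passed in
as a plain predicate instead of a real file system call. It assumes that files
can be grouped by a file ID, that instant times compare correctly as strings,
and that only the newest version in each group needs to be probed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: none of these names come from the Hudi codebase.
final class LatestVersionFilter {

    // One version of a file within a file group (all fields are illustrative).
    static final class FileVersion {
        final String fileId;      // identifies the file group
        final String instantTime; // instant that wrote this version
        final String path;        // storage path of this version

        FileVersion(String fileId, String instantTime, String path) {
            this.fileId = fileId;
            this.instantTime = instantTime;
            this.path = path;
        }
    }

    // Returns the paths of the latest version of each file group that still
    // exist. The exists predicate is called once per file group, not once
    // per file version.
    static List<String> existingLatestPaths(List<FileVersion> versions,
                                            Predicate<String> exists) {
        // Keep only the newest version per file ID (assumes instant times
        // are ordered lexicographically).
        Map<String, FileVersion> latest = new LinkedHashMap<>();
        for (FileVersion v : versions) {
            latest.merge(v.fileId, v,
                (a, b) -> a.instantTime.compareTo(b.instantTime) >= 0 ? a : b);
        }

        // Probe only the surviving candidates.
        List<String> result = new ArrayList<>();
        for (FileVersion v : latest.values()) {
            if (exists.test(v.path)) {
                result.add(v.path);
            }
        }
        return result;
    }
}
```

With N file groups and M versions per group, this issues N existence checks
instead of N * M. In a real reader the predicate would wrap the file system's
own exists call.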
--
This message was sent by Atlassian Jira
(v8.20.7#820007)