[
https://issues.apache.org/jira/browse/HUDI-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-4314:
---------------------------------
Labels: pull-request-available (was: )
> Improve the performance of reading from the specified instant when the Flink
> streaming read application starts
> --------------------------------------------------------------------------------------------------------------
>
> Key: HUDI-4314
> URL: https://issues.apache.org/jira/browse/HUDI-4314
> Project: Apache Hudi
> Issue Type: Improvement
> Components: flink
> Reporter: Ying Lin
> Priority: Major
> Labels: pull-request-available
>
> When a Flink streaming read application starts, it begins reading from the
> specified instant (or resumes from the instant at which it was stopped).
> We need to filter out file paths that no longer exist, because some files may
> have been removed by the cleaner.
> The current implementation performs an _exists_ operation on every file; an
> optimization is to perform the _exists_ operation only on latest-version files.
>
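
One way to read the proposed optimization is sketched below. This is an illustration only, not the code from the linked pull request: the class LatestVersionExistsFilter, the FileVersion descriptor, and the method latestExistingPaths are hypothetical names, not Hudi APIs. The sketch assumes that file versions can be grouped by a file group id, that instant times are equal-length timestamp strings (so lexicographic order matches chronological order), and that the reader only needs the newest version of each file group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of Hudi.
final class LatestVersionExistsFilter {

    // Hypothetical descriptor of one version of a file within a file group.
    static final class FileVersion {
        final String fileGroupId;
        final String instantTime; // assumed to be an equal-length timestamp string
        final String path;

        FileVersion(String fileGroupId, String instantTime, String path) {
            this.fileGroupId = fileGroupId;
            this.instantTime = instantTime;
            this.path = path;
        }
    }

    // Keeps only the newest version per file group, then probes existence
    // once per file group instead of once per file version.
    static List<String> latestExistingPaths(List<FileVersion> candidates,
                                            Predicate<String> exists) {
        Map<String, FileVersion> latestByGroup = new LinkedHashMap<>();
        for (FileVersion v : candidates) {
            // Lexicographic comparison is valid only under the
            // equal-length timestamp assumption stated above.
            latestByGroup.merge(v.fileGroupId, v,
                (a, b) -> a.instantTime.compareTo(b.instantTime) >= 0 ? a : b);
        }

        List<String> result = new ArrayList<>();
        for (FileVersion v : latestByGroup.values()) {
            // The (possibly remote and slow) exists call runs only here.
            if (exists.test(v.path)) {
                result.add(v.path);
            }
        }
        return result;
    }
}
```

With N file groups and V versions each, this performs N existence checks rather than N * V. The exists predicate is left abstract so that the sketch stays independent of any storage layer; in a real job it might wrap a file system call such as Hadoop's FileSystem.exists, with its checked IOException handled by the caller.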
--
This message was sent by Atlassian Jira
(v8.20.7#820007)