[ 
https://issues.apache.org/jira/browse/HUDI-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-4314:
---------------------------------
    Labels: pull-request-available  (was: )

> Improve the performance of reading from the specified instant when the Flink 
> streaming read application starts
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-4314
>                 URL: https://issues.apache.org/jira/browse/HUDI-4314
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: flink
>            Reporter: Ying Lin
>            Priority: Major
>              Labels: pull-request-available
>
> When a Flink streaming read application starts, it begins reading from the 
> specified instant (or resumes from the instant at which it was stopped).
> We need to filter out file paths that no longer exist, because some files may 
> have been removed by the cleaner.
> The current implementation performs an _exists_ operation on every file; an 
> optimization is to perform the _exists_ operation only on the latest-version files.
>  
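
The optimization described above could be sketched roughly as follows. This
is an illustration under assumptions, not the code from the pull request:
LatestVersionFilter, FileVersion and existingLatestPaths are hypothetical
names, not Hudi APIs. The sketch also assumes that instant times are
fixed-width timestamp strings, and that only the latest version in each file
group needs to be read. The existence check is passed in as a predicate so
that the number of calls is easy to observe.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names are Hudi APIs.
final class LatestVersionFilter {

    /** One version of a data file: its file group, the instant that wrote it, and its path. */
    static final class FileVersion {
        final String fileId;
        final String instantTime;
        final String path;

        FileVersion(String fileId, String instantTime, String path) {
            this.fileId = fileId;
            this.instantTime = instantTime;
            this.path = path;
        }
    }

    /**
     * Keeps only the latest version of each file group, then runs the
     * exists check on those paths alone instead of on every candidate.
     */
    static List<String> existingLatestPaths(List<FileVersion> candidates,
                                            Predicate<String> exists) {
        Map<String, FileVersion> latest = new LinkedHashMap<>();
        for (FileVersion f : candidates) {
            // Assumption: instant times are fixed-width timestamps,
            // so string order matches time order.
            latest.merge(f.fileId, f,
                (a, b) -> a.instantTime.compareTo(b.instantTime) >= 0 ? a : b);
        }
        List<String> result = new ArrayList<>();
        for (FileVersion f : latest.values()) {
            if (exists.test(f.path)) {
                result.add(f.path);
            }
        }
        return result;
    }
}
```

With this shape, the number of exists calls equals the number of file groups
rather than the number of file versions, which is where the saving would come
from on tables with many retained versions per group.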



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
