Ying Lin created HUDI-4314:
------------------------------

             Summary: Improve the performance of reading from the specified 
instant when the Flink streaming read application starts
                 Key: HUDI-4314
                 URL: https://issues.apache.org/jira/browse/HUDI-4314
             Project: Apache Hudi
          Issue Type: Improvement
          Components: flink
            Reporter: Ying Lin


When a Flink streaming read application starts, it begins reading from the 
specified instant (or resumes from the instant at which it was stopped).

We need to filter out the file paths that do not exist, because some files may 
have been removed by the cleaner.

The current implementation performs an _exists_ operation on all files. An 
optimization is to perform the _exists_ operation only on the latest-version 
files.
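
A minimal sketch of the idea, not the actual Hudi code: the class 
LatestFileFilter, the FileVersion type, and its fileId/instant fields are 
hypothetical stand-ins, and it assumes that instants compare correctly as 
strings and that "latest version" means the newest file per file ID. The 
existence check is passed in as a predicate; in a real job it would presumably 
wrap a filesystem call such as Hadoop's FileSystem#exists.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for one version of a data file; not a Hudi class.
final class FileVersion {
    final String fileId;   // identifies the file group (assumed name)
    final String instant;  // commit instant, assumed to sort lexicographically
    final String path;

    FileVersion(String fileId, String instant, String path) {
        this.fileId = fileId;
        this.instant = instant;
        this.path = path;
    }
}

final class LatestFileFilter {
    private LatestFileFilter() {}

    // Keep only the newest version for each file ID; no I/O happens here.
    static List<FileVersion> latestPerFileId(List<FileVersion> all) {
        Map<String, FileVersion> latest = new LinkedHashMap<>();
        for (FileVersion v : all) {
            latest.merge(v.fileId, v,
                (a, b) -> a.instant.compareTo(b.instant) >= 0 ? a : b);
        }
        return new ArrayList<>(latest.values());
    }

    // Run the (expensive) exists check once per file ID instead of once per file.
    static List<FileVersion> existingLatest(List<FileVersion> all,
                                            Predicate<String> exists) {
        List<FileVersion> result = new ArrayList<>();
        for (FileVersion v : latestPerFileId(all)) {
            if (exists.test(v.path)) {
                result.add(v);
            }
        }
        return result;
    }
}
```

With N file IDs and V versions each, this issues N existence checks rather 
than N * V, at the cost of one in-memory pass over the file list.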

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
