Lin Liu created HUDI-9578:
-----------------------------

             Summary: Support reader caching inside FG reader
                 Key: HUDI-9578
                 URL: https://issues.apache.org/jira/browse/HUDI-9578
             Project: Apache Hudi
          Issue Type: New Feature
            Reporter: Lin Liu
             Fix For: 1.1.0


Currently, the `reader reuse` feature of `HoodieBackedMetadata` is no longer
used because of the integration of the FG reader.

As a result, when the same MDT file slice is read multiple times within a
single read query, Hudi has to re-open the file each time. This may cause a
performance regression because:
1. Each re-open issues one more GET request to S3, which adds latency and
monetary cost.
2. While the file is not closed, the cached data blocks and metadata blocks in
memory are not cleared and can potentially be rescanned. Re-opening the file
loses that benefit.

 

Therefore, we should implement a caching feature for the underlying readers
used by the FG reader, either for MDT scenarios only or for generic DT/MDT
scenarios.
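
The idea can be illustrated with a minimal sketch. This is not Hudi code: the
names `ReaderCache`, `CountingReader`, and `ReaderCacheDemo` are hypothetical,
and the file slice paths are made up. It only assumes that an underlying reader
is `Closeable` and can be keyed by the file it opens. The cache opens a reader
on the first request for a key, returns the same instance on later requests,
and closes everything once at the end of the query.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache that keeps one open reader per key for the life of a query. */
final class ReaderCache<K, R extends Closeable> implements AutoCloseable {
  private final Map<K, R> openReaders = new HashMap<>();
  private final Function<K, R> opener;

  ReaderCache(Function<K, R> opener) {
    this.opener = opener;
  }

  /** Returns the cached reader for the key, opening it only on the first request. */
  synchronized R getOrOpen(K key) {
    return openReaders.computeIfAbsent(key, opener);
  }

  /** Closes every cached reader; meant to be called once when the query finishes. */
  @Override
  public synchronized void close() {
    IOException first = null;
    for (R reader : openReaders.values()) {
      try {
        reader.close();
      } catch (IOException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    openReaders.clear();
    if (first != null) {
      throw new UncheckedIOException(first);
    }
  }
}

/** Stand-in for a real file reader; it only counts how often a file is opened. */
final class CountingReader implements Closeable {
  final String path;
  boolean closed = false;

  CountingReader(String path, AtomicInteger openCount) {
    this.path = path;
    openCount.incrementAndGet(); // a real reader would issue the GET request here
  }

  @Override
  public void close() {
    closed = true;
  }
}

final class ReaderCacheDemo {
  public static void main(String[] args) {
    AtomicInteger opens = new AtomicInteger();
    try (ReaderCache<String, CountingReader> cache =
             new ReaderCache<>(path -> new CountingReader(path, opens))) {
      cache.getOrOpen("mdt/files/slice-0"); // opens the file
      cache.getOrOpen("mdt/files/slice-0"); // reuses the open reader
      cache.getOrOpen("mdt/files/slice-1"); // opens a second file
      System.out.println("opens = " + opens.get()); // prints "opens = 2"
    }
  }
}
```

Scoping the cache to one query keeps the lifetime of the open readers bounded.
Whether the cache should cover only MDT readers or all DT/MDT readers is the
open design question raised above.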



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
