[ 
https://issues.apache.org/jira/browse/HUDI-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-9578:
---------------------------------
    Labels: pull-request-available  (was: )

> Support reader caching inside FG reader
> ---------------------------------------
>
>                 Key: HUDI-9578
>                 URL: https://issues.apache.org/jira/browse/HUDI-9578
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Lin Liu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.1.0
>
>
> Currently, the `reader reuse` feature of `HoodieBackedMetadata` is not used 
> because of the integration of the FG reader.
> As a result, when the same MDT file slice is read multiple times within a 
> single read query, Hudi has to re-open the file each time. This may cause a 
> performance regression because:
> 1. Each re-open issues one more GET request to S3, which adds both latency 
> and monetary cost.
> 2. When the file is not closed, the cached data blocks and metadata blocks in 
> memory are not cleared. They can potentially be rescanned.
>  
> Therefore, we should implement a caching feature for the underlying readers 
> used by the FG reader, either for MDT scenarios only or for generic DT/MDT 
> scenarios.
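
One possible shape for such a cache, as a minimal sketch only: the names `ReaderCache`, `getOrOpen`, and `FakeReader` below are hypothetical and are not part of Hudi's API, and the issue does not say how the actual change is implemented. The sketch assumes the cache lives for the duration of one read query, is keyed by some identifier of the file slice, and closes every cached reader once the query finishes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical query-scoped cache of open readers; not a Hudi class. */
final class ReaderCache<K, R extends AutoCloseable> implements AutoCloseable {
  private final Map<K, R> readers = new HashMap<>();
  private final Function<K, R> opener;

  ReaderCache(Function<K, R> opener) {
    this.opener = opener;
  }

  /** Returns the cached reader for the key, opening it only on first use. */
  synchronized R getOrOpen(K key) {
    return readers.computeIfAbsent(key, opener);
  }

  synchronized int size() {
    return readers.size();
  }

  /** Closes every cached reader; failures are collected as suppressed exceptions. */
  @Override
  public synchronized void close() {
    RuntimeException failure = null;
    for (R reader : readers.values()) {
      try {
        reader.close();
      } catch (Exception e) {
        if (failure == null) {
          failure = new RuntimeException("failed to close cached reader(s)");
        }
        failure.addSuppressed(e);
      }
    }
    readers.clear();
    if (failure != null) {
      throw failure;
    }
  }
}

/** Stand-in for a real file reader; it only counts opens and closes. */
final class FakeReader implements AutoCloseable {
  static final AtomicInteger OPENS = new AtomicInteger();
  static final AtomicInteger CLOSES = new AtomicInteger();
  final String path;

  FakeReader(String path) {
    this.path = path;
    OPENS.incrementAndGet(); // a real reader would issue the GET request here
  }

  @Override
  public void close() {
    CLOSES.incrementAndGet();
  }
}
```

With `new ReaderCache<String, FakeReader>(FakeReader::new)`, repeated `getOrOpen` calls for the same key return the same reader instance, so the file is opened once per query rather than once per read. A real implementation would also need to decide on eviction and on whether readers can safely be shared across threads, which this sketch does not address.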



--
This message was sent by Atlassian Jira
(v8.20.10#820010)