[
https://issues.apache.org/jira/browse/HUDI-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-9578:
---------------------------------
Labels: pull-request-available (was: )
> Support reader caching inside FG reader
> ---------------------------------------
>
> Key: HUDI-9578
> URL: https://issues.apache.org/jira/browse/HUDI-9578
> Project: Apache Hudi
> Issue Type: New Feature
> Reporter: Lin Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.1.0
>
>
> Currently the `reader reuse` feature of `HoodieBackedMetadata` is not used
> because of the integration of the file group (FG) reader.
> As a result, when the same metadata table (MDT) file slice is read multiple
> times within one read query, Hudi has to re-open the file each time. This may
> cause a performance regression because:
> 1. Each re-open issues one more GET request to S3, which adds both latency
> and monetary cost.
> 2. When the file is not closed, the data blocks and metadata blocks cached in
> memory are not cleared, so they can potentially be rescanned.
>
> Therefore, we should implement a caching feature for the underlying readers
> used by the FG reader, either for MDT scenarios only or for generic data
> table (DT)/MDT scenarios.
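
The idea described above can be illustrated with a minimal sketch. It is not the Hudi implementation: `ReaderCacheSketch`, `ReaderCache`, `FakeFileReader`, and the slice key used in `main` are all hypothetical names made up for illustration, and the sketch assumes one cache instance lives for the duration of a single read query. The open counter stands in for the GET request issued when a file is opened.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class ReaderCacheSketch {

  /** Hypothetical stand-in for an underlying file reader; not a Hudi class. */
  static final class FakeFileReader implements AutoCloseable {
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();
    final String path;
    volatile boolean closed = false;

    FakeFileReader(String path) {
      this.path = path;
      OPEN_COUNT.incrementAndGet(); // models the GET request made on open
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Keeps at most one open reader per key until the cache itself is closed. */
  static final class ReaderCache<K, R extends AutoCloseable> implements AutoCloseable {
    private final Map<K, R> readers = new ConcurrentHashMap<>();
    private final Function<K, R> opener;

    ReaderCache(Function<K, R> opener) {
      this.opener = opener;
    }

    /** Returns the cached reader for the key, opening it only on first use. */
    R getOrOpen(K key) {
      return readers.computeIfAbsent(key, opener);
    }

    /** Closes every cached reader, then reports the first failure, if any. */
    @Override
    public void close() {
      Exception firstFailure = null;
      for (R reader : readers.values()) {
        try {
          reader.close();
        } catch (Exception e) {
          if (firstFailure == null) {
            firstFailure = e;
          }
        }
      }
      readers.clear();
      if (firstFailure != null) {
        throw new IllegalStateException("Failed to close a cached reader", firstFailure);
      }
    }
  }

  public static void main(String[] args) {
    try (ReaderCache<String, FakeFileReader> cache = new ReaderCache<>(FakeFileReader::new)) {
      // Two lookups of the same (hypothetical) file slice key within one query.
      FakeFileReader first = cache.getOrOpen("mdt/files/slice-0001");
      FakeFileReader second = cache.getOrOpen("mdt/files/slice-0001");
      System.out.println("same reader instance: " + (first == second));
      System.out.println("opens so far: " + FakeFileReader.OPEN_COUNT.get());
    }
  }
}
```

Tying the cache lifetime to the query through try-with-resources is one possible design choice: readers stay open while repeated lookups can benefit from them, and all of them are released together when the query finishes.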
--
This message was sent by Atlassian Jira
(v8.20.10#820010)