yihua commented on code in PR #13724:
URL: https://github.com/apache/hudi/pull/13724#discussion_r2335702528


##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReaderImpl.java:
##########
@@ -334,11 +342,11 @@ private boolean isAtFirstKeyOfBlock(BlockIndexEntry indexEntry) {
    *                        afterward
    * @param numEntries      the number of entries in the root index block
    * @param levels          the level of the indexes
-   * @return
+   * @return single/multiple-level data block index
    */
   private TreeMap<Key, BlockIndexEntry> readDataBlockIndex(HFileBlockReader rootBlockReader,
-                                                           int numEntries,
-                                                           int levels) throws IOException {
+      int numEntries,
+      int levels) throws IOException {

Review Comment:
   Similar indentation issue here; please align with the checkstyle rules.



##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -527,6 +527,33 @@ public final class HoodieMetadataConfig extends HoodieConfig {
       .withDocumentation("Max size in MB below which metadata file (HFile) will be downloaded "
           + "and cached entirely for the HFileReader.");
 
+  public static final ConfigProperty<Boolean> METADATA_HFILE_BLOCK_CACHE_ENABLED = ConfigProperty
+      .key(METADATA_PREFIX + ".hfile.block.cache.enabled")
+      .defaultValue(false)
+      .markAdvanced()
+      .sinceVersion("1.1.0")
+      .withDocumentation("Enable HFile block-level caching for metadata files. This caches frequently "
+          + "accessed HFile blocks in memory to reduce I/O operations during metadata queries. "
+          + "Improves performance for workloads with repeated metadata access patterns.");
+
+  public static final ConfigProperty<Integer> METADATA_HFILE_BLOCK_CACHE_SIZE = ConfigProperty
+      .key(METADATA_PREFIX + ".hfile.block.cache.size")
+      .defaultValue(100)
+      .markAdvanced()
+      .sinceVersion("1.1.0")
+      .withDocumentation("Maximum number of HFile blocks to cache in memory per metadata file reader. "
+          + "Higher values improve cache hit rates but consume more memory. "
+          + "Only effective when hfile.block.cache.enabled is true.");
+
+  public static final ConfigProperty<Integer> METADATA_HFILE_BLOCK_CACHE_TTL_MINUTES = ConfigProperty
+      .key(METADATA_PREFIX + ".hfile.block.cache.ttl.minutes")
+      .defaultValue(60)
+      .markAdvanced()
+      .sinceVersion("1.1.0")
+      .withDocumentation("Time-to-live (TTL) in minutes for cached HFile blocks. Blocks are evicted "
+          + "from the cache after this duration to prevent memory leaks. "
+          + "Only effective when hfile.block.cache.enabled is true.");
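For illustration only (this helper is not part of the PR), the three configs above together describe a bounded LRU cache whose entries expire after a fixed TTL. A minimal, JDK-only sketch of those eviction semantics, with hypothetical names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the semantics behind the three configs above:
// a bounded LRU cache (cache.size) whose entries expire after a fixed
// TTL (cache.ttl.minutes). The PR's actual implementation is not shown here.
class BlockCacheSketch<K, V> {

  private static final class CachedBlock<V> {
    final V value;
    final long insertedAtMillis;

    CachedBlock(V value, long insertedAtMillis) {
      this.value = value;
      this.insertedAtMillis = insertedAtMillis;
    }
  }

  private final long ttlMillis;
  private final LinkedHashMap<K, CachedBlock<V>> cache;

  BlockCacheSketch(int maxBlocks, long ttlMillis) {
    this.ttlMillis = ttlMillis;
    // accessOrder=true makes the LinkedHashMap track least-recently-used order;
    // removeEldestEntry evicts the LRU block once maxBlocks is exceeded.
    this.cache = new LinkedHashMap<K, CachedBlock<V>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, CachedBlock<V>> eldest) {
        return size() > maxBlocks;
      }
    };
  }

  void put(K key, V value, long nowMillis) {
    cache.put(key, new CachedBlock<>(value, nowMillis));
  }

  V get(K key, long nowMillis) {
    CachedBlock<V> e = cache.get(key);
    if (e == null) {
      return null;
    }
    if (nowMillis - e.insertedAtMillis > ttlMillis) {
      // Expired: drop the block, mirroring the TTL eviction the config describes.
      cache.remove(key);
      return null;
    }
    return e.value;
  }
}
```

Time is passed in explicitly so the sketch is deterministic; a real implementation would use a clock and likely a concurrent cache.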

Review Comment:
   The `HoodieMetadataConfig` mostly holds write configs.  Should we instead use `HFileReaderConfig` with the same configuration keys to control the HFile block cache behavior (or define the `ConfigProperty` in `HoodieReaderConfig` and reuse the same config keys)?
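If the suggestion is adopted, the property could be declared in `HoodieReaderConfig` while keeping the existing key, so the user-facing config surface is unchanged. A sketch only (the field name, the literal key string, and the placement are assumptions, not the PR's final code):

```java
// Sketch: the builder calls mirror those in the diff above; declaring the
// property in HoodieReaderConfig, per the review suggestion, is an assumption,
// as is the literal key (assuming METADATA_PREFIX resolves to "hoodie.metadata").
public static final ConfigProperty<Boolean> HFILE_BLOCK_CACHE_ENABLED = ConfigProperty
    .key("hoodie.metadata.hfile.block.cache.enabled")  // same key as the diff, so behavior is unchanged
    .defaultValue(false)
    .markAdvanced()
    .sinceVersion("1.1.0")
    .withDocumentation("Enable HFile block-level caching for metadata files.");
```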



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReaderImpl.java:
##########
@@ -281,7 +282,7 @@ Map<Key, BlockIndexEntry> getDataBlockIndexMap() {
    * @throws IOException upon error.
    */
   private static HFileTrailer readTrailer(SeekableDataInputStream stream,
-                                          long fileSize) throws IOException {
+      long fileSize) throws IOException {

Review Comment:
   nit: the indentation does not follow the checkstyle rules



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReaderImpl.java:
##########
@@ -395,4 +403,4 @@ private TreeMap<Key, BlockIndexEntry> readDataBlockIndex(HFileBlockReader rootBl
     // (6) Returns the combined index entry map
     return blockIndexEntryMap;
   }
-}
+}

Review Comment:
   nit: add a trailing newline at end of file.  Check other files too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
