yihua commented on code in PR #13724:
URL: https://github.com/apache/hudi/pull/13724#discussion_r2335702528
##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReaderImpl.java:
##########
@@ -334,11 +342,11 @@ private boolean isAtFirstKeyOfBlock(BlockIndexEntry
indexEntry) {
* afterward
* @param numEntries the number of entries in the root index block
* @param levels the level of the indexes
- * @return
+ * @return single/multiple-level data block index
*/
private TreeMap<Key, BlockIndexEntry> readDataBlockIndex(HFileBlockReader
rootBlockReader,
- int numEntries,
- int levels) throws
IOException {
+ int numEntries,
+ int levels) throws IOException {
Review Comment:
Similar on the indentation
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -527,6 +527,33 @@ public final class HoodieMetadataConfig extends
HoodieConfig {
.withDocumentation("Max size in MB below which metadata file (HFile)
will be downloaded "
+ "and cached entirely for the HFileReader.");
+ public static final ConfigProperty<Boolean>
METADATA_HFILE_BLOCK_CACHE_ENABLED = ConfigProperty
+ .key(METADATA_PREFIX + ".hfile.block.cache.enabled")
+ .defaultValue(false)
+ .markAdvanced()
+ .sinceVersion("1.1.0")
+ .withDocumentation("Enable HFile block-level caching for metadata files.
This caches frequently "
+ + "accessed HFile blocks in memory to reduce I/O operations during
metadata queries. "
+ + "Improves performance for workloads with repeated metadata access
patterns.");
+
+ public static final ConfigProperty<Integer> METADATA_HFILE_BLOCK_CACHE_SIZE
= ConfigProperty
+ .key(METADATA_PREFIX + ".hfile.block.cache.size")
+ .defaultValue(100)
+ .markAdvanced()
+ .sinceVersion("1.1.0")
+ .withDocumentation("Maximum number of HFile blocks to cache in memory
per metadata file reader. "
+ + "Higher values improve cache hit rates but consume more memory. "
+ + "Only effective when hfile.block.cache.enabled is true.");
+
+ public static final ConfigProperty<Integer>
METADATA_HFILE_BLOCK_CACHE_TTL_MINUTES = ConfigProperty
+ .key(METADATA_PREFIX + ".hfile.block.cache.ttl.minutes")
+ .defaultValue(60)
+ .markAdvanced()
+ .sinceVersion("1.1.0")
+ .withDocumentation("Time-to-live (TTL) in minutes for cached HFile
blocks. Blocks are evicted "
+ + "from the cache after this duration to prevent memory leaks. "
+ + "Only effective when hfile.block.cache.enabled is true.");
Review Comment:
The `HoodieMetadataConfig` holds mostly write configs. Should we just use
`HFileReaderConfig` and the same configuration keys for controlling the HFile
block cache behavior (or define the `ConfigProperty` in `HoodieReaderConfig`
and use the same config keys)?
##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReaderImpl.java:
##########
@@ -281,7 +282,7 @@ Map<Key, BlockIndexEntry> getDataBlockIndexMap() {
* @throws IOException upon error.
*/
private static HFileTrailer readTrailer(SeekableDataInputStream stream,
- long fileSize) throws IOException {
+ long fileSize) throws IOException {
Review Comment:
nit: indentation does not follow the checkstyle
##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReaderImpl.java:
##########
@@ -395,4 +403,4 @@ private TreeMap<Key, BlockIndexEntry>
readDataBlockIndex(HFileBlockReader rootBl
// (6) Returns the combined index entry map
return blockIndexEntryMap;
}
-}
+}
Review Comment:
nit: add newline. Check other files too.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]