[ https://issues.apache.org/jira/browse/FLINK-32833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753650#comment-17753650 ]
Yue Ma commented on FLINK-32833:
--------------------------------

Hi [~yunta], thank you for your reply. I understand the original purpose of this code: these settings help us limit memory usage more tightly. My point is that users should be able to decide how to configure these parameters, rather than having them hardcoded in Flink.

In our previous tests, we found that CacheIndexAndFilterBlocks has a large impact on performance, especially in HDD environments. In the current Flink version, a user who enables shared memory and also wants to ensure high performance may additionally need to set an appropriate WRITE_BUFFER_RATIO or HIGH_PRIORITY_POOL_RATIO, which may be difficult for users to get right.

In other words, it also sounds reasonable for a user to want only data blocks in the cache while keeping the index and filter metadata resident in memory. What do you think?

> Rocksdb CacheIndexAndFilterBlocks must be true when using shared memory
> -----------------------------------------------------------------------
>
>                 Key: FLINK-32833
>                 URL: https://issues.apache.org/jira/browse/FLINK-32833
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.17.1
>            Reporter: Yue Ma
>            Priority: Major
>
> Currently, in RocksDBResourceContainer#getColumnOptions, if sharedResources is
> used, blockBasedTableConfig adds the following configuration by default.
> {code:java}
> blockBasedTableConfig.setBlockCache(blockCache);
> blockBasedTableConfig.setCacheIndexAndFilterBlocks(true);
> blockBasedTableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
> blockBasedTableConfig.setPinL0FilterAndIndexBlocksInCache(true);{code}
> In my understanding, these settings help Flink better manage RocksDB's
> memory and save some memory overhead in some scenarios. But this may not be
> the best practice, mainly for the following reasons:
> 1. After CacheIndexAndFilterBlocks is set to true, reads may miss on index
> and filter blocks in the block cache, resulting in performance degradation.
> 2. These parameters should not necessarily be tied to whether shared memory
> is used; alternatively, separate configuration options should be supported
> to decide whether to enable each of these features.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
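
To illustrate the proposal in this thread, here is a minimal sketch of how the index/filter settings could be decoupled from shared-memory usage. It models only the decision logic and does not call RocksDB or Flink. The class name IndexFilterCacheOptions, the method forSharedMemory, and the nullable user override are all hypothetical; no such option is confirmed to exist in Flink. The sketch also assumes that high-priority caching and L0 pinning should follow the main flag, since they only matter when index and filter blocks are placed in the block cache.

```java
/**
 * Hypothetical holder for the three settings that are currently
 * hardcoded to true when shared resources are used.
 */
final class IndexFilterCacheOptions {
    final boolean cacheIndexAndFilterBlocks;
    final boolean cacheIndexAndFilterBlocksWithHighPriority;
    final boolean pinL0FilterAndIndexBlocksInCache;

    IndexFilterCacheOptions(boolean cache, boolean highPriority, boolean pinL0) {
        this.cacheIndexAndFilterBlocks = cache;
        this.cacheIndexAndFilterBlocksWithHighPriority = highPriority;
        this.pinL0FilterAndIndexBlocksInCache = pinL0;
    }

    /**
     * Resolves the settings for the shared-memory case.
     *
     * @param userChoice hypothetical user-facing option; null means
     *                   "not set", which keeps today's behavior (all true).
     */
    static IndexFilterCacheOptions forSharedMemory(Boolean userChoice) {
        boolean cache = (userChoice == null) || userChoice;
        // Assumption: the two dependent flags simply follow the main flag.
        return new IndexFilterCacheOptions(cache, cache, cache);
    }
}
```

With this shape, a user who wants index and filter blocks to stay outside the block cache would pass false, while leaving the option unset preserves the existing defaults.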