[ 
https://issues.apache.org/jira/browse/FLINK-32833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753650#comment-17753650
 ] 

Yue Ma commented on FLINK-32833:
--------------------------------

Hi [~yunta], thank you for your reply.
I understand the original purpose of this code: these configurations help 
us limit memory usage more effectively. 

What I want to say is that users may need to configure these parameters 
themselves, instead of having them hardcoded in the Flink code. In our previous 
tests, we found that caching the index and filter blocks has a great impact on 
performance, especially in HDD environments. In the current Flink version, a 
user who uses shared memory and also wants high performance may additionally 
need to set an appropriate WRITE_BUFFER_RATIO or HIGH_PRIORITY_POOL_RATIO, 
which may be difficult for users to tune. In other words, it also sounds 
reasonable for a user to want only the data blocks in the cache, while keeping 
the index and filter metadata resident in memory. What do you think? 
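
To illustrate why these ratios are hard to pick, here is a rough sketch of the 
budget arithmetic. It is a simplified model, not Flink's actual computation: 
the class SharedMemoryBudget and its methods are hypothetical, the ratios are 
treated as plain fractions of the total shared memory, and the numbers in main 
are example values only.

```java
/**
 * Hypothetical helper (not part of Flink or RocksDB): a simplified model of how
 * a shared memory budget could be divided by two ratios.
 */
final class SharedMemoryBudget {
    final long writeBufferBytes;
    final long highPriorityBytes;
    final long dataBlockBytes;

    private SharedMemoryBudget(long writeBufferBytes, long highPriorityBytes, long dataBlockBytes) {
        this.writeBufferBytes = writeBufferBytes;
        this.highPriorityBytes = highPriorityBytes;
        this.dataBlockBytes = dataBlockBytes;
    }

    /** Assumption: both ratios are fractions of the same total and must sum to at most 1. */
    static SharedMemoryBudget split(long totalBytes, double writeBufferRatio, double highPriorityPoolRatio) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        if (writeBufferRatio < 0 || highPriorityPoolRatio < 0
                || writeBufferRatio + highPriorityPoolRatio > 1.0) {
            throw new IllegalArgumentException("ratios must be non-negative and sum to at most 1");
        }
        long writeBuffer = (long) (totalBytes * writeBufferRatio);
        long highPriority = (long) (totalBytes * highPriorityPoolRatio);
        // Whatever is left over is available for ordinary data blocks.
        return new SharedMemoryBudget(writeBuffer, highPriority, totalBytes - writeBuffer - highPriority);
    }

    /** True if the given index and filter footprint fits in the high-priority share. */
    boolean indexAndFilterFit(long indexAndFilterBytes) {
        return indexAndFilterBytes <= highPriorityBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        SharedMemoryBudget budget = split(oneGiB, 0.5, 0.1);
        long indexAndFilterBytes = 200L * 1024 * 1024; // example footprint
        System.out.println("write buffers: " + budget.writeBufferBytes + " bytes");
        System.out.println("high-priority: " + budget.highPriorityBytes + " bytes");
        System.out.println("data blocks:   " + budget.dataBlockBytes + " bytes");
        System.out.println("index/filter fit: " + budget.indexAndFilterFit(indexAndFilterBytes));
    }
}
```

Under this model, once the index and filter blocks of a job outgrow the 
high-priority share, they start competing with data blocks for cache space, 
and the user has to estimate that footprint in advance to choose the ratios.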

> Rocksdb CacheIndexAndFilterBlocks must be true when using shared memory
> -----------------------------------------------------------------------
>
>                 Key: FLINK-32833
>                 URL: https://issues.apache.org/jira/browse/FLINK-32833
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.17.1
>            Reporter: Yue Ma
>            Priority: Major
>
> Currently, in RocksDBResourceContainer#getColumnOptions, when sharedResources 
> is used, the following settings are applied to blockBasedTableConfig by default.
> {code:java}
> blockBasedTableConfig.setBlockCache(blockCache);
> blockBasedTableConfig.setCacheIndexAndFilterBlocks(true);
> blockBasedTableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
> blockBasedTableConfig.setPinL0FilterAndIndexBlocksInCache(true);{code}
> In my understanding, these configurations help Flink better manage the 
> memory of RocksDB and save some memory overhead in some scenarios. But this 
> may not be the best practice, mainly for the following reasons:
> 1. Once CacheIndexAndFilterBlocks is set to true, index and filter blocks 
> live in the block cache and can be evicted, so reads may incur index and 
> filter misses, resulting in performance degradation.
> 2. These parameters should not necessarily be tied to whether shared memory 
> is used; separate configuration options could be provided to decide whether 
> to enable each of these features.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)