[ 
https://issues.apache.org/jira/browse/FLINK-32833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753143#comment-17753143
 ] 

Yun Tang commented on FLINK-32833:
----------------------------------

Hi [~mayuehappy], {{setCacheIndexAndFilterBlocks}} is necessary because we want to 
limit memory usage. We charge three parts (index and filter blocks, memtables, 
and data blocks) to the block cache so that overall memory usage stays bounded. 
This option is set for stability rather than performance.
The other two options, {{setCacheIndexAndFilterBlocksWithHighPriority}} and 
{{setPinL0FilterAndIndexBlocksInCache}}, were introduced to mitigate the 
performance regression caused by caching the filter and index blocks.
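
To illustrate the accounting described above, here is a rough sketch of how one 
shared budget could be split across the three parts. The class and method names 
are hypothetical, and the arithmetic is a simplification, not Flink's actual 
formula. The two ratios are modelled on the existing options 
{{state.backend.rocksdb.memory.write-buffer-ratio}} and 
{{state.backend.rocksdb.memory.high-prio-pool-ratio}}.

```java
/**
 * Illustrative only: splits one shared block-cache budget into the three
 * parts that are charged against it. Names are hypothetical and the
 * arithmetic is a simplification, not Flink's real calculation.
 */
public final class SharedCacheBudget {

    private SharedCacheBudget() {}

    /** Bytes that memtables (write buffers) may charge to the cache. */
    static long writeBufferBytes(long totalBytes, double writeBufferRatio) {
        checkRatio(writeBufferRatio);
        return Math.round(totalBytes * writeBufferRatio);
    }

    /** Bytes reserved in the high-priority pool for index and filter blocks. */
    static long highPriorityPoolBytes(long totalBytes, double highPrioPoolRatio) {
        checkRatio(highPrioPoolRatio);
        return Math.round(totalBytes * highPrioPoolRatio);
    }

    /** Whatever is left over is available for ordinary data blocks. */
    static long dataBlockBytes(long totalBytes, double writeBufferRatio, double highPrioPoolRatio) {
        if (writeBufferRatio + highPrioPoolRatio > 1.0) {
            throw new IllegalArgumentException("ratios must not sum to more than 1.0");
        }
        return totalBytes
                - writeBufferBytes(totalBytes, writeBufferRatio)
                - highPriorityPoolBytes(totalBytes, highPrioPoolRatio);
    }

    private static void checkRatio(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in [0, 1]: " + ratio);
        }
    }
}
```

The point of the sketch is that all three parts come out of the same total. If 
index and filter blocks were kept outside the cache, their memory would not be 
counted against the total and the limit could not be enforced.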

> Rocksdb CacheIndexAndFilterBlocks must be true when using shared memory
> -----------------------------------------------------------------------
>
>                 Key: FLINK-32833
>                 URL: https://issues.apache.org/jira/browse/FLINK-32833
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.17.1
>            Reporter: Yue Ma
>            Priority: Major
>
> Currently in RocksDBResourceContainer#getColumnOptions, if sharedResources is 
> used, blockBasedTableConfig will add the following configuration by default.
> {code:java}
> blockBasedTableConfig.setBlockCache(blockCache);
> blockBasedTableConfig.setCacheIndexAndFilterBlocks(true);
> blockBasedTableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
> blockBasedTableConfig.setPinL0FilterAndIndexBlocksInCache(true);{code}
> In my understanding, these configurations help Flink better manage the 
> memory of RocksDB and save some memory overhead in some scenarios. But this 
> may not be the best practice, mainly for the following reasons:
> 1. After CacheIndexAndFilterBlocks is set to true, reads may miss the cache 
> for index and filter blocks, resulting in performance degradation.
> 2. These parameters should not necessarily be tied to whether shared memory 
> is used; separate configuration options could be provided to decide whether 
> to enable these features.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
