[ 
https://issues.apache.org/jira/browse/FLINK-32833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yue Ma updated FLINK-32833:
---------------------------
    Description: 
Currently in RocksDBResourceContainer#getColumnOptions, if sharedResources is 
used, blockBasedTableConfig will add the following configuration by default.
{code:java}
blockBasedTableConfig.setBlockCache(blockCache);
blockBasedTableConfig.setCacheIndexAndFilterBlocks(true);
blockBasedTableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
blockBasedTableConfig.setPinL0FilterAndIndexBlocksInCache(true);{code}
As I understand it, these settings help Flink manage RocksDB memory and reduce 
memory overhead in some scenarios. However, this may not be the best practice, 
mainly for the following reasons:
1. With CacheIndexAndFilterBlocks set to true, index and filter blocks are 
stored in the block cache and can be evicted, so reads may miss on them and 
have to reload them, which degrades performance.
2. These parameters should not necessarily be tied to whether shared memory is 
used; separate configuration options could be provided to decide whether to 
enable each of these features.
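
One way to picture point 2 is the minimal sketch below. It is not Flink code: 
the class IndexFilterCachePolicy, its resolve method, and the idea of a 
nullable user override are all hypothetical names introduced only for 
illustration. It assumes that, when no explicit setting is given, the flags 
follow the shared-memory switch as described above, and that high priority and 
L0 pinning are enabled only together with index/filter caching.

```java
/**
 * Hypothetical sketch, not part of Flink or RocksDB: decides the index/filter
 * cache flags separately from the shared-memory switch.
 */
final class IndexFilterCachePolicy {
    final boolean cacheIndexAndFilterBlocks;
    final boolean cacheWithHighPriority;
    final boolean pinL0FilterAndIndexBlocks;

    private IndexFilterCachePolicy(boolean cache, boolean highPriority, boolean pinL0) {
        this.cacheIndexAndFilterBlocks = cache;
        this.cacheWithHighPriority = highPriority;
        this.pinL0FilterAndIndexBlocks = pinL0;
    }

    /**
     * @param sharedResourcesUsed whether the shared block cache is in use
     * @param userOverride explicit user choice (a hypothetical option), or null if unset
     */
    static IndexFilterCachePolicy resolve(boolean sharedResourcesUsed, Boolean userOverride) {
        // Assumption: with no explicit choice, follow the shared-memory switch.
        boolean cache = (userOverride != null) ? userOverride : sharedResourcesUsed;
        // Assumption: priority and pinning are only enabled together with caching.
        return new IndexFilterCachePolicy(cache, cache, cache);
    }

    public static void main(String[] args) {
        // Shared memory in use, but the user opts out of caching index/filter blocks.
        IndexFilterCachePolicy policy = resolve(true, Boolean.FALSE);
        System.out.println("cacheIndexAndFilterBlocks = " + policy.cacheIndexAndFilterBlocks);
    }
}
```

The resolved flags would then be passed to the corresponding 
blockBasedTableConfig setters shown above. Note the trade-off: with 
CacheIndexAndFilterBlocks set to false, index and filter blocks are held 
outside the block cache, so their memory is no longer bounded by the cache 
capacity.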

  was:
Currently in RocksDBResourceContainer#getColumnOptions, if sharedResources is 
used, blockBasedTableConfig will add the following configuration by default.

            blockBasedTableConfig.setBlockCache(blockCache);
            blockBasedTableConfig.setCacheIndexAndFilterBlocks(true);
            blockBasedTableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
            blockBasedTableConfig.setPinL0FilterAndIndexBlocksInCache(true);

In my understanding, these configurations can help flink better manage the 
memory of rocksdb and save some memory overhead in some scenarios. But this may 
not be the best practice, mainly for the following reasons:
1. After CacheIndexAndFilterBlocks is set to true, it may cause index and 
filter miss when reading, resulting in performance degradation.
2. These parameters may not be bound together with whether shared memory is 
used, or some configurations should be supported separately to decide whether 
to enable these features


> Rocksdb CacheIndexAndFilterBlocks must be true when using shared memory
> -----------------------------------------------------------------------
>
>                 Key: FLINK-32833
>                 URL: https://issues.apache.org/jira/browse/FLINK-32833
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.17.1
>            Reporter: Yue Ma
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
