[ 
https://issues.apache.org/jira/browse/FLINK-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002693#comment-17002693
 ] 

Yun Tang commented on FLINK-15368:
----------------------------------

The basic idea of the end-to-end test for controlling RocksDB memory usage is to 
expose RocksDB native metrics and print them in the logs to track memory usage, 
much like the {{wait_oper_metric_num_in_records}} bash function used in many 
end-to-end tests. Some additional work is still needed, such as introducing the [block cache 
properties|https://github.com/facebook/rocksdb/blob/afa2420c2bf0304a4b8796cab219e859146cc031/include/rocksdb/db.h#L790]
 into Flink's RocksDB native metrics.
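
As a rough illustration of the log-based check, the sketch below scans log lines for block cache metrics and compares usage against capacity. The property names {{rocksdb.block-cache-capacity}}, {{rocksdb.block-cache-usage}} and {{rocksdb.block-cache-pinned-usage}} are the RocksDB ones; the log line format and the helper names are assumptions made for this example only, since the real format depends on how the metrics end up being printed.

```python
import re

# Assumed log format: "<anything> rocksdb.block-cache-<name>: <integer>".
# This is a hypothetical layout, not the format Flink actually prints.
METRIC_LINE = re.compile(r"(rocksdb\.block-cache-[a-z-]+):\s*(\d+)")


def latest_block_cache_metrics(log_lines):
    """Return the most recent value seen for each block cache metric."""
    latest = {}
    for line in log_lines:
        for name, value in METRIC_LINE.findall(line):
            latest[name] = int(value)
    return latest


def usage_within_capacity(metrics):
    """True when the reported usage does not exceed the reported capacity."""
    return (
        metrics["rocksdb.block-cache-usage"]
        <= metrics["rocksdb.block-cache-capacity"]
    )
```

A test script could call {{latest_block_cache_metrics}} on the task manager log periodically and fail once {{usage_within_capacity}} returns false.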

I have implemented exposing the block cache metrics in my private branch. However, 
I found that with a simplified version of {{DataStreamAllroundTestProgram}}, the 
block cache usage easily exceeds the capacity because of the large pinned 
usage.

I tried to avoid pinning the L0 and top-level index & filter blocks, but the 
problem persisted. I then tried allocating the LRUCache with the {{strictCapacityLimit=true}} 
property. However, the task manager then crashes easily with a RocksDB core dump. 
Still investigating.
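
To make the capacity behaviour concrete, here is a toy model of an LRU cache with pinned entries. It is a simplified sketch and not RocksDB's implementation: the class, its fields and the eviction loop are all hypothetical. It only shows why usage can exceed capacity when pinned entries cannot be evicted, and how a strict limit turns that overshoot into a rejected insert.

```python
from collections import OrderedDict


class ToyLRUCache:
    """Simplified model of an LRU cache with pinned entries (not RocksDB code)."""

    def __init__(self, capacity, strict_capacity_limit=False):
        self.capacity = capacity
        self.strict = strict_capacity_limit
        self.entries = OrderedDict()  # key -> (size, pinned), oldest first

    @property
    def usage(self):
        return sum(size for size, _ in self.entries.values())

    def insert(self, key, size, pinned=False):
        """Insert an entry, evicting unpinned entries (oldest first) to make room.

        Returns False when the strict limit rejects the insert.
        """
        # Evict unpinned entries until the new entry fits or none are left.
        for k in [k for k, (_, p) in self.entries.items() if not p]:
            if self.usage + size <= self.capacity:
                break
            del self.entries[k]
        # Pinned entries cannot be evicted, so the cache may still be too full.
        if self.usage + size > self.capacity and self.strict:
            return False
        self.entries[key] = (size, pinned)
        return True
```

With the limit off, inserting pinned entries of size 80 and 60 into a cache of capacity 100 leaves the usage at 140. With the limit on, the second insert is rejected and the caller has to handle that failure. This models only the accounting, not the cause of the core dump.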

> Add end-to-end test for controlling RocksDB memory usage
> --------------------------------------------------------
>
>                 Key: FLINK-15368
>                 URL: https://issues.apache.org/jira/browse/FLINK-15368
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>    Affects Versions: 1.10.0
>            Reporter: Yu Li
>            Assignee: Yun Tang
>            Priority: Critical
>             Fix For: 1.10.0
>
>
> We need to add an end-to-end test to make sure the RocksDB memory usage 
> control works well, especially under the slot sharing case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
