Donatien created FLINK-29402:
--------------------------------

             Summary: Add USE_DIRECT_READ configuration parameter for RocksDB
                 Key: FLINK-29402
                 URL: https://issues.apache.org/jira/browse/FLINK-29402
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / State Backends
    Affects Versions: 1.15.2
            Reporter: Donatien
             Fix For: 1.15.2


RocksDB allows the use of DirectIO for read operations to bypass the Linux Page 
Cache. To understand the impact of the Linux Page Cache on performance, one can 
run a heavy workload on a single-tasked Task Manager with a container memory 
limit identical to the TM process memory. Running the same workload on a TM with 
no container memory limit results in better performance, but the host memory 
usage then exceeds the TM requirement.

The Linux Page Cache is of course useful, but it can give misleading results 
when benchmarking the Managed Memory used by RocksDB. DirectIO is typically 
enabled for benchmarks on working set estimation [Zwaenepoel et 
al.|https://arxiv.org/abs/1702.04323].

I propose adding a configuration key that lets users enable DirectIO for reads 
through the RocksDB API. This option would be disabled by default.
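
Until such a key exists, the same effect can probably be approximated with a 
custom options factory. The following is a minimal sketch, not the proposed 
implementation: the class name DirectReadOptionsFactory is hypothetical, and it 
assumes flink-statebackend-rocksdb (with its bundled RocksDB Java classes) is 
on the classpath. It relies on DBOptions#setUseDirectReads from the RocksDB 
Java API.

```java
import java.util.Collection;

import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

/** Hypothetical options factory that enables direct I/O for RocksDB reads. */
public class DirectReadOptionsFactory implements RocksDBOptionsFactory {

    private static final long serialVersionUID = 1L;

    @Override
    public DBOptions createDBOptions(
            DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // Ask RocksDB to read SST files with direct I/O, bypassing the OS page cache.
        // RocksDB leaves this off by default.
        return currentOptions.setUseDirectReads(true);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(
            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // Column family options are left unchanged in this sketch.
        return currentOptions;
    }
}
```

Such a factory could be registered with the existing 
state.backend.rocksdb.options-factory setting, or programmatically via 
EmbeddedRocksDBStateBackend#setRocksDBOptions. A dedicated configuration key 
would remove the need for this extra class.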



--
This message was sent by Atlassian Jira
(v8.20.10#820010)