Myasuka opened a new pull request #10930: [FLINK-15368][e2e] Add end-to-end 
test for controlling RocksDB memory usage
URL: https://github.com/apache/flink/pull/10930
 
 
   
   ## What is the purpose of the change
   
   Add an end-to-end test for controlling RocksDB memory usage. The job has 4 
states in 4 different operators, and all the operators share one slot.
   
   **NOTE:** This end-to-end test could be unstable when there are too many 
unflushed immutable mem-tables. I wrote [a doc to explain how the write buffer 
manager works in 
RocksDB](https://docs.google.com/document/d/1_4Brwy2Axzzqu7SJ4hLLl92hVCpeRlVEG-fj8KsTMUo/edit#heading=h.f5wfmsmpemd0).
 The doc explains that, in the **worst** case, the total memory usage could be 
much higher than expected.
   
   Below is the general test result:
   1GB TM, 2 slots, each without memory control.
   When we do not control memory usage over RocksDB instances, the total memory 
is the sum of `block-cache-usage` and `total-mem-table` across all 4 states. 
As you can see, the total memory usage in one slot can exceed 400MB.
   <img width="1319" alt="111" 
src="https://user-images.githubusercontent.com/1709104/72965411-31cdaa80-3df7-11ea-843d-1565d7b7b89d.png">
   
   1GB TM, 2 slots, each with 161061276 bytes of managed off-heap memory.
   Since all RocksDB instances share the same cache, the total memory usage is 
the block cache usage. As you can see, the memory usage stays close to 
161061276 bytes.
   <img width="1266" alt="image" 
src="https://user-images.githubusercontent.com/1709104/72965622-ce904800-3df7-11ea-8a04-b818f67929c4.png">
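
   The two accounting rules described above can be sketched as follows. This is 
only an illustration, not code from this PR: the class and method names are 
hypothetical, and the per-state numbers are made up so that the sum lands in the 
"400MB+" range. Only the 161061276-byte budget comes from the test setup above.

```java
public class RocksDBMemoryAccounting {
    static final long MIB = 1024L * 1024L;

    // Without memory control: every state has its own block cache and
    // mem-tables, so the slot total is the sum over all states.
    static long totalWithoutMemoryControl(long[] blockCacheUsage, long[] memTableUsage) {
        if (blockCacheUsage.length != memTableUsage.length) {
            throw new IllegalArgumentException("one entry per state is expected in both arrays");
        }
        long total = 0;
        for (int i = 0; i < blockCacheUsage.length; i++) {
            total += blockCacheUsage[i] + memTableUsage[i];
        }
        return total;
    }

    // With memory control: all instances share one cache, so the usage of
    // that shared cache is taken as the total.
    static long totalWithSharedCache(long sharedBlockCacheUsage) {
        return sharedBlockCacheUsage;
    }

    public static void main(String[] args) {
        // Hypothetical readings for the 4 states (not measured values).
        long[] blockCache = {64 * MIB, 64 * MIB, 64 * MIB, 64 * MIB};
        long[] memTables = {40 * MIB, 40 * MIB, 40 * MIB, 40 * MIB};
        long uncontrolled = totalWithoutMemoryControl(blockCache, memTables);

        // Managed off-heap memory per slot, taken from the description above.
        long managedMemoryPerSlot = 161061276L;
        // Hypothetical shared-cache reading close to the budget.
        long controlled = totalWithSharedCache(150 * MIB);

        System.out.println("uncontrolled total (MiB): " + uncontrolled / MIB);
        System.out.println("controlled total (MiB):   " + controlled / MIB);
        System.out.println("within budget: " + (controlled <= managedMemoryPerSlot));
    }
}
```

   A real check would read the values from the RocksDB native metrics rather 
than from hard-coded arrays, and, per the note above, may need some tolerance 
for the worst case.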
   
   
   
   ## Brief change log
   Add end-to-end test for controlling RocksDB memory usage.
   
   
   ## Verifying this change
   This change added tests and can be verified as follows:
   
     - Added `RocksDBStateMemoryControlTestProgram` to verify end-to-end.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
