[jira] [Updated] (HDDS-7543) SCM memory optimization

Duong (Jira) Sun, 27 Nov 2022 01:20:27 -0800


     [ 
https://issues.apache.org/jira/browse/HDDS-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Duong updated HDDS-7543:
------------------------
    Description: 
I recently ran a simulation test to stress SCM with 5K nodes and 100M 
containers. The test shows that SCM seems to spend around 1.5 KB of memory per 
container on the metadata cache, hence roughly 150 GB for 100M containers.
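
As a rough sanity check of that figure, the arithmetic can be sketched in Java. 
The class and method names are hypothetical (not part of Ozone), and decimal 
units (1 KB = 1,000 bytes) are an assumption about how the measurement was 
rounded.

```java
// Back-of-envelope estimate only; not Ozone code.
// Assumption: decimal units, i.e. 1 KB = 1,000 bytes and 1 GB = 10^9 bytes.
final class ScmCacheEstimate {
    static final long BYTES_PER_GB = 1_000_000_000L;

    /** Total metadata-cache footprint in bytes for the given container count. */
    static long totalBytes(long containers, long bytesPerContainer) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(containers, bytesPerContainer);
    }

    public static void main(String[] args) {
        long containers = 100_000_000L;  // 100M containers, as in the test
        long bytesPerContainer = 1_500L; // ~1.5 KB observed per container
        long total = totalBytes(containers, bytesPerContainer);
        System.out.println(total / BYTES_PER_GB + " GB"); // prints "150 GB"
    }
}
```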

TODO: put graphs

GC count and cost also grow linearly with the number of containers.

TODO: put graphs.

This ticket tracks all the efforts to micro-optimize SCM memory usage, both 
long-lived (cache) and short-lived (such as temporary variables and 
protobuf-serialized objects).

 

Micro-optimizations can sound a bit tedious, but here are some numbers to take 
into consideration, given that SCM is not horizontally scalable.
 * With 100M containers, every 10 bytes saved per container in the 
container-related cache is 1 GB of RAM saved.
 * 5K datanodes result in 10K heartbeats per minute, hence a steady ~167 per 
second. That is not a lot, but in an actively updated cluster, the heartbeat 
message size and the workload each heartbeat indicates introduce significant 
work for GC.
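
The two figures in the list above can be reproduced with a short Java sketch. 
The names are hypothetical (not part of Ozone). The 30-second heartbeat 
interval is not stated in this ticket; it is only what 10K heartbeats per 
minute from 5K datanodes implies (two per datanode per minute).

```java
// Back-of-envelope arithmetic only; not Ozone code.
final class ScmSavingsEstimate {
    /** RAM saved in bytes when each container's cache entry shrinks by the given amount. */
    static long savedBytes(long containers, long bytesSavedPerContainer) {
        return Math.multiplyExact(containers, bytesSavedPerContainer);
    }

    /** Steady-state heartbeat rate seen by SCM, in heartbeats per second. */
    static double heartbeatsPerSecond(long datanodes, long heartbeatIntervalSeconds) {
        return (double) datanodes / heartbeatIntervalSeconds;
    }

    public static void main(String[] args) {
        // 100M containers * 10 bytes = 10^9 bytes, i.e. 1 GB in decimal units.
        System.out.println(savedBytes(100_000_000L, 10L) + " bytes saved");

        // Assumption: a 30 s interval, inferred from 10K heartbeats/min over 5K datanodes.
        double rate = heartbeatsPerSecond(5_000L, 30L);
        System.out.println("~" + Math.round(rate) + " heartbeats/s"); // prints "~167 heartbeats/s"
    }
}
```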

 

 

  was:
I recently did a simulation test to stress SCM handling of 5K of nodes and a 
hundred million containers. The test shows that for each container, SCM seems 
to spend around 1.5kb of memory for the metadata cache, hence 150GB for 100M 
containers.

TODO: put graphs

Also GC count and cost also linearly grow with containers.

TODO: put graphs.

 

This ticket tracks all the efforts to micro-optimize SCM memory usage, both 
long-term (cache) and short-time (like temporary variables, protobuf serialized 
objects).


> SCM memory optimization
> -----------------------
>
>                 Key: HDDS-7543
>                 URL: https://issues.apache.org/jira/browse/HDDS-7543
>             Project: Apache Ozone
>          Issue Type: Improvement
>    Affects Versions: 1.3.0
>            Reporter: Duong
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
