[ 
https://issues.apache.org/jira/browse/HDDS-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768072#comment-17768072
 ] 

Ritesh Shukla commented on HDDS-9272:
-------------------------------------

[~Sammi] In addition to the drop in performance, making the read load on SCM the 
same as on OM breaks the architectural benefit of splitting background and 
foreground processing. This is a big no.

> Performance improvement for OM's sort datanodes
> -----------------------------------------------
>
>                 Key: HDDS-9272
>                 URL: https://issues.apache.org/jira/browse/HDDS-9272
>             Project: Apache Ozone
>          Issue Type: Improvement
>    Affects Versions: 1.4.0
>            Reporter: Duong
>            Assignee: Tanvi Penumudy
>            Priority: Critical
>              Labels: ozone-performance, pull-request-available
>         Attachments: Screenshot 2023-09-12 at 9.52.33 AM.png, 
> sortDatanode_flameGraph.html
>
>
> h2. Problem
> After HDDS-8300, the get key metadata API in OM sorts datanodes to allow 
> clients to optimize data I/O based on locality. However, today this requires 
> OM to make additional calls to SCM to sort the datanodes. These calls account 
> for more than 80% of the getKeyInfo latency and reduce the peak pure read OPPS 
> from *100K* to *25K*. 
> !Screenshot 2023-09-12 at 9.52.33 AM.png|width=754,height=557!
> h2. Optimization
> Today, the get key metadata API makes 2 separate calls to SCM, one to get 
> container pipelines and the second to sort datanodes in the container 
> pipelines (see 
> [KeyManagerImpl#getKeyInfo|https://github.com/apache/ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L1946-L1959]):
>  
> {code:java}
>     if (!args.isHeadOp()) {
>     ...
>       // get container pipeline info from cache.
>       captureLatencyNs(metrics.getGetKeyInfoRefreshLocationLatencyNs(),
>           () -> refreshPipelineFromCache(value,
>               args.isForceUpdateContainerCacheFromSCM()));
>       if (args.getSortDatanodes()) {
>         sortDatanodes(clientAddress, value);
>       }
> {code}
> The two calls can be combined into one and the result is cached in the 
> existing containerLocationCache in ScmClient.
> Some details of the implementation:
> # A new SCM API, `Pipeline getContainerPipeline(long containerId, 
> boolean sortDatanodes)`, is added to `SCMClientProtocolServer`. When the 
> second argument is true, the datanodes in the resulting pipeline are sorted by 
> the logic in `SCMBlockProtocolServer#sortDatanodes`. Another API, `Map<Long, 
> Pipeline> getContainerPipelineBatch(List<Long> containerIds, boolean 
> sortDatanodes)`, is also added for batch calls. It's important to implement 
> the changes in new APIs and leave the existing SCM APIs untouched for 
> compatibility.
> # In `org.apache.hadoop.ozone.om.ScmClient`, the read-through 
> `containerLocationCache` now uses the two new APIs to load container 
> pipelines instead of `getContainerWithPipeline` and 
> `getContainerWithPipelineBatch`. See 
> `org.apache.hadoop.ozone.om.ScmClient#createContainerLocationCache`. The 
> cache key of `containerLocationCache` is now the composite 
> `PipelineCacheKey(long containerId, boolean datanodeSorted)`.
> # `ScmClient#getContainerLocations` is changed to 
> `getContainerLocations(Iterable<Long> containerIds, boolean forceRefresh, 
> boolean sortDatanodes)`. The getKeyInfo API now uses this method and never 
> calls sortDatanodes separately. 
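
As a rough illustration of items 2 and 3 in the quoted description, the sketch below shows a read-through cache keyed by `PipelineCacheKey(containerId, datanodeSorted)`, so that a cache miss costs a single loader call whether or not sorting is requested. It is not the Ozone implementation: `PipelineLoader`, `CountingFakeScm` and `ContainerLocationCacheSketch` are hypothetical names, a pipeline is reduced to a list of datanode names, alphabetical order stands in for the real sort logic, and a plain `ConcurrentHashMap` stands in for whatever cache the real `ScmClient` uses. Unlike the described change, which adds a batch API, this sketch loads one key at a time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Composite key: one container gets separate cache entries for sorted and unsorted results.
final class PipelineCacheKey {
  final long containerId;
  final boolean datanodeSorted;

  PipelineCacheKey(long containerId, boolean datanodeSorted) {
    this.containerId = containerId;
    this.datanodeSorted = datanodeSorted;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof PipelineCacheKey)) {
      return false;
    }
    PipelineCacheKey other = (PipelineCacheKey) o;
    return containerId == other.containerId && datanodeSorted == other.datanodeSorted;
  }

  @Override
  public int hashCode() {
    return Objects.hash(containerId, datanodeSorted);
  }
}

// Hypothetical stand-in for the proposed SCM call; a pipeline is reduced to datanode names.
interface PipelineLoader {
  List<String> getContainerPipeline(long containerId, boolean sortDatanodes);
}

// Fake SCM that counts calls. Alphabetical order is only a placeholder for the real sort.
final class CountingFakeScm implements PipelineLoader {
  int calls = 0;

  @Override
  public List<String> getContainerPipeline(long containerId, boolean sortDatanodes) {
    calls++;
    List<String> nodes = new ArrayList<>(List.of("dn-3", "dn-1", "dn-2"));
    if (sortDatanodes) {
      Collections.sort(nodes);
    }
    return nodes;
  }
}

// Read-through cache: a miss costs exactly one loader call, sorted or not.
final class ContainerLocationCacheSketch {
  private final Map<PipelineCacheKey, List<String>> cache = new ConcurrentHashMap<>();
  private final PipelineLoader loader;

  ContainerLocationCacheSketch(PipelineLoader loader) {
    this.loader = loader;
  }

  Map<Long, List<String>> getContainerLocations(
      Iterable<Long> containerIds, boolean forceRefresh, boolean sortDatanodes) {
    Map<Long, List<String>> result = new LinkedHashMap<>();
    for (long id : containerIds) {
      PipelineCacheKey key = new PipelineCacheKey(id, sortDatanodes);
      if (forceRefresh) {
        cache.remove(key); // drop the stale entry so the loader runs again
      }
      result.put(id, cache.computeIfAbsent(
          key, k -> loader.getContainerPipeline(k.containerId, k.datanodeSorted)));
    }
    return result;
  }
}
```

With this shape, repeated lookups for the same container and sort flag are served from the cache, while a sorted and an unsorted request for the same container occupy two independent entries.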



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
