[jira] [Updated] (HDDS-4473) Reduce number of sortDatanodes RPC calls

Attila Doroszlai (Jira) Tue, 01 Dec 2020 07:59:04 -0800


     [ https://issues.apache.org/jira/browse/HDDS-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Attila Doroszlai updated HDDS-4473:
-----------------------------------
    Fix Version/s: 1.1.0
       Resolution: Done
           Status: Resolved  (was: Patch Available)

> Reduce number of sortDatanodes RPC calls
> ----------------------------------------
>
>                 Key: HDDS-4473
>                 URL: https://issues.apache.org/jira/browse/HDDS-4473
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: OM
>            Reporter: Attila Doroszlai
>            Assignee: Attila Doroszlai
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.1.0
>
>
> {{KeyManagerImpl#listStatus}} and the {{sortDatanodeInPipeline}} helper 
> method sort datanodes using a separate RPC call for each key location info.
> {code:title=https://github.com/apache/ozone/blob/d0aa34c4afae21538c6c6225f029c1d1c4c4bafd/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L2312-L2328}
>   private void sortDatanodeInPipeline(OmKeyInfo keyInfo, String clientMachine) {
>     if (keyInfo != null && clientMachine != null && !clientMachine.isEmpty()) {
>       for (OmKeyLocationInfoGroup key : keyInfo.getKeyLocationVersions()) {
>         key.getLocationList().forEach(k -> {
>           List<DatanodeDetails> nodes = k.getPipeline().getNodes();
>           if (nodes == null || nodes.isEmpty()) {
>             LOG.warn("Datanodes for pipeline {} is empty",
>                 k.getPipeline().getId().toString());
>             return;
>           }
>           List<String> nodeList = new ArrayList<>();
>           nodes.stream().forEach(node ->
>               nodeList.add(node.getUuidString()));
>           try {
>             List<DatanodeDetails> sortedNodes = scmClient.getBlockClient()
>                 .sortDatanodes(nodeList, clientMachine);
>             k.getPipeline().setNodesInOrder(sortedNodes);
> {code}
> Problems, possible improvements:
> # All location versions are processed.  Would it be enough to process only 
> the "latest" version, which is used for read?
> # Each key location is queried separately, even if the same pipeline was 
> already updated in a previous request.  Could be improved by keeping track of 
> processed pipelines.
> # Further improvement may be possible by sending a single {{sortDatanodes}} 
> request for all datanodes in all relevant pipelines, then creating the 
> per-pipeline lists locally.
> CC [~bharat]
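
The first two suggestions in the list above (process only the latest location version, and sort each pipeline only once) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the change that was merged for this issue. The Pipeline class is a hypothetical, simplified stand-in rather than the Ozone type, sortRpc stands in for the sortDatanodes RPC, and the "latest" version is assumed to be the last entry of the list. The third suggestion (a single request covering all datanodes) is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Illustrative stand-ins only; these are NOT the Ozone classes. */
final class SortDatanodesSketch {

  /** Hypothetical, simplified pipeline: an ID plus datanode UUID strings. */
  static final class Pipeline {
    final String id;
    List<String> nodes;

    Pipeline(String id, List<String> nodes) {
      this.id = id;
      this.nodes = new ArrayList<>(nodes);
    }
  }

  /**
   * Sorts the datanodes of every key location in the latest version,
   * issuing at most one sort request per distinct pipeline ID.
   *
   * @param locationVersions one list of pipelines (one per key location)
   *        per version; ASSUMPTION: the last entry is the latest version
   * @param sortRpc stand-in for the sortDatanodes RPC:
   *        (node UUIDs, client machine) -> node UUIDs in sorted order
   * @return the number of sort requests actually issued
   */
  static int sortLatestVersion(
      List<List<Pipeline>> locationVersions,
      String clientMachine,
      BiFunction<List<String>, String, List<String>> sortRpc) {
    if (locationVersions.isEmpty()
        || clientMachine == null || clientMachine.isEmpty()) {
      return 0;
    }
    List<Pipeline> latest = locationVersions.get(locationVersions.size() - 1);

    // Remember the sorted order per pipeline ID, so that key locations
    // sharing a pipeline reuse the result instead of triggering a new call.
    Map<String, List<String>> sortedById = new HashMap<>();
    int rpcCalls = 0;
    for (Pipeline p : latest) {
      if (p.nodes == null || p.nodes.isEmpty()) {
        continue; // nothing to sort for this location
      }
      List<String> sorted = sortedById.get(p.id);
      if (sorted == null) {
        sorted = sortRpc.apply(p.nodes, clientMachine);
        sortedById.put(p.id, sorted);
        rpcCalls++;
      }
      p.nodes = new ArrayList<>(sorted);
    }
    return rpcCalls;
  }
}
```

With this shape, the number of requests grows with the number of distinct pipelines in the latest version instead of the total number of key locations across all versions. Keying the cache by pipeline ID assumes that locations reporting the same pipeline ID also share the same set of datanodes.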



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

