[
https://issues.apache.org/jira/browse/HDFS-16613?focusedWorklogId=780474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780474
]
ASF GitHub Bot logged work on HDFS-16613:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 11/Jun/22 07:59
Start Date: 11/Jun/22 07:59
Worklog Time Spent: 10m
Work Description: hi-adachi commented on PR #4398:
URL: https://github.com/apache/hadoop/pull/4398#issuecomment-1152877937
It would be good to add a test case where decommissioning is in progress to
testPendingRecoveryTasks.
```
   private void verifyPendingRecoveryTasks(
       int numReplicationBlocks, int numECBlocks,
-      int maxTransfers, int numReplicationTasks, int numECTasks)
+      int maxTransfers, int maxTransfersHardLimit, int numReplicationTasks,
+      int numECTasks, boolean isDecommissioning)
       throws IOException {
     FSNamesystem fsn = Mockito.mock(FSNamesystem.class);
     Mockito.when(fsn.hasWriteLock()).thenReturn(true);
     Configuration conf = new Configuration();
     conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY,
         maxTransfers);
+    conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY,
+        maxTransfersHardLimit);
     DatanodeManager dm = Mockito.spy(mockDatanodeManager(fsn, conf));
     DatanodeDescriptor nodeInfo = Mockito.mock(DatanodeDescriptor.class);
     Mockito.when(nodeInfo.isRegistered()).thenReturn(true);
     Mockito.when(nodeInfo.getStorageInfos())
         .thenReturn(new DatanodeStorageInfo[0]);
+    Mockito.when(nodeInfo.isDecommissionInProgress())
+        .thenReturn(isDecommissioning);
```
```
verifyPendingRecoveryTasks(30, 30, 20, 30, 15, 15, true);
```
Issue Time Tracking
-------------------
Worklog Id: (was: 780474)
Time Spent: 40m (was: 0.5h)
> EC: Improve performance of decommissioning dn with many ec blocks
> -----------------------------------------------------------------
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ec, erasure-coding, namenode
> Affects Versions: 3.4.0
> Reporter: caozhiqiang
> Assignee: caozhiqiang
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2022-06-07-11-46-42-389.png,
> image-2022-06-07-17-42-16-075.png, image-2022-06-07-17-45-45-316.png,
> image-2022-06-07-17-51-04-876.png, image-2022-06-07-17-55-40-203.png,
> image-2022-06-08-11-38-29-664.png, image-2022-06-08-11-41-11-127.png
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> In an HDFS cluster with many EC blocks, decommissioning a DN is very slow.
> The reason is that, unlike replicated blocks, which can be copied from any
> DN holding a replica, an EC block has to be reconstructed from the
> decommissioning DN itself. The configurations
> dfs.namenode.replication.max-streams and
> dfs.namenode.replication.max-streams-hard-limit limit the replication
> speed, but increasing them puts the whole cluster's network at risk. So a
> new configuration should be added to limit streams on the decommissioning
> DN, distinguished from the cluster-wide max-streams limit.
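The idea in the quoted description can be sketched as follows. This is an
illustrative model only, not the actual patch: the class name, the
`maxTransfers` helper, and the new configuration key
`dfs.namenode.decommission.replication.max-streams` are assumptions for the
sketch; only the two existing `dfs.namenode.replication.*` keys are real
HDFS configuration names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-decommissioning-DN stream limiting.
// A decommissioning DN gets its own dedicated limit, so it can replicate
// faster without raising the cluster-wide limits for every node.
public class ReplicationLimitSketch {

    // Existing cluster-wide limits (real HDFS configuration keys).
    static final String MAX_STREAMS_KEY =
        "dfs.namenode.replication.max-streams";
    static final String MAX_STREAMS_HARD_LIMIT_KEY =
        "dfs.namenode.replication.max-streams-hard-limit";
    // Hypothetical new key for decommissioning datanodes (assumption).
    static final String DECOMMISSION_MAX_STREAMS_KEY =
        "dfs.namenode.decommission.replication.max-streams";

    // Pick the transfer limit for one DN based on its decommissioning state.
    static int maxTransfers(Map<String, Integer> conf,
                            boolean isDecommissioning) {
        if (isDecommissioning) {
            // Dedicated limit for the decommissioning DN; fall back to the
            // hard limit (default 4 in this sketch) if it is not configured.
            return conf.getOrDefault(DECOMMISSION_MAX_STREAMS_KEY,
                conf.getOrDefault(MAX_STREAMS_HARD_LIMIT_KEY, 4));
        }
        // Normal DNs stay under the cluster-wide soft limit (default 2).
        return conf.getOrDefault(MAX_STREAMS_KEY, 2);
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();
        conf.put(MAX_STREAMS_KEY, 20);
        conf.put(MAX_STREAMS_HARD_LIMIT_KEY, 30);
        conf.put(DECOMMISSION_MAX_STREAMS_KEY, 100);
        System.out.println(maxTransfers(conf, false)); // 20: cluster-wide limit
        System.out.println(maxTransfers(conf, true));  // 100: dedicated limit
    }
}
```

With such a split, operators could raise the limit only for nodes being
drained, leaving steady-state replication traffic bounded as before.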
--
This message was sent by Atlassian Jira
(v8.20.7#820007)