[jira] [Work logged] (HDFS-16510) Fix EC decommission when rack is not enough

ASF GitHub Bot (Jira) Thu, 17 Mar 2022 02:16:06 -0700


     [ 
https://issues.apache.org/jira/browse/HDFS-16510?focusedWorklogId=743037&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-743037
 ]


ASF GitHub Bot logged work on HDFS-16510:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Mar/22 09:15
            Start Date: 17/Mar/22 09:15
    Worklog Time Spent: 10m 
      Work Description: cndaimin opened a new pull request #4078:
URL: https://github.com/apache/hadoop/pull/4078


   Decommission always fails when we decommission multiple nodes at once on a 
cluster that does not have enough racks, for example a cluster with 6 racks 
running RS-6-3.
   
   We found that the decommissioning nodes cover at least one rack, so it is 
effectively like decommissioning one or more racks. Rack decommission is not 
well supported currently, especially on a cluster that already has too few racks.
   
   In this patch, we add `numOfExcludedRacks` to indicate how many racks are 
being decommissioned (excluded) and fix the calculation in 
`BlockPlacementStatusDefault#getAdditionalReplicasRequired`. In 
`ErasureCodingWork#addTaskToDatanode`, we change the processing order so that 
decommission is handled first, which matters most when there are not enough racks.
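
   To illustrate the idea behind `numOfExcludedRacks`, here is a minimal, 
self-contained sketch. It is not the code from the patch: the class 
`RackPlacementSketch`, its method names, and the exact formula (capping the 
target at the number of racks that are not being decommissioned) are 
assumptions made for illustration only.

```java
// Hypothetical sketch, NOT the actual BlockPlacementStatusDefault code.
// Assumption: racks being decommissioned cannot count toward placement,
// so the reachable rack target is capped at (totalRacks - numOfExcludedRacks).
public final class RackPlacementSketch {
    private RackPlacementSketch() {}

    /** Racks that can still hold replicas: total minus excluded, never negative. */
    public static int usableRacks(int totalRacks, int numOfExcludedRacks) {
        return Math.max(0, totalRacks - numOfExcludedRacks);
    }

    /** Satisfied if we span enough racks, or already span every usable rack. */
    public static boolean isSatisfied(int currentRacks, int requiredRacks,
                                      int totalRacks, int numOfExcludedRacks) {
        return currentRacks >= requiredRacks
            || currentRacks >= usableRacks(totalRacks, numOfExcludedRacks);
    }

    /** Number of further racks to reach, given the capped target. */
    public static int additionalReplicasRequired(int currentRacks, int requiredRacks,
                                                 int totalRacks, int numOfExcludedRacks) {
        if (isSatisfied(currentRacks, requiredRacks, totalRacks, numOfExcludedRacks)) {
            return 0;
        }
        int target = Math.min(requiredRacks,
                              usableRacks(totalRacks, numOfExcludedRacks));
        return target - currentRacks;
    }

    public static void main(String[] args) {
        // 6 racks, 9 internal blocks (RS-6-3), blocks currently span 5 racks.
        // Ignoring the excluded rack, one more rack still appears to be needed.
        System.out.println(additionalReplicasRequired(5, 9, 6, 0));
        // Counting the one rack under decommission, 5 racks is all we can reach.
        System.out.println(additionalReplicasRequired(5, 9, 6, 1));
    }
}
```

   In this sketch, a status that ignores the excluded rack keeps asking for a 
rack that cannot be used, which is one way a decommission could fail to make 
progress. Taking the excluded racks into account lets the check settle on the 
racks that remain.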


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

            Worklog Id:     (was: 743037)
    Remaining Estimate: 0h
            Time Spent: 10m

> Fix EC decommission when rack is not enough
> -------------------------------------------
>
>                 Key: HDFS-16510
>                 URL: https://issues.apache.org/jira/browse/HDFS-16510
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: block placement, ec
>    Affects Versions: 3.3.1, 3.3.2
>            Reporter: daimin
>            Assignee: daimin
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Decommission always fails when we decommission multiple nodes at once on a 
> cluster that does not have enough racks, for example a cluster with 6 racks 
> running RS-6-3.
> We found that the decommissioning nodes cover at least one rack, so it is 
> effectively like decommissioning one or more racks. Rack decommission is not 
> well supported currently, especially on a cluster that already has too few racks.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
