[ https://issues.apache.org/jira/browse/HDFS-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Takanobu Asanuma resolved HDFS-16333. ------------------------------------- Fix Version/s: 3.4.0 3.2.4 3.3.3 Resolution: Fixed > fix balancer bug when transfer an EC block > ------------------------------------------ > > Key: HDFS-16333 > URL: https://issues.apache.org/jira/browse/HDFS-16333 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover > Reporter: qinyuren > Assignee: qinyuren > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.2.4, 3.3.3 > > Attachments: image-2021-11-18-17-25-13-089.png, > image-2021-11-18-17-25-50-556.png, image-2021-11-18-17-28-03-155.png > > Time Spent: 5h 10m > Remaining Estimate: 0h > > We set the EC policy to (6+3) and we also have nodes that were > decommissioning when we executed balancer. > With the balancer running, we find many error logs as follow. > !image-2021-11-18-17-25-13-089.png|width=858,height=135! > Node A wants to transfer an EC block to node B, but we found that the block > is not on node A. The FSCK command to show the block status as follow > !image-2021-11-18-17-25-50-556.png|width=607,height=189! > In the dispatcher. getBlockList function > !image-2021-11-18-17-28-03-155.png! > > Assume that the location of the an EC block in storageGroupMap look like this > indices:[0, 1, 2, 3, 4, 5, 6, 7, 8] > node:[a, b, c, d, e, f, g, h, i] > after decommission operation, the internal block on indices[1] were > decommission to another node. > indices:[0, 1, 2, 3, 4, 5, 6, 7, 8] > node:[a, {color:#FF0000}j{color}, c, d, e, f, g, h, i] > the location of indices[1] change from node {color:#FF0000}b{color} to node > {color:#FF0000}j{color}. > > When the balancer get the block location and check it with the location in > storageGroupMap. > If a node is not found in storageGroupMap, it will not be add to block > locations. > In this case, node {color:#FF0000}j {color}will not be added to the block > locations, while the indices is not updated. > Finally, the block location may look like this, > indices:[0, 1, 2, 3, 4, 5, 6, 7, 8] > {color:#FF0000}block.location:[a, c, d, e, f, g, h, i]{color} > the location of the nodes does not match their indices > > Solution: > we should update the indices and match with the nodes > {color:#FF0000}indices:[0, 2, 3, 4, 5, 6, 7, 8]{color} > {color:#FF0000}block.location:[a, c, d, e, f, g, h, i]{color} -- This message was sent by Atlassian Jira (v8.20.1#820001) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org