[ https://issues.apache.org/jira/browse/HDFS-10968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jing Zhao updated HDFS-10968:
-----------------------------
       Resolution: Fixed
    Fix Version/s: 3.0.0-alpha2
            Status: Resolved  (was: Patch Available)

Thanks for the review, Nicholas! I've committed this into trunk.

> BlockManager#isInNewRack should consider decommissioning nodes
> --------------------------------------------------------------
>
>                 Key: HDFS-10968
>                 URL: https://issues.apache.org/jira/browse/HDFS-10968
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding, namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>             Fix For: 3.0.0-alpha2
>
>       Attachments: HDFS-10968.000.patch
>
>
> For an EC block, it is possible that we have enough internal blocks but not enough racks. In that case the reconstruction code calls {{BlockManager#isInNewRack}} to check whether the target node can increase the total number of racks; the check compares the target node's rack with the source nodes' racks:
> {code}
> for (DatanodeDescriptor src : srcs) {
>   if (src.getNetworkLocation().equals(target.getNetworkLocation())) {
>     return false;
>   }
> }
> {code}
> However, {{srcs}} may include a decommissioning node, in which case we should allow the target node to be in the same rack as it.
> For example, suppose we have 11 nodes, h1 ~ h11, located in racks r1, r1, r2, r2, r3, r3, r4, r4, r5, r5, r6, respectively. Suppose an EC block has 9 live internal blocks on h1 ~ h8 and h11, and one internal block on h9, which is being decommissioned. The current code will not choose h10 for reconstruction, because {{isInNewRack}} rejects it: h10 is on the same rack (r5) as the source h9, even though h9 will stop counting toward rack diversity once its decommission completes.
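For reference, a minimal sketch of the direction such a fix could take (a sketch only, not necessarily the committed HDFS-10968.000.patch): skip decommissioning sources when comparing racks, so that a target whose only rack-mates among the sources are decommissioning nodes still counts as adding a new rack. The sketch uses {{DatanodeInfo#isDecommissionInProgress}} for the state check; the method signature shown is illustrative, not the actual one in {{BlockManager}}.

{code}
// Sketch only -- not the committed patch. The signature and the
// surrounding BlockManager context are assumed for illustration.
private boolean isInNewRack(DatanodeDescriptor[] srcs,
    DatanodeDescriptor target) {
  for (DatanodeDescriptor src : srcs) {
    // A decommissioning source will stop counting toward rack
    // diversity once it leaves, so sharing its rack is acceptable.
    if (src.isDecommissionInProgress()) {
      continue;
    }
    if (src.getNetworkLocation().equals(target.getNetworkLocation())) {
      // Target shares a rack with a source that will remain live,
      // so it does not add a new rack.
      return false;
    }
  }
  return true;
}
{code}

With a check along these lines, h10 in the example above would be accepted: its only rack-mate among the sources is the decommissioning h9, so reconstructing onto h10 keeps r5 represented after h9 leaves, preserving 6 racks instead of dropping to 5.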