[
https://issues.apache.org/jira/browse/HDFS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nikola Vujic resolved HDFS-5184.
--------------------------------
Resolution: Done
This is fixed in HDP 2 by the new implementation of the node-group-aware
block placement policy.
> BlockPlacementPolicyWithNodeGroup does not work correctly when avoidStaleNodes
> is true
> ------------------------------------------------------------------------------------
>
> Key: HDFS-5184
> URL: https://issues.apache.org/jira/browse/HDFS-5184
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Nikola Vujic
> Priority: Minor
>
> If avoidStaleNodes is true, then choosing targets is potentially done in two
> attempts. If we don't find enough targets to place replicas in the first
> attempt, a second attempt is invoked that is allowed to use stale nodes in
> order to find the remaining targets. This second attempt breaks the node group
> rule of not placing two replicas in the same node group.
> The invocation of the second attempt looks like this:
> {code}
> DatanodeDescriptor chooseTarget(excludedNodes, ...) {
>   // Remember the caller's excluded nodes before the first attempt mutates them.
>   HashMap<Node, Node> oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
>
>   // ... first attempt ...
>
>   // If we don't find enough targets in the first attempt:
>   if (avoidStaleNodes) {
>     // Retry with only the original excludes plus the chosen results; nodes
>     // excluded during the first attempt (e.g. node-group peers) are dropped.
>     for (Node node : results) {
>       oldExcludedNodes.put(node, node);
>     }
>     numOfReplicas = totalReplicasExpected - results.size();
>     return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize,
>         maxNodesPerRack, results, false);
>   }
> }
> {code}
> So, all nodes excluded during the first attempt which are neither in
> oldExcludedNodes nor in results will be ignored, and the second invocation of
> chooseTarget will use an incomplete set of excluded nodes. For example, suppose
> we have the following topology:
> dn1 -> /d1/r1/n1
> dn2 -> /d1/r1/n1
> dn3 -> /d1/r1/n2
> dn4 -> /d1/r1/n2
> and we want to choose 3 targets with avoidStaleNodes=true. In the first
> attempt we will choose only 2 targets, since there are only two node groups.
> Let's say we choose dn1 and dn3. Then we add dn1 and dn3 to oldExcludedNodes
> and use that set of excluded nodes in the second attempt. This set of excluded
> nodes is incomplete: it allows us to select dn2 and dn4 in the second attempt,
> even though node group awareness should rule them out. That is exactly what
> happens in the current code.
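> To make the gap concrete, here is a minimal, self-contained sketch (plain
> Java with illustrative names, not the actual Hadoop classes) that replays the
> topology above: after the first attempt chooses dn1 and dn3, oldExcludedNodes
> contains only those two nodes, so dn2 and dn4 pass the exclusion check even
> though they share node groups with already-chosen replicas.
> {code}
> import java.util.*;
>
> public class StaleRetryExclusionDemo {
>   public static void main(String[] args) {
>     // Topology from the description: node -> node group.
>     Map<String, String> nodeGroupOf = new LinkedHashMap<String, String>();
>     nodeGroupOf.put("dn1", "/d1/r1/n1");
>     nodeGroupOf.put("dn2", "/d1/r1/n1");
>     nodeGroupOf.put("dn3", "/d1/r1/n2");
>     nodeGroupOf.put("dn4", "/d1/r1/n2");
>
>     // First attempt (avoidStaleNodes=true) picked one node per node group.
>     List<String> results = Arrays.asList("dn1", "dn3");
>
>     // What the current code hands to the second attempt: the original
>     // excludes (empty here) plus the chosen results -- nothing else.
>     Set<String> oldExcludedNodes = new HashSet<String>(results);
>
>     // Node groups already occupied by a chosen replica.
>     Set<String> usedNodeGroups = new HashSet<String>();
>     for (String chosen : results) {
>       usedNodeGroups.add(nodeGroupOf.get(chosen));
>     }
>
>     // The second attempt only skips nodes in oldExcludedNodes, so dn2 and
>     // dn4 remain eligible even though their node groups are already used.
>     for (String candidate : nodeGroupOf.keySet()) {
>       boolean excluded = oldExcludedNodes.contains(candidate);
>       boolean violatesNodeGroup = usedNodeGroups.contains(nodeGroupOf.get(candidate));
>       if (!excluded && violatesNodeGroup) {
>         System.out.println(candidate + " is selectable in the second attempt"
>             + " but shares node group " + nodeGroupOf.get(candidate)
>             + " with an already-chosen replica");
>       }
>     }
>     // Prints dn2 and dn4: exactly the targets that node group awareness
>     // should have ruled out.
>   }
> }
> {code}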
> Repro:
> - add CONF.setBoolean(DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true);
> to TestReplicationPolicyWithNodeGroup (see the sketch after this list).
> - testChooseMoreTargetsThanNodeGroups() should fail.
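> For reference, a minimal sketch of where that flag is set (the class name
> AvoidStaleNodesRepro is only illustrative; in the repro the flag goes into the
> CONF used by TestReplicationPolicyWithNodeGroup before the placement policy is
> built):
> {code}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hdfs.DFSConfigKeys;
> import org.apache.hadoop.hdfs.HdfsConfiguration;
>
> public class AvoidStaleNodesRepro {
>   // Same flag the repro toggles: make the namenode avoid stale datanodes for
>   // writes, which enables the second chooseTarget attempt described above.
>   static final Configuration CONF = new HdfsConfiguration();
>   static {
>     CONF.setBoolean(
>         DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true);
>   }
> }
> {code}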
--
This message was sent by Atlassian JIRA
(v6.2#6252)