[
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696454#comment-15696454
]
Nandakumar commented on HDFS-10206:
-----------------------------------
Thanks for the comment [~mingma]
{quote}
Any idea why 000.patch makes difference for the "Same Node" and "DataNode in
same rack"?
{quote}
Out of the three replicas, one will be on an off-rack datanode, which is what
causes the difference.
Based on the comment, {{NetworkTopology.getWeightUsingNetworkLocation}} and
{{NetworkTopology.normalizeNetworkLocationPath}} are changed to static. Instead
of calling {{NetworkTopology.getDistance}} from {{NetworkTopology.getWeight}},
the logic to calculate the weight is added in {{getWeight}} itself, which also
takes care of the isOnSameRack case.
Weight calculation after this patch:
- 0 for same node
- 2 for same rack
- Beyond that, each additional level on each node increases the weight by 1
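For illustration, the weights listed above can be reproduced with a small standalone sketch. This is not the code from the patch: the class name TopologyWeightSketch, the method weight, and the sample paths are hypothetical, and it assumes every node is identified by a full topology path such as /d1/r1/h1. The weight is taken as the number of levels each node sits below the deepest ancestor the two paths share.

```java
import java.util.Arrays;

// Hypothetical sketch, not the HDFS-10206 patch: it models nodes as plain
// path strings instead of NetworkTopology Node objects.
class TopologyWeightSketch {

    // Splits "/d1/r1/h1" into ["d1", "r1", "h1"], dropping empty segments.
    static String[] components(String path) {
        return Arrays.stream(path.split("/"))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    // Counts the levels from each node up to their deepest common ancestor.
    static int weight(String reader, String node) {
        String[] a = components(reader);
        String[] b = components(node);
        int common = 0;
        while (common < a.length && common < b.length
                && a[common].equals(b[common])) {
            common++;
        }
        // Each level on each side that is not shared adds 1 to the weight.
        return (a.length - common) + (b.length - common);
    }

    public static void main(String[] args) {
        System.out.println(weight("/d1/r1/h1", "/d1/r1/h1")); // same node
        System.out.println(weight("/d1/r1/h1", "/d1/r1/h2")); // same rack
        System.out.println(weight("/d1/r1/h1", "/d1/r2/h3")); // off rack
    }
}
```

Under these assumptions a same-node pair gets 0, a same-rack pair gets 2, and a pair on different racks in the same data center gets 4, so more remote racks sort later than nearer ones.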
Please review [^HDFS-10206.002.patch]
> getBlockLocations might not sort datanodes properly by distance
> ---------------------------------------------------------------
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ming Ma
> Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch, HDFS-10206.001.patch,
> HDFS-10206.002.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}}
> might not put the local-rack datanodes at the beginning of the sorted list.
> That is because the function didn't call {{networktopology.add(client);}} to
> properly set the node's parent node; something required by
> {{networktopology.sortByDistance}} to compute distance between two nodes in
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only
> distinguishes local rack from remote rack, but it doesn't support general
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
> protected int getWeight(Node reader, Node node) {
>   // 0 is local, 1 is same rack, 2 is off rack
>   // Start off by initializing to off rack
>   int weight = 2;
>   if (reader != null) {
>     if (reader.equals(node)) {
>       weight = 0;
>     } else if (isOnSameRack(reader, node)) {
>       weight = 1;
>     }
>   }
>   return weight;
> }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to
> address another issue. Regardless of where we do the sorting, we still need to
> fix the issues outlined here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object
> used by DatanodeManager and requires Nodes stored in the topology to be
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't
> pollute the NetworkTopology if we plan to fix it on the server side.