[jira] [Updated] (HDFS-10206) Datanodes not sorted properly by distance when the reader isn't a datanode

2016-12-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-10206:
---
Fix Version/s: 3.0.0-alpha2

> Datanodes not sorted properly by distance when the reader isn't a datanode
> --
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nanda kumar
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10206-branch-2.8.003.patch, HDFS-10206.000.patch, 
> HDFS-10206.001.patch, HDFS-10206.002.patch, HDFS-10206.003.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) Datanodes not sorted properly by distance when the reader isn't a datanode

2016-12-07 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-10206:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

+1. Thanks [~nandakumar131] for the contribution. I have committed the patch to 
trunk and branch-2.

> Datanodes not sorted properly by distance when the reader isn't a datanode
> --
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Fix For: 2.9.0
>
> Attachments: HDFS-10206-branch-2.8.003.patch, HDFS-10206.000.patch, 
> HDFS-10206.001.patch, HDFS-10206.002.patch, HDFS-10206.003.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) Datanodes not sorted properly by distance when the reader isn't a datanode

2016-12-07 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-10206:
---
Summary: Datanodes not sorted properly by distance when the reader isn't a 
datanode  (was: datanodes not sorted properly by distance if the reader isn't a 
datanode)

> Datanodes not sorted properly by distance when the reader isn't a datanode
> --
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206-branch-2.8.003.patch, HDFS-10206.000.patch, 
> HDFS-10206.001.patch, HDFS-10206.002.patch, HDFS-10206.003.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org