[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-06 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-10206:
--
Attachment: HDFS-10206-branch-2.8.003.patch

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206-branch-2.8.003.patch, HDFS-10206.000.patch, 
> HDFS-10206.001.patch, HDFS-10206.002.patch, HDFS-10206.003.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-06 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-10206:
---
Status: Patch Available  (was: Open)

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch, HDFS-10206.001.patch, 
> HDFS-10206.002.patch, HDFS-10206.003.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-04 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-10206:
--
Attachment: HDFS-10206.003.patch

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch, HDFS-10206.001.patch, 
> HDFS-10206.002.patch, HDFS-10206.003.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-25 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-10206:
--
Attachment: HDFS-10206.002.patch

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch, HDFS-10206.001.patch, 
> HDFS-10206.002.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-21 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-10206:
--
Attachment: HDFS-10206.001.patch

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch, HDFS-10206.001.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-09 Thread Nandakumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDFS-10206:
--
Attachment: HDFS-10206.000.patch

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-08 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10206:
-
Assignee: Nandakumar

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-03-28 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-10206:
---
Description: 
If the DFSClient machine is not a datanode, but it shares its rack with some 
datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
might not put the local-rack datanodes at the beginning of the sorted list. 
That is because the function didn't call {{networktopology.add(client);}} to 
properly set the node's parent node; something required by 
{{networktopology.sortByDistance}} to compute distance between two nodes in the 
same topology tree.

Another issue with {{networktopology.sortByDistance}} is it only distinguishes 
local rack from remote rack, but it doesn't support general distance 
calculation to tell how remote the rack is.

{noformat}
NetworkTopology.java
  protected int getWeight(Node reader, Node node) {
// 0 is local, 1 is same rack, 2 is off rack
// Start off by initializing to off rack
int weight = 2;
if (reader != null) {
  if (reader.equals(node)) {
weight = 0;
  } else if (isOnSameRack(reader, node)) {
weight = 1;
  }
}
return weight;
  }
{noformat}

HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
address another issue. Regardless of where we do the sorting, we still need fix 
the issues outline here.

Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
used by DatanodeManager and requires Nodes stored in the topology to be 
{{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
pollute the  NetworkTopology if we plan to fix it on the server side.

  was:
If the DFSClient machine is not a datanode, but it shares its rack with some 
datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
might not put the local-rack datanodes at the beginning of the sorted list. 
That is because the function didn't call {{networktopology.add(client);}} to 
properly set the node's parent node; something required by 
{{networktopology.sortByDistance}} to compute distance between two nodes in the 
same topology tree.

Another issue with {{networktopology.sortByDistance}} is it only distinguishes 
local rack from remote rack, but it doesn't support general distance 
calculation to tell how remote the rack is.

{noformat}
NetworkTopology.java
  protected int getWeight(Node reader, Node node) {
// 0 is local, 1 is same rack, 2 is off rack
// Start off by initializing to off rack
int weight = 2;
if (reader != null) {
  if (reader.equals(node)) {
weight = 0;
  } else if (isOnSameRack(reader, node)) {
weight = 1;
  }
}
return weight;
  }
{noformat}

HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
address another issue. Regardless of where we do the sorting, we still fix the 
issues outline here.


> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-03-24 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-10206:
---
Issue Type: Improvement  (was: Bug)

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still fix 
> the issues outline here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)