[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15728490#comment-15728490
 ] 

Hadoop QA commented on HDFS-10206:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 5s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 56s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 23s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 53s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 35s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 40s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_121 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 17s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 2s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 2s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 12s{color} | {color:orange} root: The patch generated 1 new + 132 unchanged - 2 fixed = 133 total (was 134) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 38s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 58s{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 44s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}222m 21s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_121 Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.TestFileCorruption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:5af2af1 |
| JIRA Issue | HDFS-10206 |
| 

[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-06 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15728003#comment-15728003
 ] 

Nandakumar commented on HDFS-10206:
---

Thanks for the review, [~mingma].
I have uploaded a patch on top of branch-2.8: [^HDFS-10206-branch-2.8.003.patch]

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206-branch-2.8.003.patch, HDFS-10206.000.patch, 
> HDFS-10206.001.patch, HDFS-10206.002.patch, HDFS-10206.003.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is that it only 
> distinguishes local rack from remote rack; it doesn't support a general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
>     // 0 is local, 1 is same rack, 2 is off rack
>     // Start off by initializing to off rack
>     int weight = 2;
>     if (reader != null) {
>       if (reader.equals(node)) {
>         weight = 0;
>       } else if (isOnSameRack(reader, node)) {
>         weight = 1;
>       }
>     }
>     return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> to fix the issues outlined here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the NetworkTopology if we plan to fix it on the server side.
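
To illustrate the second issue in the description above, here is a minimal, self-contained Java sketch. It models nodes as plain path strings rather than Hadoop's Node objects; the class ThreeLevelWeightSketch and the helpers rackOf, threeLevelWeight and sortByWeight are hypothetical names, not part of NetworkTopology. The weight function mirrors the quoted getWeight, and the sort shows that two off-rack replicas tie at weight 2 even when only one of them is in the reader's data center.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-alone model; this is not Hadoop's NetworkTopology API.
class ThreeLevelWeightSketch {

  // Rack portion of a full path, e.g. "/dc1/rack1/dn1" -> "/dc1/rack1".
  static String rackOf(String path) {
    return path.substring(0, path.lastIndexOf('/'));
  }

  // Mirrors the quoted getWeight: 0 is local, 1 is same rack, 2 is off rack.
  static int threeLevelWeight(String reader, String node) {
    int weight = 2;
    if (reader != null) {
      if (reader.equals(node)) {
        weight = 0;
      } else if (rackOf(reader).equals(rackOf(node))) {
        weight = 1;
      }
    }
    return weight;
  }

  // Stable sort of replica paths by their weight relative to the reader.
  static List<String> sortByWeight(String reader, List<String> nodes) {
    List<String> sorted = new ArrayList<>(nodes);
    sorted.sort(Comparator.comparingInt(n -> threeLevelWeight(reader, n)));
    return sorted;
  }

  public static void main(String[] args) {
    String reader = "/dc1/rack1/client";
    List<String> replicas = Arrays.asList(
        "/dc2/rack5/dn5", "/dc1/rack2/dn3", "/dc1/rack1/dn1");
    System.out.println(sortByWeight(reader, replicas));
  }
}
```

With the reader at /dc1/rack1/client, the same-rack replica moves to the front, but the replica in /dc2/rack5 stays ahead of the one in /dc1/rack2: both get weight 2 and the stable sort keeps their input order.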



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-06 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727676#comment-15727676
 ] 

Ming Ma commented on HDFS-10206:


Thanks [~nandakumar131]. The patch looks good. Given that the patch doesn't 
apply directly to branch-2, can you provide another patch for branch-2? You can 
use the naming convention for the branch-2 patch based on the "Naming your 
patch" section in https://wiki.apache.org/hadoop/HowToContribute so that 
Jenkins can run the precommit job.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727604#comment-15727604
 ] 

Hadoop QA commented on HDFS-10206:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 57s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 40s{color} | {color:orange} root: The patch generated 1 new + 115 unchanged - 2 fixed = 116 total (was 117) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 6s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 26s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}155m 45s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.net.TestDNS |
|   | hadoop.ipc.TestIPC |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-10206 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12841666/HDFS-10206.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 225a231f262d 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a7288da |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/17783/artifact/patchprocess/diff-checkstyle-root.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17783/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| unit | 

[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-04 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15719836#comment-15719836
 ] 

Nandakumar commented on HDFS-10206:
---

Hi [~mingma], I added logic to check for identical nodes in 
getWeightUsingNetworkLocation; it now returns 0 in the case of identical nodes.
Please review HDFS-10206.003.patch.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-03 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15718536#comment-15718536
 ] 

Ming Ma commented on HDFS-10206:


OK. Maybe that wasn't a precise way to refer to it. The "network path" comes 
from the NodeBase#getPath method. Anyway, the point is that the new method 
should return 0 in the case of two identical nodes.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-02 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717602#comment-15717602
 ] 

Nandakumar commented on HDFS-10206:
---

No, "nodes of the same network path" refers to nodes on the same rack. 
Node#getNetworkLocation() returns the path to the node (the node name is not 
included in the path).

For a node "/dc1/rack1/datanode1", Node#getNetworkLocation() will return 
"/dc1/rack1".

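The distinction between a node's full path and its network location can be sketched in plain Java. This is an illustration on strings only: NetworkLocationSketch and its methods are hypothetical helpers and do not reproduce the Hadoop Node or NodeBase classes. It assumes a full path of the form "/.../name" with at least one '/'.

```java
// Hypothetical helpers for illustration; not the Hadoop Node interface.
class NetworkLocationSketch {

  // Path to the node without the node name,
  // e.g. "/dc1/rack1/datanode1" -> "/dc1/rack1".
  static String networkLocation(String fullPath) {
    int lastSlash = fullPath.lastIndexOf('/');
    if (lastSlash < 0) {
      throw new IllegalArgumentException("expected a path containing '/': " + fullPath);
    }
    return fullPath.substring(0, lastSlash);
  }

  // Last path component, e.g. "/dc1/rack1/datanode1" -> "datanode1".
  static String nodeName(String fullPath) {
    return fullPath.substring(fullPath.lastIndexOf('/') + 1);
  }

  // Two nodes share a rack when their network locations are equal.
  static boolean sameRack(String a, String b) {
    return networkLocation(a).equals(networkLocation(b));
  }

  public static void main(String[] args) {
    System.out.println(networkLocation("/dc1/rack1/datanode1"));
    System.out.println(sameRack("/dc1/rack1/datanode1", "/dc1/rack1/datanode2"));
  }
}
```

Under this model, equal network locations mean "same rack", while "identical nodes" additionally requires equal node names, which is the distinction being discussed in this exchange.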



[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-02 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717543#comment-15717543
 ] 

Ming Ma commented on HDFS-10206:


To clarify, "two nodes of the same network path" referred to two identical 
nodes, just like how getWeight could return 0 in such a case.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-02 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717398#comment-15717398
 ] 

Nandakumar commented on HDFS-10206:
---

If we return 0 from getWeightUsingNetworkLocation for two nodes having the 
same network path (i.e. in the same rack), then:

* getWeightUsingNetworkLocation will return 0 for the same rack
* getWeight will return 2 for the same rack

It would be good to have the same behavior across these methods.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-02 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717370#comment-15717370
 ] 

Ming Ma commented on HDFS-10206:


Thanks [~nandakumar131]! The patches look good overall. To make the method more 
general, it seems better to have getWeightUsingNetworkLocation return 0 when 
two nodes have the same network path. [~daryn] [~kihwal], any concerns about 
the added 0.1ms latency? Note that this only happens in the non-datanode reader 
scenario and it doesn't hold the FSNamesystem lock.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-12-01 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712724#comment-15712724
 ] 

Nandakumar commented on HDFS-10206:
---

bq. nonDataNodeReader. However, it turns out NetworkTopology has several 
existing references of "datanode". So it is good to have and up to you if you 
want to fix it.
I thought nonDataNodeReader was explicit and easy to understand and, as you 
mentioned, there are several existing datanode references in NetworkTopology.

bq. So 001.patch shouldn't have a difference. Do you mind confirming?
Below is the micro-benchmark for SameNode and SameRack with 
[^HDFS-10206.002.patch]:

|| Client on || Run 1 || Run 2 || Run 3 || Run 4 || Run 5 ||
| Same Node | 126994 | 95364 | 140242 | 119920 | 113167 |
| DataNode in same rack | 91442 | 124531 | 102606 | 104960 | 142946 |

bq. Can you confirm the weights with 002.patch? It seems to return 0, 2, 4. 
The old behavior is 0, 1, 2.

Yes, after the patch {{NetworkTopology.getWeight}} will return 0, 2, 4 ...

Below are a few cases:

*Same Node:*
/rack1/datanode1
/rack1/datanode1
Will return: 0

*Same Rack:*
/rack1/datanode1
/rack1/datanode2
Will return: 2

*Different Rack:*
/rack1/datanode1
/rack2/datanode3
Will return: 4

/dc1/rack1/datanode1
/default-rack/datanode4
Will return: 5

/dc1/rack1/datanode1
/dc2/rack5/datanode5
Will return: 6
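
The cases above can be reproduced with a small standalone sketch. This is not the code from the patch; it assumes the weight is the sum of the number of levels each full node path sits below the closest common ancestor, which matches the formula quoted elsewhere in this thread. The class and method names ({{PathWeightSketch}}, {{tokens}}, {{weight}}) are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch, not the HDFS-10206 patch: weight from full node paths. */
final class PathWeightSketch {

  /** Splits "/rack1/datanode1" into ["rack1", "datanode1"]. */
  static String[] tokens(String path) {
    return Arrays.stream(path.split("/"))
        .filter(s -> !s.isEmpty())
        .toArray(String[]::new);
  }

  /** Sums how many levels each path sits below the closest common ancestor. */
  static int weight(String readerPath, String nodePath) {
    String[] a = tokens(readerPath);
    String[] b = tokens(nodePath);
    int common = 0;
    while (common < a.length && common < b.length
        && a[common].equals(b[common])) {
      common++;
    }
    return (a.length - common) + (b.length - common);
  }

  public static void main(String[] args) {
    // Same node, same rack, different rack, uneven depth, different data center.
    System.out.println(weight("/rack1/datanode1", "/rack1/datanode1"));
    System.out.println(weight("/rack1/datanode1", "/rack1/datanode2"));
    System.out.println(weight("/rack1/datanode1", "/rack2/datanode3"));
    System.out.println(weight("/dc1/rack1/datanode1", "/default-rack/datanode4"));
    System.out.println(weight("/dc1/rack1/datanode1", "/dc2/rack5/datanode5"));
  }
}
```

Under that assumption, a same-rack pair always costs 2 because each node is one level below the shared rack, and every extra level on either side adds 1.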



> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch, HDFS-10206.001.patch, 
> HDFS-10206.002.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
>     // 0 is local, 1 is same rack, 2 is off rack
>     // Start off by initializing to off rack
>     int weight = 2;
>     if (reader != null) {
>       if (reader.equals(node)) {
>         weight = 0;
>       } else if (isOnSameRack(reader, node)) {
>         weight = 1;
>       }
>     }
>     return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> to fix the issues outlined here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the  NetworkTopology if we plan to fix it on the server side.






[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-29 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707535#comment-15707535
 ] 

Ming Ma commented on HDFS-10206:


bq. Can you point out the variables which are to be made more generic?
nonDataNodeReader. However, it turns out NetworkTopology has several existing 
references to "datanode". So it is a nice-to-have, and it is up to you whether 
you want to fix it.

bq. With 000.patch the weight is calculated using network location for off rack 
datanodes which impacts the micro-benchmark results.
Got it. Thanks for the clarification. So 001.patch shouldn't make a difference. Do 
you mind confirming?

bq. Weight calculation after this patch
Can you confirm the weights with 002.patch? It seems to return 0, 2, 4. The 
old behavior is 0, 1, 2.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-28 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704378#comment-15704378
 ] 

Nandakumar commented on HDFS-10206:
---

bq. NetworkTopology can be used by HDFS, YARN and MAPREDUCE. It is better to 
make variable names more general.
Can you point out the variables which are to be made more generic? 

bq. But the reader should pick the closest one, either "Same Node" or 
"DataNode in same rack". Perhaps you can clarify the setup.

Benchmarking was done with the default replication factor of 3 (two replicas will 
be in the same rack as the writer and one will be on a datanode in a different rack).
The {{NetworkTopology.sortByDistance}} method calls 
{{NetworkTopology.getWeight}} for every replica of the block (within 
activeLen). Out of the three replicas, at least one will be on an off-rack 
datanode, even for "Same Node" and "DataNode in same rack". With 000.patch the 
weight for off-rack datanodes is calculated using the network location, which 
impacts the micro-benchmark results.
Sorry if I have confused you more.

{quote}
So the weight value definition has changed. It should be fine given it isn't a 
public interface. Still NetworkTopologyWithNodeGroup has its own getWeight 
definition based on the old definition. Either we update that or keep the 
weight value.
{quote}

According to {{NetworkTopologyWithNodeGroup.getWeight}}, the weights are:

0 for same node
1 for same group
2 for same rack
3 for off rack 

This aligns with the weight definition of this patch, with an additional intermediate 
level (1 for same group):

0 for same node
2 for same rack

For the off-rack case, {{NetworkTopologyWithNodeGroup.getWeight}} can call 
{{super.getWeight}}, which will calculate the weight using the new logic rather than 
returning 3 for all off-rack nodes.
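
A minimal sketch of that delegation, using self-contained stand-ins rather than the real Hadoop classes. {{SketchNode}}, {{BaseWeightSketch}}, {{NodeGroupWeightSketch}} and {{rackOf}} are hypothetical names, and the sketch assumes a node's location has the form "/<rack>/<nodegroup>".

```java
import java.util.Arrays;

/** Hypothetical stand-in for a topology node; not the Hadoop Node interface. */
final class SketchNode {
  final String location; // assumed form: "/rack1/group1"
  final String name;     // e.g. "datanode1"

  SketchNode(String location, String name) {
    this.location = location;
    this.name = name;
  }

  String path() {
    return location + "/" + name;
  }
}

/** Base sketch: weight is the summed depth below the closest common ancestor. */
class BaseWeightSketch {
  protected int getWeight(SketchNode reader, SketchNode node) {
    String[] a = split(reader.path());
    String[] b = split(node.path());
    int common = 0;
    while (common < a.length && common < b.length
        && a[common].equals(b[common])) {
      common++;
    }
    return (a.length - common) + (b.length - common);
  }

  static String[] split(String path) {
    return Arrays.stream(path.split("/"))
        .filter(s -> !s.isEmpty())
        .toArray(String[]::new);
  }
}

/** Node-group sketch: keeps 0/1/2 for nearby nodes, delegates off-rack cases. */
class NodeGroupWeightSketch extends BaseWeightSketch {
  @Override
  protected int getWeight(SketchNode reader, SketchNode node) {
    if (reader.path().equals(node.path())) {
      return 0; // same node
    }
    if (reader.location.equals(node.location)) {
      return 1; // same node group
    }
    if (rackOf(reader).equals(rackOf(node))) {
      return 2; // same rack, different node group
    }
    // Off rack: graded by path distance instead of a flat 3.
    return super.getWeight(reader, node);
  }

  /** Drops the last location component, assumed to be the node group. */
  private static String rackOf(SketchNode n) {
    return n.location.substring(0, n.location.lastIndexOf('/'));
  }
}
```

With three-level paths the delegated off-rack weight is at least 6 in this sketch, so it still sorts after the 0, 1 and 2 cases.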






[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-28 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702984#comment-15702984
 ] 

Ming Ma commented on HDFS-10206:


* NetworkTopology can be used by HDFS, YARN and MAPREDUCE. It is better to make 
variable names more general.

bq. Out of the three replicas, one will be on an off-rack datanode, which is causing the 
difference
But the reader should pick the closest one, either "Same Node" or "DataNode in 
same rack". Perhaps you can clarify the setup.

bq. Weight calculation after this patch
So the weight value definition has changed. It should be fine given it isn't a 
public interface. Still NetworkTopologyWithNodeGroup has its own getWeight 
definition based on the old definition. Either we update that or keep the 
weight value.






[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-25 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696454#comment-15696454
 ] 

Nandakumar commented on HDFS-10206:
---

Thanks for the comment [~mingma]

{quote}
Any idea why 000.patch makes a difference for the "Same Node" and "DataNode in 
same rack" cases?
{quote}
Out of the three replicas, one will be on an off-rack datanode, which is causing the 
difference.


Based on the comment, {{NetworkTopology.getWeightUsingNetworkLocation}} and 
{{NetworkTopology.normalizeNetworkLocationPath}} are now static. Instead 
of calling {{NetworkTopology.getDistance}} from {{NetworkTopology.getWeight}}, 
the logic to calculate the weight is added in {{getWeight}} itself, which also 
takes care of the isOnSameRack case.

Weight calculation after this patch
- 0 for same node
- 2 for same rack
- After that, each additional level on each node increases the weight by 1

Please review [^HDFS-10206.002.patch]





[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-23 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15692234#comment-15692234
 ] 

Ming Ma commented on HDFS-10206:


Thanks [~nanda619] for the micro benchmark and the new patch.

* Any idea why 000.patch makes a difference for the "Same Node" and "DataNode in 
same rack" cases?
* In the context of the overall data transfer duration, the overhead of 0.1ms 
looks acceptable, especially given that DatanodeManager#sortLocatedBlocks doesn't 
take FSNamesystem's lock.
* It seems getWeightUsingNetworkLocation and normalizeNetworkLocationPath can 
be static.
* The getWeight function calls getDistance, which returns the distance between two 
nodes, not the weight defined as the distance between nodes and ancestors. 
Maybe we can define a new function like getDistanceToClosestCommonAncestor, 
which can also take care of the isOnSameRack case.
* About ReadWriteLock.readLock, it might be ok given that under normal workload 
there won't be many writes to NetworkTopology.





[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-21 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15683203#comment-15683203
 ] 

Nandakumar commented on HDFS-10206:
---

New [^HDFS-10206.001.patch] has been uploaded.

The logic is modified based on [~mingma]'s comment: 
{{DatanodeManager.sortLocatedBlock}}, which already knows whether the reader is a 
datanode, now calls the appropriate method in {{NetworkTopology}} for sorting 
the blocks.

Please review the patch.

{{NetworkTopology.getWeight}} uses {{NetworkTopology.getDistance}}, which takes 
{{ReadWriteLock.readLock()}}. Any thoughts on this?




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-19 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15679807#comment-15679807
 ] 

Nandakumar commented on HDFS-10206:
---

{quote}
Here is another option. DatanodeManager#sortLocatedBlock already knows if it's a 
datanode. So we can have a new NetworkTopology#sortByDistance that supports 
check-by-reference.
{quote}
This option looks better, thanks for the suggestion [~mingma]. I will make the 
changes accordingly and upload a new patch.

In the meantime I was able to do a micro-benchmark with [^HDFS-10206.000.patch]. The 
benchmark only measures the time taken by {{NetworkTopology.sortByDistance}}.

HDFS Version: 2.7.1
Reading a single block file. (file size: 120 MB)
The values are in nanoseconds.

*Without patch*

||Client on|| Run 1 || Run 2 || Run 3 || Run 4 || Run 5 ||
|Same Node| 99710 | 124014 | 134857 | 146936 | 111543 |
|DataNode in same rack| 169122 | 99805 | 124058 | 134566 | 269096 |
|DataNode in different rack| 114552 | 103003 | 153313 | 92008 | 114279 |
|Non-DataNode in same rack| 97960 | 199611 | 77948 | 101324 | 90920 |
|Non-DataNode in different rack| 93002 | 182436 | 104600 | 96434 | 138167 |

*With patch*

||Client on|| Run 1 || Run 2 || Run 3 || Run 4 || Run 5 ||
|Same Node| 121510 | 185741 | 110382 | 180451 | 132131 |
|DataNode in same rack| 182892 | 128597 | 187518 | 136754 | 385739 |
|DataNode in different rack| 201029 | 274671 | 298843 | 146709 | 154405 |
|Non-DataNode in same rack| 92687 | 182100 | 134704 | 277057 | 207532 |
|Non-DataNode in different rack| 245957 | 115076 | 203657 | 181819 | 116314 |

Below is the time taken by {{NetworkTopology.sortByDistance}} for a one GB file 
(eight blocks); the values are in nanoseconds.

*Without patch*

||Client on|| Run 1 || Run 2 || Run 3 || Run 4 || Run 5 ||
|Non-DataNode in same rack| 244535 | 282273 | 216524 | 4410825 | 339375 |

*With patch*

||Client on|| Run 1 || Run 2 || Run 3 || Run 4 || Run 5 ||
|Non-DataNode in same rack| 729701 | 5801405 | 613048 | 655345 | 506294 |
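
The thread does not say how these timings were collected. As a rough illustration only, a single sort call can be timed with {{System.nanoTime}} as sketched below. {{SortTimingSketch}}, {{timeSortNanos}} and the toy weight function are hypothetical, and the sort is a plain comparator sort rather than {{NetworkTopology.sortByDistance}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical timing harness; not the benchmark used in this thread. */
final class SortTimingSketch {

  /** Sorts a copy of the nodes by weight and returns the elapsed nanoseconds. */
  static <T> long timeSortNanos(List<T> nodes, ToIntFunction<T> weight) {
    List<T> copy = new ArrayList<>(nodes);
    long start = System.nanoTime();
    copy.sort(Comparator.comparingInt(weight));
    return System.nanoTime() - start;
  }

  public static void main(String[] args) {
    List<String> replicas =
        Arrays.asList("/rack2/dn3", "/rack1/dn1", "/rack1/dn2");
    // Toy weight: 2 for the reader's rack, 4 for any other rack.
    ToIntFunction<String> weight = p -> p.startsWith("/rack1/") ? 2 : 4;
    for (int run = 1; run <= 5; run++) {
      System.out.println("Run " + run + ": "
          + timeSortNanos(replicas, weight) + " ns");
    }
  }
}
```

Single-shot nanosecond timings like these tend to be noisy, for example because of JIT warm-up, so comparing several runs is more informative than any one value.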





[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-19 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15679692#comment-15679692
 ] 

Ming Ma commented on HDFS-10206:


bq. Any comments on using NetworkTopology.contains(node) to check and 
NetworkTopology.getDistance(node1, node2) to get the distance in case the 
reader is an off rack datanode?
Here is another option. DatanodeManager#sortLocatedBlock already knows if it's a 
datanode. So we can have a new NetworkTopology#sortByDistance that supports 
check-by-reference.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-18 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15676336#comment-15676336
 ] 

Nandakumar commented on HDFS-10206:
---

bq. From the below code, it seems each level will increase by 2.

Yes, you're right. Distance from a node to its parent is assumed to be 1, so 
adding the values for both nodes makes it 2.
Sorry for the confusion on that.

{quote} 
one is the reader being a datanode in a remote rack in a large cluster; for 
that NetworkTopology already has the reader in its tree, it will be faster to 
compare parents reference.
{quote}

We can use {{NetworkTopology.contains(node)}} to check if the reader is a 
datanode and use {{NetworkTopology.getDistance(node1, node2)}} to get the 
distance (which also calculates the distance by summing up the nodes' distances 
to their closest common ancestor), but both of these methods take 
{{ReadWriteLock.readLock()}}, which might again impact the performance. 
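
To illustrate the locking concern, here is a miniature topology guarded by a {{ReentrantReadWriteLock}}. It is a hypothetical stand-in ({{LockedTopologySketch}} is not Hadoop's {{NetworkTopology}}) and models only two levels, rack and node.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical miniature topology; not Hadoop's NetworkTopology. */
final class LockedTopologySketch {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, String> rackOf = new HashMap<>(); // node -> rack

  /** Registers a node; takes the write lock, which excludes all readers. */
  void add(String node, String rack) {
    lock.writeLock().lock();
    try {
      rackOf.put(node, rack);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Membership check under the read lock. */
  boolean contains(String node) {
    lock.readLock().lock();
    try {
      return rackOf.containsKey(node);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** 0 for same node, 2 for same rack, 4 otherwise (two-level sketch only). */
  int getDistance(String a, String b) {
    lock.readLock().lock();
    try {
      if (a.equals(b)) {
        return 0;
      }
      String rackA = rackOf.get(a);
      String rackB = rackOf.get(b);
      return rackA != null && rackA.equals(rackB) ? 2 : 4;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Several threads can hold the read lock at the same time, so in a sketch like this the cost of {{contains}} and {{getDistance}} depends mostly on how often {{add}} takes the write lock.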

Any comments on using {{NetworkTopology.contains(node)}} to check and 
{{NetworkTopology.getDistance(node1, node2)}} to get the distance in case 
the reader is an off rack datanode?

I'm currently working on the benchmarking, will update it once it's done.

> getBlockLocations might not sort datanodes properly by distance
> ---
>
> Key: HDFS-10206
> URL: https://issues.apache.org/jira/browse/HDFS-10206
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Nandakumar
> Attachments: HDFS-10206.000.patch
>
>
> If the DFSClient machine is not a datanode, but it shares its rack with some 
> datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} 
> might not put the local-rack datanodes at the beginning of the sorted list. 
> That is because the function didn't call {{networktopology.add(client);}} to 
> properly set the node's parent node; something required by 
> {{networktopology.sortByDistance}} to compute distance between two nodes in 
> the same topology tree.
> Another issue with {{networktopology.sortByDistance}} is it only 
> distinguishes local rack from remote rack, but it doesn't support general 
> distance calculation to tell how remote the rack is.
> {noformat}
> NetworkTopology.java
>   protected int getWeight(Node reader, Node node) {
> // 0 is local, 1 is same rack, 2 is off rack
> // Start off by initializing to off rack
> int weight = 2;
> if (reader != null) {
>   if (reader.equals(node)) {
> weight = 0;
>   } else if (isOnSameRack(reader, node)) {
> weight = 1;
>   }
> }
> return weight;
>   }
> {noformat}
> HDFS-10203 has suggested moving the sorting from namenode to DFSClient to 
> address another issue. Regardless of where we do the sorting, we still need 
> fix the issues outline here.
> Note that BlockPlacementPolicyDefault shares the same NetworkTopology object 
> used by DatanodeManager and requires Nodes stored in the topology to be 
> {{DatanodeDescriptor}} for block placement. So we need to make sure we don't 
> pollute the NetworkTopology if we plan to fix it on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-17 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674317#comment-15674317
 ] 

Ming Ma commented on HDFS-10206:


bq. that is why getDistanceUsingNetworkLocation is called only when the 
conditions reader.equals(node) and isOnSameRack(reader, node) are not satisfied.
There are two scenarios in which this new function will be called. One is the 
reader being a datanode in a remote rack in a large cluster; in that case 
NetworkTopology already has the reader in its tree, so it will be faster to 
compare parent references. The other is the reader being a non-datanode, where 
the new function will be useful. Do you have any micro benchmark?
bq. With this patch it will be 0 for local, 1 for same rack and after that the 
value is incremented by 1 for each level.
From the code below, it seems each level will increase the weight by 2.
{noformat}
  weight = (path1Token.length - currentLevel) +
      (path2Token.length - currentLevel);
{noformat}
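
To make the arithmetic concrete, here is a minimal sketch of a path-based distance built around the expression quoted above. The class name, the {{tokens}} helper and sample locations such as {{/d1/r1}} are assumptions made for illustration; only the final expression comes from the patch excerpt, and the actual {{getDistanceUsingNetworkLocation}} may differ.

```java
// Illustrative sketch only; not the code from HDFS-10206.000.patch.
final class NetworkLocationDistance {

  // Splits a location such as "/d1/r1" into its components ["d1", "r1"].
  static String[] tokens(String location) {
    String trimmed = location.startsWith("/") ? location.substring(1) : location;
    return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
  }

  // Counts the hops from each location up to their closest common ancestor.
  static int distance(String location1, String location2) {
    String[] path1Token = tokens(location1);
    String[] path2Token = tokens(location2);
    // currentLevel ends up as the number of leading components both paths share.
    int currentLevel = 0;
    int shorter = Math.min(path1Token.length, path2Token.length);
    while (currentLevel < shorter
        && path1Token[currentLevel].equals(path2Token[currentLevel])) {
      currentLevel++;
    }
    return (path1Token.length - currentLevel)
        + (path2Token.length - currentLevel);
  }
}
```

Under these assumptions, {{distance("/d1/r1", "/d1/r2")}} is 2 and {{distance("/d1/r1", "/d2/r1")}} is 4: moving the common ancestor up one level adds one hop on each side, hence 2 per level.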






[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-17 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673673#comment-15673673
 ] 

Nandakumar commented on HDFS-10206:
---

Thanks for the review [~mingma].

{quote}
When the conditions {{reader.equals(node) & isOnSameRack(reader, node) }} 
aren't satisfied, this patch will cause extra string parsing. Wonder if there 
is any major performance impact. If that isn't an issue, can 
getDistanceUsingNetworkLocation handle all scenarios including 
{{reader.equals(node) & isOnSameRack(reader, node) }}?
{quote}
I was also worried about the performance impact of the extra string parsing; 
that is why {{getDistanceUsingNetworkLocation}} is called only when the 
conditions {{reader.equals(node)}} and {{isOnSameRack(reader, node)}} are not 
satisfied. 

{quote}
It probably doesn't matter much. getWeight used to return 0, 1, 2, 3, etc. as 
network layer increases. With the patch it changes to 0, 1, 2, 4, etc.
{quote}
I didn't quite understand this point. Previously, {{getWeight}} returned 0 for 
local, 1 for same rack and 2 for off rack. With this patch it will be 0 for 
local, 1 for same rack and after that the value is incremented by 1 for each 
level.




[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-16 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673058#comment-15673058
 ] 

Ming Ma commented on HDFS-10206:


Thanks [~nandakumar131]!

* When the conditions {{reader.equals(node) & isOnSameRack(reader, node) }} 
aren't satisfied, this patch will cause extra string parsing. Wonder if there 
is any major performance impact. If that isn't an issue, can 
getDistanceUsingNetworkLocation handle all scenarios including 
{{reader.equals(node) & isOnSameRack(reader, node) }}?
* It probably doesn't matter much. {{getWeight}} used to return 0, 1, 2, 3, 
etc. as network layer increases. With the patch it changes to 0, 1, 2, 4, etc.





[jira] [Commented] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance

2016-11-09 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651089#comment-15651089
 ] 

Nandakumar commented on HDFS-10206:
---


{{NetworkTopology#sortByDistance}} uses {{NetworkTopology#getWeight}} to 
calculate the distance between reader and node. Additional logic is added in 
{{NetworkTopology#getWeight}} to calculate the distance based on the 
networkLocation of the reader and the node when the following conditions are 
not satisfied: 
bq. reader.equals(node)  & isOnSameRack(reader, node)

This will work for a DFSClient machine that is not a datanode, since the 
distance calculation depends on networkLocation and not on the parent Node.
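
As a rough illustration of the idea, the sketch below sorts replicas for a reader that is not part of the topology tree. {{SimpleNode}}, {{WeightSketch}}, {{locationDistance}} and {{sortByWeight}} are hypothetical stand-ins rather than Hadoop classes or the patch's code, and the same-rack check here compares location strings as a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; these types are hypothetical, not Hadoop classes.
final class WeightSketch {

  // A node identified by name, carrying its rack path as networkLocation.
  static final class SimpleNode {
    final String name;
    final String networkLocation;

    SimpleNode(String name, String networkLocation) {
      this.name = name;
      this.networkLocation = networkLocation;
    }
  }

  // Splits a location such as "/d1/r1" into ["d1", "r1"].
  static String[] tokens(String location) {
    String trimmed = location.startsWith("/") ? location.substring(1) : location;
    return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
  }

  // Hops from each location up to their closest common ancestor.
  static int locationDistance(String a, String b) {
    String[] p1 = tokens(a);
    String[] p2 = tokens(b);
    int common = 0;
    while (common < Math.min(p1.length, p2.length) && p1[common].equals(p2[common])) {
      common++;
    }
    return (p1.length - common) + (p2.length - common);
  }

  // 0 is local, 1 is same rack; otherwise fall back to the location strings.
  static int getWeight(SimpleNode reader, SimpleNode node) {
    if (reader == null) {
      return 2; // no reader information: treat as off rack
    }
    if (reader.name.equals(node.name)) {
      return 0;
    }
    if (reader.networkLocation.equals(node.networkLocation)) {
      return 1;
    }
    // No parent pointers are needed here, so the reader need not be in the tree.
    return locationDistance(reader.networkLocation, node.networkLocation);
  }

  // Returns a copy of nodes ordered by ascending weight relative to reader.
  static List<SimpleNode> sortByWeight(SimpleNode reader, List<SimpleNode> nodes) {
    List<SimpleNode> sorted = new ArrayList<>(nodes);
    sorted.sort(Comparator.comparingInt((SimpleNode n) -> getWeight(reader, n)));
    return sorted;
  }
}
```

With a client at {{/d1/r1}} and replicas at {{/d2/r3}}, {{/d1/r2}} and {{/d1/r1}}, this sketch orders them same rack first, then same datacenter, then the remote datacenter.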

Please review the patch.

Thanks,
Nanda

