[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-04-16 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439275#comment-16439275
 ] 

genericqa commented on HDFS-13279:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 31m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 30m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m  4s{color} | {color:orange} root: The patch generated 9 new + 39 unchanged - 
1 fixed = 48 total (was 40) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
42s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 44s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}264m 31s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.blockmanagement.TestWeightedRackBlockPlacementPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | HDFS-13279 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12919168/HDFS-13279.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f8a6fe236faf 

[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-04-16 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439024#comment-16439024
 ] 

Tao Jie commented on HDFS-13279:


Updated the patch. With the commit in HDFS-13418, we have not to change the 
current logic in the latest attached patch.

> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch, HDFS-13279.002.patch, 
> HDFS-13279.003.patch, HDFS-13279.004.patch, HDFS-13279.005.patch, 
> HDFS-13279.006.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-04-10 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432482#comment-16432482
 ] 

Tao Jie commented on HDFS-13279:


Thank you [~ajayydv] for reply!
1, I set up another jira HDFS-13418, and I will try not to modify any current 
logic.
2, I try to make my point clear. There are 2 approaches to choose the right 
node:
# As you mentioned, first choose the right rack with weight, then choose a node 
from the selected rack.
# As the code in patch,  first choose a node as it used to, then check if it 
need another choosing according to the weight of the rack.

I understand approach 1 could be more accurate, also approach 2 works well as I 
have tested. I prefer approach 2 because it reuse rather than rewrite the 
current logic of choosing nodes.
For the consideration of performance, approach 2 do have some overhead. In a 
certain proportion of choosing, we may choose twice. But I don't think the 
chance will be over 10% in a typical case. For approach 1, I think we should 
have more test for the performance since the choosing logic is totally changed 
and it seems to be heavier than the former logic.


> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch, HDFS-13279.002.patch, 
> HDFS-13279.003.patch, HDFS-13279.004.patch, HDFS-13279.005.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-04-10 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431772#comment-16431772
 ] 

Ajay Kumar commented on HDFS-13279:
---

{quote}HDFS-11998 sets DFSNetworkTopology as default topology implementation 
even though net.topology.impl is set to NetworkTopology. In HDFS-11530, once 
dfs.use.dfs.network.topology is true, the implementation is hard code to 
DFSNetworkTopology no matter what net.topology.impl is. So we have to modify 
the behavior if we need to add a new topology implementation and let it work. 
Maybe we could fix it in another Jira?{quote}
This is the reason i suggested modifying {{DatanodeManager#init}}, may be can 
make it generic enough to handle similar scnerios in future.I think its ok to 
address it in another jira but then we should not modify the current default 
behaviour as new implementation is not tested at scale.

{quote}It is OK if we use first choose a rack then choose a node logic in 
chooseRandom. The purpose of a twice choosing is to mostly reuse the current 
choosing logic, which make the code more easier.{quote}
Sorry, do not understand your point clearly on this one. I think choosing node 
twice without considering weight increases the probability wrong node being 
selected first time which makes it little costlier compared to choosing rack 
initially based on rack weight. 



> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch, HDFS-13279.002.patch, 
> HDFS-13279.003.patch, HDFS-13279.004.patch, HDFS-13279.005.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-04-05 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427895#comment-16427895
 ] 

Tao Jie commented on HDFS-13279:


[~ajayydv] Thank you for your comments.
{quote}
Don't remove the default impl entry from core-default.xml.
Similarly we can do this with almost no change in  Network Topology, 
DFSNetworkTopology and TestBalancerWithNodeGroup. We can  update 
DatanodeManager#init to instantiate according to value of "net.topology.impl".
{quote}
Actually in original patch based on 2.8.2, we don't need to modify 
core-default.xml, Network Topology, TestBalancerWithNodeGroup. It is a little 
tricky in HDFS-11998 which set DFSNetworkTopology as default topology 
implementation even though {{net.topology.impl}} is set to NetworkTopology. In 
HDFS-11530, once {{dfs.use.dfs.network.topology}} is true, the implementation 
is hard code to {{DFSNetworkTopology}} no matter what {{net.topology.impl}} is. 
So we have to modify the behavior if we need to add a new topology 
implementation and let it work. Maybe we could fix it in another Jira?
{quote}
L43, chooseDataNode: Instead of choosing datanode twice we can just call 
super.chooseRandom if we overide chooseRandom in 
NetworkTopologyWithWeightedRack. This way we can avoid calling chooseRandom 
twice.
{quote}
It is OK if we use {{first choose a rack then choose a node}} logic in 
{{chooseRandom}}. The purpose of a twice choosing is to mostly reuse the 
current choosing logic, which make the code more easier:)

> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch, HDFS-13279.002.patch, 
> HDFS-13279.003.patch, HDFS-13279.004.patch, HDFS-13279.005.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-04-05 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427301#comment-16427301
 ] 

Ajay Kumar commented on HDFS-13279:
---

[~Tao Jie], Thanks for updating the patch. Few comments 
 * We should try to avoid any change in default behaviour. 
 ** Don't remove the default impl entry from core-default.xml.
 ** Similarly we can do this with almost no change in  Network Topology, 
DFSNetworkTopology and TestBalancerWithNodeGroup. We can  update 
DatanodeManager#init to instantiate according to value of "net.topology.impl".
 * NetworkTopologyWithWeightedRack
 ** Shall we override chooseRandom & chooseRandomWithStorageType to  select 
next random node utilizing weights of rack. i.e first select rack according to 
weighted probability and then select a node in that rack.
 *** choose rack according to weighted probability and excluded nodes.
 *** choose random node from rack selected on first step.
 *** repeat until a node is found or all racks in scope are tried at least once.
 ** L29, shall we simplify javadoc for class to something like "The class 
extends NetworkTopology to represents racks weighted according to number of 
datanodes."
 ** L38 LimitedPrivate to Private
 * WeightedRackBlockPlacementPolicy
 ** L43, chooseDataNode: Instead of choosing datanode twice we can just call 
super.chooseRandom if we overide chooseRandom in 
NetworkTopologyWithWeightedRack. This way we can avoid calling chooseRandom 
twice.
 ** Private and Unstable annotation for class

 

 

> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch, HDFS-13279.002.patch, 
> HDFS-13279.003.patch, HDFS-13279.004.patch, HDFS-13279.005.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-04-04 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425385#comment-16425385
 ] 

genericqa commented on HDFS-13279:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 11s{color} | {color:orange} root: The patch generated 9 new + 61 unchanged - 
1 fixed = 70 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 25s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m  
2s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 41s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}240m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.blockmanagement.TestWeightedRackBlockPlacementPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | HDFS-13279 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917511/HDFS-13279.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux f47fbbb575ba 3.13.0-143-generic #192-Ubuntu 

[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-29 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419647#comment-16419647
 ] 

genericqa commented on HDFS-13279:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 
29s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m  7s{color} | {color:orange} root: The patch generated 12 new + 39 unchanged 
- 1 fixed = 51 total (was 40) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 42s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m  5s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}224m 36s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.shell.TestCopyFromLocal |
|   | hadoop.conf.TestCommonConfigurationFields |
|   | hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | HDFS-13279 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916849/HDFS-13279.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux b5f802ae4c2e 3.13.0-141-generic 

[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-26 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413612#comment-16413612
 ] 

Tao Jie commented on HDFS-13279:


Thank you [~ajayydv] for your comments.
{quote}
Instead if modifying current default placement policy we should create new one 
by extending DefaultBlockPlacementPolicy/BlockPlacementPolicy.
{quote}
Agree! I will try to add a new blockPlacementPolicy in the next patch. However 
we may still modify {{NetworkTopology}}, in which we calculate weight for each 
rack according node number of each rack. We'd better do the calculation when 
adding/removing nodes rather than choosing nodes for blocks.
 

> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch, HDFS-13279.002.patch, 
> HDFS-13279.003.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-23 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411752#comment-16411752
 ] 

Ajay Kumar commented on HDFS-13279:
---

[~Tao Jie], thanks for working on this. This will be a good improvement for 
clusters with different no of DNs per rack.  You might have already considered 
this but i think it is best to handle this as part of new placement policy. 
Below are some thoughts on this:
 * Instead of optimizing just for small rack it will be best to choose next 
rack according to weighted probability of racks. (computed according to no of 
DataNodes per rack)
 * Instead if modifying current default placement policy we should create new 
one by extending DefaultBlockPlacementPolicy/BlockPlacementPolicy.

 

> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch, HDFS-13279.002.patch, 
> HDFS-13279.003.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-22 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409445#comment-16409445
 ] 

genericqa commented on HDFS-13279:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 27s{color} | {color:orange} root: The patch generated 13 new + 150 unchanged 
- 0 fixed = 163 total (was 150) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 24s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 28s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}107m  7s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}197m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.util.TestDiskChecker |
|   | hadoop.util.TestReadWriteDiskValidator |
|   | hadoop.hdfs.server.blockmanagement.TestBlockPlacementPolicyWithSmallRack |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | HDFS-13279 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12915621/HDFS-13279.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a6daca5bbef0 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 

[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-21 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408379#comment-16408379
 ] 

genericqa commented on HDFS-13279:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
13s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  7m  
8s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 15m 31s{color} 
| {color:red} root generated 187 new + 1085 unchanged - 0 fixed = 1272 total 
(was 1085) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 47s{color} | {color:orange} root: The patch generated 13 new + 150 unchanged 
- 0 fixed = 163 total (was 150) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 59s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m  7s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 49s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}204m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.util.TestDiskChecker |
|   | hadoop.util.TestReadWriteDiskValidator |
|   | hadoop.hdfs.server.blockmanagement.TestBlockPlacementPolicyWithSmallRack |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | HDFS-13279 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12915499/HDFS-13279.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8ba343b17712 4.4.0-89-generic 

[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-21 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407762#comment-16407762
 ] 

genericqa commented on HDFS-13279:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} HDFS-13279 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13279 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12915451/HDFS-13279.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23608/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-21 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407728#comment-16407728
 ] 

Tao Jie commented on HDFS-13279:


Updated patch and add a flag in 
configuration:{{net.topology.modify.small.rack}}. With this option on, it would 
add a modifier to the probability of choosing nodes in small rack.
I have made some test about this modifier:
||TotalNodes||Racks||NodesPerNormalRack||NodesPerSmallRack||Probalility to 
Small Rack without the modifier|| Probalility to Small Rack with the 
modifier(better close to 1)||
|35|3|15|5|1.334|1.103|
|50|4|15|5|1.189|1.023|
|65|5|15|5|1.114|1.035|
|140|10|15|5|1.087|1.014|
|95|4|30|5|1.247|1.030|
|155|4|50|5|1.288|1.030|
|455|10|50|5|1.108|1.014|


> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Priority: Major
> Attachments: HDFS-13279.001.patch
>
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal

2018-03-15 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400065#comment-16400065
 ] 

Tao Jie commented on HDFS-13279:


It seems not easy to make probability absolutely equal for each node, but we 
can add a modifier to those nodes in smaller rack and avoid them from being 
written too frequently.

Any thought?

> Datanodes usage is imbalanced if number of nodes per rack is not equal
> --
>
> Key: HDFS-13279
> URL: https://issues.apache.org/jira/browse/HDFS-13279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Tao Jie
>Priority: Major
>
> In a Hadoop cluster, number of nodes on a rack could be different. For 
> example, we have 50 Datanodes in all and 15 datanodes per rack, it would 
> remain 5 nodes on the last rack. In this situation, we find that storage 
> usage on the last 5 nodes would be much higher than other nodes.
>  With the default blockplacement policy, for each block, the first 
> replication has the same probability to write to each datanode, but the 
> probability for the 2nd/3rd replication to write to the last 5 nodes would 
> much higher than to other nodes. 
>  Consider we write 50 blocks to such 50 datanodes. The first rep of 100 block 
> would distirbuted to 50 node equally. The 2rd rep of blocks which the 1st rep 
> is on rack1(15 reps) would send equally to other 35 nodes and each nodes 
> receive 0.428 rep. So does blocks on rack2 and rack3. As a result, node on 
> rack4(5 nodes) would receive 1.29 replications in all, while other node would 
> receive 0.97 reps.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org