[
https://issues.apache.org/jira/browse/HDFS-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022194#comment-16022194
]
Lei (Eddy) Xu commented on HDFS-11682:
--------------------------------------
In {{TestBalancer#testBalancerWithStripedFile}}, the test creates a file with 72
data blocks (20 * 12 * 3 / 10). Under RS(6, 3) coding, that is 72 / 6 * (6 + 3) =
108 blocks in total.
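As a quick sanity check, here is a sketch of that arithmetic (the variable names
below are illustrative, not the test's actual constants):
{code}
// Illustrative sketch of the block-count arithmetic above; names are
// hypothetical, not the actual TestBalancer constants.
public class StripedBlockCountSketch {
  public static void main(String[] args) {
    int dataUnits = 6;                       // RS(6, 3): 6 data blocks per group
    int parityUnits = 3;                     // RS(6, 3): 3 parity blocks per group
    int numDataBlocks = 20 * 12 * 3 / 10;    // = 72 data blocks in the test file
    int numBlockGroups = numDataBlocks / dataUnits;                // = 12 groups
    int totalBlocks = numBlockGroups * (dataUnits + parityUnits);  // = 108 blocks
    System.out.println(totalBlocks);         // prints 108
  }
}
{code}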
Then the test adds two more DataNodes in a new rack:
{code}
// add datanodes in new rack
String newRack = "/rack" + (++numOfRacks);
cluster.startDataNodes(conf, 2, true, null,
new String[]{newRack, newRack}, null,
new long[]{capacity, capacity});
{code}
After that, there are 14 DataNodes in the cluster before running {{Balancer}}.
With some additional debug logging, the log shows:
{code}
17-05-23 18:17:23,186 [Thread-0] INFO balancer.Balancer
(Balancer.java:init(380)) - Above avg: 127.0.0.1:60027:DISK, util=50.000000,
avg=40.000000, diff=10.000000, threshold=10.000000
{code}
So this DataNode, {{127.0.0.1:60027}}, is not chosen as a source for
balancing, because {{50.0 - 40.0 <= 10.0}}. But the actual average utilization
is {{108 / (14 * 20)}} ≈ 38.57. Thus {{TestBalancer#waitForBalancer}}
fails, because {{50.0 - 38.57 > 10.0}}. This is because the NN has not yet
received a new block report reflecting the moved blocks.
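To make the mismatch concrete, here is a minimal sketch of the two comparisons
involved (illustrative only, not the actual {{Balancer}} / {{TestBalancer}} code):
{code}
// Illustrative only -- not the actual Balancer / TestBalancer logic.
public class BalancerThresholdSketch {
  public static void main(String[] args) {
    double util = 50.0;                              // utilization of 127.0.0.1:60027
    double threshold = 10.0;

    double staleAvg = 40.0;                          // average the Balancer saw
    double actualAvg = 108.0 / (14 * 20) * 100;      // ~38.57, true cluster average

    // Balancer source selection: the node is skipped because the diff
    // does not exceed the threshold.
    boolean pickedAsSource = (util - staleAvg) > threshold;   // 10.0 > 10.0 -> false

    // TestBalancer#waitForBalancer: the node still looks unbalanced
    // against the true average.
    boolean balanced = (util - actualAvg) <= threshold;       // 11.43 <= 10.0 -> false

    System.out.println(pickedAsSource + " " + balanced);      // false false
  }
}
{code}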
> TestBalancer#testBalancerWithStripedFile is flaky
> -------------------------------------------------
>
> Key: HDFS-11682
> URL: https://issues.apache.org/jira/browse/HDFS-11682
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 3.0.0-alpha3
> Reporter: Andrew Wang
> Assignee: Lei (Eddy) Xu
> Attachments: IndexOutOfBoundsException.log, timeout.log
>
>
> Saw this fail in two different ways on a precommit run, but pass locally.