Yiqun Lin created HDFS-10803:
--------------------------------
Summary:
TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails
intermittently due to no free space available
Key: HDFS-10803
URL: https://issues.apache.org/jira/browse/HDFS-10803
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Yiqun Lin
Assignee: Yiqun Lin
The test {{TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools}}
fails intermittently. The stack trace
(https://builds.apache.org/job/PreCommit-HDFS-Build/16534/testReport/org.apache.hadoop.hdfs.server.balancer/TestBalancerWithMultipleNameNodes/testBalancing2OutOf3Blockpools/):
{code}
java.io.IOException: Creating block, no free space available
	at org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset$BInfo.<init>(SimulatedFSDataset.java:151)
	at org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset.injectBlocks(SimulatedFSDataset.java:580)
	at org.apache.hadoop.hdfs.MiniDFSCluster.injectBlocks(MiniDFSCluster.java:2679)
	at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.unevenDistribution(TestBalancerWithMultipleNameNodes.java:405)
	at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancing2OutOf3Blockpools(TestBalancerWithMultipleNameNodes.java:516)
{code}
The error message means that the datanode's capacity has been used up and
there is no space left to create a new block.
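To illustrate this failure mode, here is a minimal sketch of a fixed-capacity storage that rejects a block once the used bytes would exceed the capacity. The class {{SimulatedStorageSketch}} and its methods are hypothetical names made up for this example; it is not the actual {{SimulatedFSDataset}} code, only a simplified model of the check that appears to produce the exception above.

```java
import java.io.IOException;

/** Hypothetical, simplified model of a storage with a fixed simulated capacity. */
public class SimulatedStorageSketch {
    private final long capacity; // total bytes this storage can hold
    private long used;           // bytes already allocated to blocks

    public SimulatedStorageSketch(long capacity) {
        this.capacity = capacity;
    }

    /** Reserve space for one block, or fail if it does not fit. */
    public void alloc(long blockLen) throws IOException {
        if (used + blockLen > capacity) {
            // Same message as in the stack trace; the real check may differ.
            throw new IOException("Creating block, no free space available");
        }
        used += blockLen;
    }

    public long getFree() {
        return capacity - used;
    }
}
```

With a capacity of 100 bytes, a first {{alloc(60)}} succeeds and leaves 40 bytes free; a second {{alloc(60)}} throws the {{IOException}}.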
After looking into the code, the main cause seems to be that the
{{capacities}} for the cluster are not constructed correctly for the second
cluster startup, before the test redistributes the blocks.
The related code:
{code}
    // Here we do redistribute blocks nNameNodes times for each node,
    // we need to adjust the capacities. Otherwise it will cause the no
    // free space errors sometimes.
    final MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .nnTopology(MiniDFSNNTopology.simpleFederatedTopology(nNameNodes))
        .numDataNodes(nDataNodes)
        .racks(racks)
        .simulatedCapacities(newCapacities)
        .format(false)
        .build();
    LOG.info("UNEVEN 11");
    ...
    for(int n = 0; n < nNameNodes; n++) {
      // redistribute blocks
      final Block[][] blocksDN = TestBalancer.distributeBlocks(
          blocks[n], s.replication, distributionPerNN);

      for(int d = 0; d < blocksDN.length; d++)
        cluster.injectBlocks(n, d, Arrays.asList(blocksDN[d]));

      LOG.info("UNEVEN 13: n=" + n);
    }
{code}
Because blocks are injected once per namenode, the {{totalUsed}} value grows
to {{nNameNodes*usedSpacePerNN}} rather than {{usedSpacePerNN}}.
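The arithmetic can be sketched as follows. This is only an illustration with made-up numbers; {{CapacityArithmeticSketch}}, {{totalUsed}} and {{fits}} are hypothetical helpers, and scaling the capacity by {{nNameNodes}} is one possible adjustment assumed here, not necessarily what the test does.

```java
/** Hypothetical helper showing why capacity sized for one namenode is too small. */
public class CapacityArithmeticSketch {

    /** Total bytes used after blocks are injected once for every namenode. */
    public static long totalUsed(long usedSpacePerNN, int nNameNodes) {
        return usedSpacePerNN * nNameNodes;
    }

    /** True if the total capacity can hold the blocks of all namenodes. */
    public static boolean fits(long totalCapacity, long usedSpacePerNN, int nNameNodes) {
        return totalUsed(usedSpacePerNN, nNameNodes) <= totalCapacity;
    }

    public static void main(String[] args) {
        long totalCapacity = 1000L;  // made-up value, sized for a single namenode
        long usedSpacePerNN = 400L;  // made-up value
        int nNameNodes = 3;

        // One namenode: 400 <= 1000, the blocks fit.
        System.out.println(fits(totalCapacity, usedSpacePerNN, 1));
        // Three namenodes: 1200 > 1000, injection runs out of space.
        System.out.println(fits(totalCapacity, usedSpacePerNN, nNameNodes));
        // Capacity scaled by nNameNodes (assumed adjustment): 1200 <= 3000.
        System.out.println(fits(totalCapacity * nNameNodes, usedSpacePerNN, nNameNodes));
    }
}
```

Whether the failure shows up in a given run also depends on how the blocks are distributed across the datanodes, which would explain why the test fails only intermittently.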