Wei-Chiu Chuang created HDFS-9599:
-------------------------------------
Summary: TestDecommissioningStatus.testDecommissionStatus occasionally fails
Key: HDFS-9599
URL: https://issues.apache.org/jira/browse/HDFS-9599
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Environment: Jenkins
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang
From the test results of a recent Jenkins nightly build:
https://builds.apache.org/job/Hadoop-Hdfs-trunk/2663/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestDecommissioningStatus/testDecommissionStatus/
The test failed because the number of under-replicated blocks was 4 instead of the expected 3.
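For context, the failing check is of roughly this shape (a simplified sketch, not the exact test code; the {{decommissioningStatus}} field and its getter are assumed from the HDFS {{DatanodeDescriptor}} class):
{code:java}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;

// Simplified sketch of the failing check: the test expects an exact count of
// under-replicated blocks on the node being decommissioned, so a single
// stray block (4 instead of 3) is enough to fail the assertion.
private void checkUnderReplicated(DatanodeDescriptor decommNode,
    int expectedUnderRep) {
  assertEquals("Unexpected number of under-replicated blocks",
      expectedUnderRep,
      decommNode.decommissioningStatus.getUnderReplicatedBlocks());
}
{code}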
Looking at the log, there is a stray block which might have caused the failure:
{noformat}
2015-12-23 00:42:05,820 [Block report processor] INFO BlockStateChange (BlockManager.java:processReport(2131)) - BLOCK* processReport: blk_1073741825_1001 on node 127.0.0.1:57382 size 16384 does not belong to any file
{noformat}
The block size 16384 suggests this is left over from the sibling test case testDecommissionStatusAfterDNRestart. This can happen because the same MiniDFSCluster is reused across test cases. The test implementation should do a better job of isolating the tests, e.g. along the lines of the sketch below.
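One possible hardening, assuming the JUnit 4 fixtures the test already uses (the field names {{conf}}, {{cluster}} and {{fileSys}} are taken from the existing test class; the setup details are only a sketch):
{code:java}
import java.io.IOException;

import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.After;
import org.junit.Before;

// Rebuild the mini cluster around every test instead of sharing one instance
// across the class, so blocks written by one test cannot leak into the next.
@Before
public void setUp() throws IOException {
  conf = new HdfsConfiguration();
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(2).build();
  cluster.waitActive();
  fileSys = cluster.getFileSystem();
}

@After
public void tearDown() throws IOException {
  if (fileSys != null) {
    fileSys.close();
  }
  if (cluster != null) {
    cluster.shutdown();
  }
}
{code}
This makes the run slower, but every case then starts from an empty namespace.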
Another failure mode occurs when the load factor comes into play and a block cannot find enough datanodes on which to place its replicas. In this test, replica placement should not consider load:
{noformat}
conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, false);
{noformat}
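Note that this setting must go into the configuration before the cluster is built; a minimal sketch, assuming a setup like the one above:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// Disable load-based replica placement before starting the cluster, so the
// placement policy never skips a datanode just because it looks busy.
Configuration conf = new HdfsConfiguration();
conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, false);
MiniDFSCluster cluster =
    new MiniDFSCluster.Builder(conf).numDataNodes(2).build();
cluster.waitActive();
{code}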