[
https://issues.apache.org/jira/browse/HDFS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064409#comment-13064409
]
Matt Foley commented on HDFS-2114:
----------------------------------
Hi John,
TestDecommission.java:
1. I know you didn't write TestDecommission.checkFile(), but the {{isNodeDown}}
flag this patch adds to it raises a number of questions:
* Aren't {{(isNodeDown && nodes[j].getName().equals(downnode))}} and
{{(nodes[j].getName().equals(downnode))}} always the same? So is the first use of
{{isNodeDown}} even necessary?
* Can there be any cases where {{(nodes[j].getName().equals(downnode))}} and
{{(nodes[j].isDecommissioned())}} differ? If not, can't the following two blocks
be merged? The placement of the "is decommissioned" log message further
confuses this issue.
{code}
if (isNodeDown && nodes[j].getName().equals(downnode)) {
  hasdown++;
  LOG.info("Block " + blk.getBlock() + " replica " + nodes[j].getName()
      + " is decommissioned.");
}
if (nodes[j].isDecommissioned()) {
  if (firstDecomNodeIndex == -1) {
    firstDecomNodeIndex = j;
  }
  continue;
}
{code}
* What is the purpose of the assertion
{code}
if (isNodeDown)
  assertEquals("Decom node is not at the end", firstDecomNodeIndex, -1);
{code}
And why does it even work, since the node to decommission is chosen at random?
* And in the same block, why is it important to condition it on {{isNodeDown}},
since {{(!isNodeDown)}} implies there shouldn't be any decommissioned nodes? So
the second use of {{isNodeDown}} also seems unnecessary.
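If the answer to both questions is that only the decommissioned node can be down, the loop body could collapse to a single {{isDecommissioned()}} branch. The following is only a sketch against a minimal stand-in for {{DatanodeInfo}} ({{Node}}, {{MergedCheck}}, and {{countDownReplicas}} are hypothetical names, not code from the patch):

```java
// Minimal stand-in for DatanodeInfo, purely for illustration.
class Node {
    private final String name;
    private final boolean decommissioned;

    Node(String name, boolean decommissioned) {
        this.name = name;
        this.decommissioned = decommissioned;
    }

    String getName() { return name; }
    boolean isDecommissioned() { return decommissioned; }
}

public class MergedCheck {
    // Sketch of the merged loop: one isDecommissioned() test drives both
    // the down-replica counter and the firstDecomNodeIndex bookkeeping.
    static int countDownReplicas(Node[] nodes, String downnode) {
        int hasdown = 0;
        int firstDecomNodeIndex = -1;
        for (int j = 0; j < nodes.length; j++) {
            if (nodes[j].isDecommissioned()) {
                if (nodes[j].getName().equals(downnode)) {
                    hasdown++;
                }
                if (firstDecomNodeIndex == -1) {
                    firstDecomNodeIndex = j;
                }
            }
        }
        return hasdown;
    }
}
```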
2. Regarding the timeout: In TestDecommission, you appropriately set
{{BLOCKREPORT_INTERVAL_MSEC}} down to 1 sec to match the {{HEARTBEAT_INTERVAL}}. You
may also want to consider {{DFS_NAMENODE_REPLICATION_INTERVAL_KEY}} (default 3
sec). Adjusting this value to 1 might allow testRecommission() to run in 5 sec
instead of 10.
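Concretely, the setup could look something like this (a sketch only; the exact {{DFSConfigKeys}} constant names and setter types should be checked against trunk before use):

```java
// Test setup sketch: bring the replication monitor down to the same
// 1-second cadence as the heartbeat and block report intervals.
conf.setLong(DFSConfigKeys.DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, BLOCKREPORT_INTERVAL_MSEC); // 1 sec
conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, HEARTBEAT_INTERVAL);               // 1 sec
conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_KEY, 1);                      // default 3 sec
```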
3. I wish there were a clear way to share the duplicated code between
{{testDecommission}} and {{testRecommission}}, but I couldn't see an obviously better
choice. Can you think of a way to improve this?
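One option, just to sketch the idea (all names below are hypothetical, not from the patch): drive both tests through a single parameterized helper, with recommission as the only varying step, and leave the assertions to each caller.

```java
public class DecommissionDriver {
    // Hypothetical seam: the cluster operations both tests need.
    interface Cluster {
        void decommissionNode(String node);
        void recommissionNode(String node);
        int liveReplicas();
    }

    // Shared body: decommission, optionally recommission, then report
    // the resulting replica count for the caller's assertions.
    static int runScenario(Cluster cluster, String node, boolean recommission) {
        cluster.decommissionNode(node);
        if (recommission) {
            cluster.recommissionNode(node);
        }
        return cluster.liveReplicas();
    }
}
```

The real tests would pass a thin adapter over MiniDFSCluster rather than this toy interface; the point is only that the decommission/verify skeleton is written once.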
FSNamesystem and BlockManager:
The implementation looks okay to me.
The failure of unit test TestTrash.testTrashEmptier does not seem to be related
to this change. It is probably related to HDFS-7326, although the symptoms
reported are slightly different.
> re-commission of a decommissioned node does not delete excess replica
> ---------------------------------------------------------------------
>
> Key: HDFS-2114
> URL: https://issues.apache.org/jira/browse/HDFS-2114
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: John George
> Assignee: John George
> Attachments: HDFS-2114-2.patch, HDFS-2114-3.patch, HDFS-2114.patch
>
>
> If a decommissioned node is removed from the decommissioned list, namenode
> does not delete the excess replicas it created while the node was
> decommissioned.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira