[
https://issues.apache.org/jira/browse/HDFS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064409#comment-13064409
]
Matt Foley commented on HDFS-2114:
----------------------------------
Hi John,
TestDecommission.java:
1. I know you didn't write TestDecommission.checkFile(), but the {{isNodeDown}}
flag this patch adds to it raises a number of questions:
* Aren't {{(isNodeDown && nodes[j].getName().equals(downnode))}} and
{{(nodes[j].getName().equals(downnode))}} always the same? So is the first use of
{{isNodeDown}} even necessary?
* Can there be any cases where {{(nodes[j].getName().equals(downnode))}} and
{{(nodes[j].isDecommissioned())}} differ? If not, can't the following two blocks
be merged? The placement of the "is decommissioned" log message further
confuses this issue.
{code}
if (isNodeDown && nodes[j].getName().equals(downnode)) {
  hasdown++;
  LOG.info("Block " + blk.getBlock() + " replica " + nodes[j].getName()
      + " is decommissioned.");
}
if (nodes[j].isDecommissioned()) {
  if (firstDecomNodeIndex == -1) {
    firstDecomNodeIndex = j;
  }
  continue;
}
{code}
* What is the purpose of the assertion
{code}
if (isNodeDown)
  assertEquals("Decom node is not at the end", firstDecomNodeIndex, -1);
{code}
And why does it even work, since the node to decommission is chosen at random?
* And in the same block, why is it important to condition it on {{isNodeDown}},
since {{(!isNodeDown)}} implies there shouldn't be any decommissioned nodes? So
the second use of {{isNodeDown}} also seems unnecessary.
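If the answer to both questions is that only the decommissioned node can be down, the loop body could collapse to a single {{isDecommissioned()}} branch. The following is only a sketch against a minimal stand-in for {{DatanodeInfo}} ({{Node}}, {{MergedCheck}}, and {{countDownReplicas}} are hypothetical names, not code from the patch):

```java
// Minimal stand-in for DatanodeInfo, purely for illustration.
class Node {
    private final String name;
    private final boolean decommissioned;

    Node(String name, boolean decommissioned) {
        this.name = name;
        this.decommissioned = decommissioned;
    }

    String getName() { return name; }
    boolean isDecommissioned() { return decommissioned; }
}

public class MergedCheck {
    // Sketch of the merged loop: one isDecommissioned() test drives both
    // the down-replica counter and the firstDecomNodeIndex bookkeeping.
    static int countDownReplicas(Node[] nodes, String downnode) {
        int hasdown = 0;
        int firstDecomNodeIndex = -1;
        for (int j = 0; j < nodes.length; j++) {
            if (nodes[j].isDecommissioned()) {
                if (nodes[j].getName().equals(downnode)) {
                    hasdown++;
                }
                if (firstDecomNodeIndex == -1) {
                    firstDecomNodeIndex = j;
                }
            }
        }
        return hasdown;
    }
}
```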
2. Regarding the timeout: In TestDecommission, you appropriately set
{{BLOCKREPORT_INTERVAL_MSEC}} down to 1 sec to match the {{HEARTBEAT_INTERVAL}}. You
may also want to consider {{DFS_NAMENODE_REPLICATION_INTERVAL_KEY}} (default 3
sec). Adjusting this value to 1 might allow testRecommission() to run in 5 sec
instead of 10.
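Concretely, the setup could look something like this (a sketch only; the exact {{DFSConfigKeys}} constant names and setter types should be checked against trunk before use):

```java
// Test setup sketch: bring the replication monitor down to the same
// 1-second cadence as the heartbeat and block report intervals.
conf.setLong(DFSConfigKeys.DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, BLOCKREPORT_INTERVAL_MSEC); // 1 sec
conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, HEARTBEAT_INTERVAL);               // 1 sec
conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_KEY, 1);                      // default 3 sec
```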
3. I wish there were a clear way to share the duplicated code between
{{testDecommission}} and {{testRecommission}}, but I couldn't see an obviously better
choice. Can you think of a way to improve this?
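One option, just to sketch the idea (all names below are hypothetical, not from the patch): drive both tests through a single parameterized helper, with recommission as the only varying step, and leave the assertions to each caller.

```java
public class DecommissionDriver {
    // Hypothetical seam: the cluster operations both tests need.
    interface Cluster {
        void decommissionNode(String node);
        void recommissionNode(String node);
        int liveReplicas();
    }

    // Shared body: decommission, optionally recommission, then report
    // the resulting replica count for the caller's assertions.
    static int runScenario(Cluster cluster, String node, boolean recommission) {
        cluster.decommissionNode(node);
        if (recommission) {
            cluster.recommissionNode(node);
        }
        return cluster.liveReplicas();
    }
}
```

The real tests would pass a thin adapter over MiniDFSCluster rather than this toy interface; the point is only that the decommission/verify skeleton is written once.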
FSNamesystem and BlockManager:
The implementation looks okay to me.
The failure of unit test TestTrash.testTrashEmptier does not seem to be related
to this change. It is probably related to HDFS-7326, although the symptoms
reported are slightly different.
> re-commission of a decommissioned node does not delete excess replica
> ---------------------------------------------------------------------
>
> Key: HDFS-2114
> URL: https://issues.apache.org/jira/browse/HDFS-2114
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: John George
> Assignee: John George
> Attachments: HDFS-2114-2.patch, HDFS-2114-3.patch, HDFS-2114.patch
>
>
> If a decommissioned node is removed from the decommissioned list, namenode
> does not delete the excess replicas it created while the node was
> decommissioned.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira