[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

Tsz Wo (Nicholas), SZE (JIRA) Mon, 16 May 2011 17:01:32 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034436#comment-13034436
 ]


Tsz Wo (Nicholas), SZE commented on HDFS-1332:
----------------------------------------------

- Just found one problem: in the two {{chooseRandom(..)}} methods (lines 
333-363 and lines 378-422), if the first {{FSNamesystem.LOG.isDebugEnabled()}} 
returns false and the second {{FSNamesystem.LOG.isDebugEnabled()}} returns 
true, we will get a {{NullPointerException}} since {{builder}} is null.  The 
second if-statement should also check {{builder}} for null.

- "Not able to place enough replicas" is repeated in the log message.  I 
think we could use {{new NotEnoughReplicasException(detail)}} and then 
{{"\n" + e}} instead of {{"\n" + e.getMessage()}} in {{LOG.warn(..)}}.  Then 
the log will become something like
{noformat}
2011-05-16 16:37:04,230 WARN  namenode.FSNamesystem 
(BlockPlacementPolicyDefault.java:chooseTarget(212)) - Not able to place enough 
replicas, still in need of 1 to reach 2
NotEnoughReplicasException: [127.0.0.1:49864: Node 
/default-rack/127.0.0.1:49864 is not chosen because the node is (being) 
decommissioned ]
{noformat}
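
To illustrate both points, here is a minimal, self-contained sketch. It is not 
the patch code: {{PlacementLogSketch}}, {{collectReasons}} and {{warnText}} are 
hypothetical names, a {{BooleanSupplier}} stands in for 
{{FSNamesystem.LOG.isDebugEnabled()}}, and the exception class is a local 
stand-in. It assumes the exception message carries only the per-node detail.

```java
import java.util.function.BooleanSupplier;

/** Local stand-in for the exception discussed above (hypothetical definition). */
class NotEnoughReplicasException extends Exception {
    NotEnoughReplicasException(String detail) {
        super(detail);
    }
}

class PlacementLogSketch {

    /**
     * Mimics the shape of chooseRandom(..): the debug flag is read twice and
     * may change between the two reads, so the second read must not assume
     * that the builder was created.
     */
    static String collectReasons(BooleanSupplier debugEnabled, String[] reasons) {
        StringBuilder builder = null;
        if (debugEnabled.getAsBoolean()) {            // first check
            builder = new StringBuilder("[");
        }
        for (String reason : reasons) {
            if (builder != null) {
                builder.append(reason).append(' ');
            }
        }
        // Second check: guard on builder too, otherwise a flag that flipped
        // from false to true would cause a NullPointerException here.
        if (debugEnabled.getAsBoolean() && builder != null) {
            return builder.append("]").toString();
        }
        return "";
    }

    /**
     * Builds the warning text with "\n" + e rather than "\n" + e.getMessage(),
     * so the summary appears once and the exception class name labels the detail.
     */
    static String warnText(String summary, NotEnoughReplicasException e) {
        return summary + "\n" + e;
    }

    public static void main(String[] args) {
        // Flag is false on the first read and true on the second.
        int[] calls = {0};
        BooleanSupplier flipping = () -> calls[0]++ > 0;
        String reasons = collectReasons(flipping, new String[] {"some reason"});
        System.out.println("reasons with flipping flag: \"" + reasons + "\"");

        String detail = "[127.0.0.1:49864: Node /default-rack/127.0.0.1:49864 "
            + "is not chosen because the node is (being) decommissioned ]";
        System.out.println(warnText(
            "Not able to place enough replicas, still in need of 1 to reach 2",
            new NotEnoughReplicasException(detail)));
    }
}
```

{{Throwable.toString()}} returns the class name (fully qualified in real code) 
followed by ": " and the message, which is where the 
"NotEnoughReplicasException: [...]" line in the log above would come from.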

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1332
>                 URL: https://issues.apache.org/jira/browse/HDFS-1332
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Todd Lipcon
>            Assignee: Ted Yu
>            Priority: Minor
>              Labels: newbie
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1332-concise.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues in pseudo-distributed mode 
> (e.g. because their data dir is on /tmp and /tmp is full). Right now it's 
> very difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
