[jira] [Commented] (HDFS-7815) Loop on 'blocks does not belong to any file'

Chris Nauroth (JIRA) Thu, 19 Mar 2015 11:14:12 -0700

    [ https://issues.apache.org/jira/browse/HDFS-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369836#comment-14369836 ]


Chris Nauroth commented on HDFS-7815:
-------------------------------------

Hi, [~frha].  You can add this line to your log4j.properties to suppress the 
block state change logging:

{code}
log4j.logger.BlockStateChange=WARN
{code}

However, Apache Hadoop 2.6.0 has a bug that accidentally changed the routing 
of these log messages.  This was fixed in HDFS-7425, so later versions don't 
have this problem.  If you're running a distro based on 2.6.0 and the line 
above doesn't work, then you can use this instead:

{code}
log4j.logger.org.apache.hadoop.hdfs.StateChange=WARN
{code}
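
If it is unclear which of the two cases applies to a given cluster, a minimal 
sketch of a log4j.properties fragment that sets both loggers might look like 
the following.  It uses only the two logger names already given in this 
comment.  It assumes, without having been checked against this cluster, that 
setting a level on a logger that receives no messages has no effect, and that 
the second line may also quiet other messages sent to that logger.

{code}
# Sketch only; both logger names are taken from this comment.
# Normal routing of the block state change messages.
log4j.logger.BlockStateChange=WARN
# Workaround for distros based on Apache Hadoop 2.6.0 (before HDFS-7425).
# Assumption: this may also suppress other messages sent to this logger.
log4j.logger.org.apache.hadoop.hdfs.StateChange=WARN
{code}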


> Loop on 'blocks does not belong to any file'
> --------------------------------------------
>
>                 Key: HDFS-7815
>                 URL: https://issues.apache.org/jira/browse/HDFS-7815
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>         Environment: small cluster on Red Hat. 2 namenodes (HA), 6 datanodes 
> with 19TB disk for hdfs.
>            Reporter: Frode Halvorsen
>
> I am currently experiencing a looping situation.
> The namenode takes approximately 1:50 (min:sec) to log a massive number of 
> lines stating that some blocks don't belong to any file. During this time, 
> it's unresponsive to any requests from datanodes, and if ZooKeeper had been 
> running, it would have taken the namenode down (ssh-fencing: kill).
> When it has finished the 'round', it starts to do some normal work, including 
> telling the datanode to delete the blocks. But before the datanode has gotten 
> around to deleting the blocks and is about to report back to the namenode, 
> the namenode has started on the next round of reporting the same blocks that 
> don't belong to any file. Thus, the datanode gets a timeout when reporting 
> block updates for the deleted blocks. And this, of course, repeats itself 
> over and over again...
> There are actually two issues, I think:
> 1 - the namenode becomes totally unresponsive when reporting the blocks 
> (could this be a DEBUG line instead of an INFO line?)
> 2 - the namenode seems to 'forget' that it has already reported those blocks 
> just 2-3 minutes ago...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
