[jira] [Commented] (HDFS-7815) Loop on 'blocks does not belong to any file'

Chris Nauroth (JIRA) Fri, 20 Feb 2015 09:47:20 -0800

    [ https://issues.apache.org/jira/browse/HDFS-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329249#comment-14329249 ]


Chris Nauroth commented on HDFS-7815:
-------------------------------------

Hello, [~frha].

This bug was fixed in HDFS-7503 by moving this logging outside of the 
namesystem write lock, so even if there is a large volume of this logging, 
other NameNode threads can still make progress.  The fix is targeted to Apache 
Hadoop 2.6.1 and 2.7.0, both still awaiting release.  In the meantime, a known 
workaround is to edit log4j.properties to tune down the logger level to WARN.  
Of course, this will have the side effect of suppressing these log messages 
entirely.
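
For readers unfamiliar with the pattern, here is a minimal Java sketch of the general idea of moving logging outside a write lock. It is not the HDFS-7503 patch: the class name, the method names, and the in-memory list standing in for the log appender are all hypothetical and chosen only for illustration. Messages are collected while the lock is held and emitted only after it is released, so slow log I/O no longer blocks other threads waiting on the lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; not taken from the Hadoop source.
public class DeferredLoggingSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Stands in for a real (potentially slow) log appender.
    private final List<String> emitted =
        Collections.synchronizedList(new ArrayList<String>());

    // Refuses to run while the caller still holds the write lock,
    // which makes the "log after unlock" rule checkable.
    private void log(String message) {
        if (lock.isWriteLockedByCurrentThread()) {
            throw new IllegalStateException("logging while holding the write lock");
        }
        emitted.add(message);
    }

    public int processOrphanBlocks(List<String> blockIds) {
        List<String> pending = new ArrayList<String>();
        lock.writeLock().lock();
        try {
            for (String id : blockIds) {
                // State changes that need the lock would happen here.
                // Only build the message; do not write it yet.
                pending.add("block " + id + " does not belong to any file");
            }
        } finally {
            lock.writeLock().unlock();
        }
        // The lock is released, so other threads can proceed while we log.
        for (String message : pending) {
            log(message);
        }
        return pending.size();
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        DeferredLoggingSketch sketch = new DeferredLoggingSketch();
        int count = sketch.processOrphanBlocks(Arrays.asList("blk_1", "blk_2"));
        System.out.println(count + " messages logged after the lock was released");
    }
}
```

The trade-off in this sketch is memory: the pending messages are buffered for the duration of the critical section, in exchange for a shorter hold time on the write lock.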

I'm resolving this issue as duplicate.

> Loop on 'blocks does not belong to any file'
> --------------------------------------------
>
>                 Key: HDFS-7815
>                 URL: https://issues.apache.org/jira/browse/HDFS-7815
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>         Environment: small cluster on Red Hat. 2 namenodes (HA), 6 datanodes 
> with 19TB disk for hdfs.
>            Reporter: Frode Halvorsen
>
> I am currently experiencing a looping situation.
> The namenode takes approx. 1:50 (min:sec) to log a massive number of lines 
> stating that some blocks don't belong to any file. During this time, it's 
> unresponsive to any requests from datanodes, and if ZooKeeper had been 
> running, it would have taken the namenode down (ssh-fencing: kill).
> When it has finished the 'round', it starts to do some normal work, including 
> telling the datanode to delete the blocks. But before the datanode has 
> gotten around to deleting the blocks and is about to report back to the 
> namenode, the namenode has started on the next round of reporting the same 
> blocks that don't belong to any file. Thus, the datanode gets a timeout 
> when reporting block updates for the deleted blocks. And this, of course, 
> repeats itself over and over again...
> There are actually two issues, I think:
> 1 - the namenode becomes totally unresponsive when reporting the blocks 
> (could this be a DEBUG line instead of an INFO line?)
> 2 - the namenode seems to 'forget' that it has already reported those blocks 
> just 2-3 minutes ago...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
