[ 
https://issues.apache.org/jira/browse/HDFS-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384225#comment-14384225
 ] 

Chris Nauroth commented on HDFS-8000:
-------------------------------------

Hi [~brahmareddy].  I looked at the jstack output.  A lot of threads are 
blocked on monitor 0x00007f8a10f1ce18.  See below for the stack trace of the 
thread that currently holds that lock.

It appears that you have debug logging enabled for block state changes.  Is 
that right?  If so, then this might be sluggish logging rather than a true 
deadlock.  Perhaps the logging could be moved outside of the lock, similar to 
what we've done in other recent patches.

{code}
"org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@65cd4995"
 daemon prio=10 tid=0x00007f85d4119000 nid=0x2b8e5 runnable [0x00007f85daa6e000]
   java.lang.Thread.State: RUNNABLE
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:345)
        at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
        at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
        at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
        at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
        - locked <0x00007f88f906e410> (a java.io.OutputStreamWriter)
        at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
        at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
        at 
org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        - locked <0x00007f8a11213588> (a org.apache.log4j.RollingFileAppender)
        at 
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        - locked <0x00007f8a1100bf20> (a org.apache.log4j.spi.RootLogger)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.slf4j.impl.Log4jLoggerAdapter.debug(Log4jLoggerAdapter.java:209)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.UnderReplicatedBlocks.update(UnderReplicatedBlocks.java:291)
        - locked <0x00007f8a10f1ce18> (a 
org.apache.hadoop.hdfs.server.blockmanagement.UnderReplicatedBlocks)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.updateNeededReplications(BlockManager.java:3292)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeStoredBlock(BlockManager.java:2948)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlocksAssociatedTo(BlockManager.java:1057)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.removeDatanode(DatanodeManager.java:535)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.removeDeadDatanode(DatanodeManager.java:577)
        - locked <0x00007f8a11b9fa68> (a java.util.TreeMap)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager.heartbeatCheck(HeartbeatManager.java:323)
        - locked <0x00007f8a11340348> (a 
org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager)
        at 
org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor.run(HeartbeatManager.java:358)
        at java.lang.Thread.run(Thread.java:745)
{code}

> Potential deadlock #HeartbeatManager
> ------------------------------------
>
>                 Key: HDFS-8000
>                 URL: https://issues.apache.org/jira/browse/HDFS-8000
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>         Attachments: NNtd.out
>
>
> Cluster loaded with 90000000+ Blocks
> Restart DN
>  access NN UI..Then NN will  got Hang
> Will attach td..



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to