[ https://issues.apache.org/jira/browse/HDFS-102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harsh J resolved HDFS-102.
--------------------------
    Resolution: Cannot Reproduce

This has gone stale. The current structure within BlockManager isn't a list anymore, and we haven't seen this kind of behavior in quite a while.

> high cpu usage in ReplicationMonitor thread
> --------------------------------------------
>
>                 Key: HDFS-102
>                 URL: https://issues.apache.org/jira/browse/HDFS-102
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>
> We had a namenode stuck at 99% CPU, and it was showing slow response times.
> (dfs.namenode.handler.count was still set to 10.)
> The ReplicationMonitor thread was using the most CPU time.
> jstack showed:
> "org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor@1c7b0f4d" daemon prio=10 tid=0x0000002d90690800 nid=0x4855 runnable [0x0000000041941000..0x0000000041941b30]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.AbstractList$Itr.remove(AbstractList.java:360)
>         at org.apache.hadoop.dfs.FSNamesystem.blocksToInvalidate(FSNamesystem.java:2475)
>         - locked <0x0000002a9f522038> (a org.apache.hadoop.dfs.FSNamesystem)
>         at org.apache.hadoop.dfs.FSNamesystem.computeDatanodeWork(FSNamesystem.java:1775)
>         at org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor.run(FSNamesystem.java:1713)
>         at java.lang.Thread.run(Thread.java:619)
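For context on why the old list-based structure could pin a core: the hot frame is AbstractList$Itr.remove() inside blocksToInvalidate(), and removing through an ArrayList iterator shifts the remaining tail on every call, so draining a large invalidate list that way is quadratic. The snippet below is not HDFS code, just a minimal standalone Java sketch (class and method names are made up for illustration) of the cost difference between draining an ArrayList and a hash-based set via Iterator.remove():

import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Illustrative microbenchmark only: drains elements with Iterator.remove(),
// the pattern visible in the blocksToInvalidate() frame above.
public class IteratorRemoveCost {

    // Remove up to 'limit' elements through the collection's iterator.
    // On an ArrayList each remove() shifts the tail (O(n) per call);
    // on a HashSet it is roughly constant time per call.
    private static long drainNanos(Iterable<Integer> collection, int limit) {
        long start = System.nanoTime();
        int removed = 0;
        Iterator<Integer> it = collection.iterator();
        while (it.hasNext() && removed < limit) {
            it.next();
            it.remove();
            removed++;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        final int size = 100_000;
        List<Integer> list = new ArrayList<>();
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < size; i++) {
            list.add(i);
            set.add(i);
        }
        System.out.printf("ArrayList drain: %d ms%n", drainNanos(list, size) / 1_000_000);
        System.out.printf("HashSet drain:   %d ms%n", drainNanos(set, size) / 1_000_000);
    }
}

With a list of this size the ArrayList drain burns seconds of pure CPU while the set-based drain finishes almost immediately, which is consistent with the ReplicationMonitor spinning at 99% on a large invalidate list and with the resolution note that the current BlockManager structures no longer use a plain list for this.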