JichengSong created HDFS-7592:
---------------------------------
Summary: A bug in BlocksMap that causes a NameNode memory leak.
Key: HDFS-7592
URL: https://issues.apache.org/jira/browse/HDFS-7592
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 0.21.0
Environment: HDFS-0.21.0
Reporter: JichengSong
Assignee: JichengSong
In our HDFS production environment, the NameNode runs into frequent full GCs (FGC)
after about 2 months of uptime, and we have to restart it manually.
We dumped the NameNode's heap to collect object statistics.
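(The histograms below are in the JDK class-histogram format, presumably collected
with something like: jmap -histo <NameNode-pid>)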
Before restarting the NameNode:
 num     #instances         #bytes  class name
----------------------------------------------
   1:      59262275     3613989480  [Ljava.lang.Object;
...
  10:       8549361      615553992  org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
  11:       5941511      427788792  org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
After restarting the NameNode:
 num     #instances         #bytes  class name
----------------------------------------------
   1:      44188391     2934099616  [Ljava.lang.Object;
...
  23:        721763       51966936  org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction
  24:        620028       44642016  org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
We found that the number of BlockInfoUnderConstruction instances was abnormally
large before the restart. As we know, BlockInfoUnderConstruction keeps a block's
state while its file is being written, but the write load on our cluster is far
below a million writes per second, nowhere near enough to account for millions of
such objects. We think there is a memory leak in the NameNode.
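For reference, here is a minimal, self-contained Java sketch of the mechanism we
suspect, assuming the BlocksMap is backed by a java.util.HashMap-like structure.
The SimpleBlock class below is hypothetical, standing in for
BlockInfo/BlockInfoUnderConstruction: put() with an equal key replaces only the
value, while the map keeps holding the original key object, so the old
under-construction instance stays strongly reachable.

import java.util.HashMap;
import java.util.Map;

public class KeyRetentionDemo {
    // Hypothetical stand-in for BlockInfo: equality is by block ID only,
    // so blocks with the same ID compare equal across states.
    static class SimpleBlock {
        final long blockId;
        final String state;
        SimpleBlock(long blockId, String state) {
            this.blockId = blockId;
            this.state = state;
        }
        @Override public boolean equals(Object o) {
            return o instanceof SimpleBlock && ((SimpleBlock) o).blockId == blockId;
        }
        @Override public int hashCode() {
            return (int) (blockId ^ (blockId >>> 32));
        }
    }

    public static void main(String[] args) {
        Map<SimpleBlock, SimpleBlock> map = new HashMap<SimpleBlock, SimpleBlock>();
        SimpleBlock underConstruction = new SimpleBlock(1L, "UNDER_CONSTRUCTION");
        map.put(underConstruction, underConstruction);

        // Complete the block and put it back under an equal key.
        SimpleBlock completed = new SimpleBlock(1L, "COMPLETE");
        map.put(completed, completed);

        // The VALUE was replaced, but the KEY is still the old object:
        // the under-construction instance is never released by the map.
        System.out.println("value: " + map.get(completed).state);             // COMPLETE
        System.out.println("key:   " + map.keySet().iterator().next().state); // UNDER_CONSTRUCTION
    }
}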
We fixed the bug with the following patch.
--- src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java (revision 1640066)
+++ src/java/org/apache/hadoop/hdfs/server/namenode/BlocksMap.java (working copy)
@@ -205,6 +205,8 @@
       DatanodeDescriptor dn = currentBlock.getDatanode(idx);
       dn.replaceBlock(currentBlock, newBlock);
     }
+    // remove the stale entry so the old key object is not retained (memory leak fix)
+    map.remove(newBlock);
     // replace block in the map itself
     map.put(newBlock, newBlock);
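With the extra remove(), the stale entry whose key is the old
BlockInfoUnderConstruction (which compares equal to newBlock by block ID) is
dropped first, so the subsequent put() installs newBlock as both key and value,
and the old object becomes eligible for garbage collection. put() alone replaces
only the value, never the key.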
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)