[
https://issues.apache.org/jira/browse/HADOOP-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618465#action_12618465
]
lohit edited comment on HADOOP-3865 at 7/30/08 11:48 AM:
--------------------------------------------------------------------
This was seen while running the secondary namenode with a heap of 1.5G and an image
of 300MB (on disk). At each checkpoint the SN's heap usage kept increasing, and even
a forced GC did not free the space. After about 4 checkpoints the SN crashed with an
OutOfMemoryError.
{noformat}
2008-07-30 18:41:01,527 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Throwable Exception in doCheckpoint:
2008-07-30 18:41:01,528 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.HashMap.addEntry(HashMap.java:753)
        at java.util.HashMap.put(HashMap.java:385)
        at org.apache.hadoop.hdfs.server.namenode.BlocksMap.checkBlockInfo(BlocksMap.java:302)
        at org.apache.hadoop.hdfs.server.namenode.BlocksMap.addINode(BlocksMap.java:316)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addToParent(FSDirectory.java:238)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:819)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:571)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:468)
{noformat}
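The stack trace shows the merge path (loadFSImage -> addToParent -> BlocksMap.addINode) repopulating a HashMap at the moment of the OOM. One plausible failure mode, sketched below purely as an illustration (class and method names are hypothetical, not the real Hadoop code): each checkpoint builds a fresh namesystem/BlocksMap, but some long-lived structure still holds a reference to every previous instance, so roughly one full image's worth of heap is pinned per checkpoint until GC overhead dominates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of the suspected leak pattern, NOT actual Hadoop code.
// A stand-in "blocks map" is rebuilt at each checkpoint, but a long-lived
// registry (e.g. something like a metrics holder) keeps every instance
// reachable, so none of them can ever be garbage collected.
public class CheckpointLeakSketch {
    // Long-lived structure that pins each old instance (hypothetical).
    static final List<Map<Long, String>> registry = new ArrayList<>();

    static void doCheckpoint() {
        Map<Long, String> blocksMap = new HashMap<>();
        blocksMap.put(1L, "blk_1");  // merge repopulates the fresh map
        registry.add(blocksMap);     // leak: old maps are never dropped
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            doCheckpoint();
        }
        // After 4 checkpoints, 4 full copies are still strongly reachable,
        // which is the growth pattern the jmap histograms below exhibit.
        System.out.println(registry.size());
    }
}
```

If something like this is happening, a forced GC cannot help, since every instance is still strongly reachable, which matches the observation above.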
At each checkpoint, I see that the FSNamesystem object count increments and the old
instances are never released.
{noformat}
[lohit@ ~]$ jmap -histo:live 2057 | grep FSName
254: 1 288 org.apache.hadoop.hdfs.server.namenode.FSNamesystem
550: 1 56 org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics
881: 1 16 org.apache.hadoop.hdfs.server.namenode.FSNamesystem$1
[lohit@ ~]$ jmap -histo:live 2057 | grep FSName
193: 2 576 org.apache.hadoop.hdfs.server.namenode.FSNamesystem
419: 2 112 org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics
879: 1 16 org.apache.hadoop.hdfs.server.namenode.FSNamesystem$1
[lohit@ ~]$ jmap -histo:live 2057 | grep FSName
154: 3 864 org.apache.hadoop.hdfs.server.namenode.FSNamesystem
343: 3 168 org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics
891: 1 16 org.apache.hadoop.hdfs.server.namenode.FSNamesystem$1
[lohit@ ~]$ jmap -histo:live 2057 | grep FSName
122: 4 1152 org.apache.hadoop.hdfs.server.namenode.FSNamesystem
294: 4 224 org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics
860: 1 16 org.apache.hadoop.hdfs.server.namenode.FSNamesystem$1
[lohit@ ~]$
{noformat}
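Since `jmap -histo:live` forces a full GC before counting, instances that survive it are strongly reachable, not merely uncollected. That points toward dropping (or reusing) the previous instance rather than tuning GC. The sketch below illustrates the reuse idea under the same hypothetical names as above; it is one possible shape of a fix, not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of one possible fix, NOT actual Hadoop code: keep a single
// long-lived map and clear it before each merge, instead of allocating a new
// one each time while something else still references the old instances.
public class CheckpointReuseSketch {
    static final Map<Long, String> blocksMap = new HashMap<>();

    static void doCheckpoint() {
        blocksMap.clear();           // release old entries before reloading
        blocksMap.put(1L, "blk_1");  // reload the image into the same instance
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            doCheckpoint();
        }
        // Only one copy is ever live, so a jmap histogram would show a
        // constant instance count across checkpoints.
        System.out.println(blocksMap.size());
    }
}
```

An equivalent alternative would be to null out the old reference before constructing the next instance, so at most one extra copy is live during the merge.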
> SecondaryNameNode runs out of memory
> ------------------------------------
>
> Key: HADOOP-3865
> URL: https://issues.apache.org/jira/browse/HADOOP-3865
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.0
> Reporter: Lohit Vijayarenu
> Fix For: 0.18.0
>
>
> SecondaryNameNode has a memory leak. If the secondary namenode is left running
> for a while, performing several checkpoints, it runs out of heap.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.