[ https://issues.apache.org/jira/browse/HDFS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852535#action_12852535 ]
Hairong Kuang commented on HDFS-1002:
-------------------------------------
I took a look at the image & edits that Carlos provided at HDFS-686. They clearly
indicate that some edit entries are missing. The missing parent directory
/fields/0001/20100325_1200/c1b1301_wrep_o_12_pp_fc_tp is not in the image, and
no other entry in edits contains this directory.
In this case, although addChildNPE.patch avoids the crash, it does not help get
the missing directory back.
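To illustrate the point, here is a minimal sketch of a replay loop with a null check on the parent. It is a toy model and not the code in addChildNPE.patch or FSDirectory: the class ReplaySketch, its methods, and the paths used are all hypothetical. It only shows that skipping a record whose parent is missing prevents the NPE but leaves the namespace without that directory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy namespace for illustration only; not Hadoop's FSDirectory.
class ReplaySketch {
    // Maps each known directory path to the names of its children.
    private final Map<String, List<String>> dirs = new HashMap<>();
    // Records that could not be applied because their parent was missing.
    private final List<String> skipped = new ArrayList<>();

    ReplaySketch() {
        dirs.put("/", new ArrayList<>());
    }

    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    private static String nameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Attaches a child to its parent, or records the path as skipped.
    private boolean addChild(String path) {
        List<String> siblings = dirs.get(parentOf(path));
        if (siblings == null) {
            // Without this check, siblings.add(...) would throw a
            // NullPointerException, which is the kind of crash in the report.
            skipped.add(path);
            return false;
        }
        siblings.add(nameOf(path));
        return true;
    }

    // Replays a "create directory" record.
    boolean replayMkdir(String path) {
        if (!addChild(path)) {
            return false;
        }
        dirs.put(path, new ArrayList<>());
        return true;
    }

    // Replays an "add file" record.
    boolean replayAddFile(String path) {
        return addChild(path);
    }

    boolean hasDirectory(String path) {
        return dirs.containsKey(path);
    }

    List<String> skipped() {
        return skipped;
    }

    public static void main(String[] args) {
        ReplaySketch ns = new ReplaySketch();
        ns.replayMkdir("/fields");
        // The record that would have created /fields/0001 is absent,
        // so the next record has no parent to attach to.
        boolean applied = ns.replayAddFile("/fields/0001/part-0");
        System.out.println("applied=" + applied + " skipped=" + ns.skipped());
        System.out.println("has /fields/0001: " + ns.hasDirectory("/fields/0001"));
    }
}
```

In this sketch the replay finishes, but /fields/0001 and the file under it never appear in the namespace. The skipped list shows which records were dropped; recovering them would need the missing edit entries themselves.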
> Secondary Name Node crash, NPE in edit log replay
> -------------------------------------------------
>
> Key: HDFS-1002
> URL: https://issues.apache.org/jira/browse/HDFS-1002
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.21.0
> Reporter: ryan rawson
> Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: addChildNPE.patch, snn_crash.tar.gz, snn_log.txt
>
>
> An NPE in SNN; the core of the message looks like this:
> 2010-02-25 11:54:05,834 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1152)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1164)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1067)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:213)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:511)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:401)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:368)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1172)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:594)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:476)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:353)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:317)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:219)
>     at java.lang.Thread.run(Thread.java:619)
> This happens even if I restart SNN over and over again.