[ https://issues.apache.org/jira/browse/HDFS-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038957#comment-13038957 ]
ramkrishna.s.vasudevan commented on HDFS-1981:
----------------------------------------------
Writing a unit test for this may be difficult, because the scenario is hard to reproduce.
The steps that I followed to reproduce this issue are:
1. Start the namenode (NN) and the backup namenode (BNN).
2. Allow checkpointing to happen such that the edits.new file is
created on the namenode.
3. At this point, kill the NN and BNN.
4. Now start the NN and BNN again.
5. When checkpointing starts again, we get the exception quoted in the issue description.
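The wording of that exception suggests that the namenode compares its current edit log timestamp with the one recorded when the checkpoint was started, and rejects the upload on a mismatch. A rough model of such a check is sketched below. The class, the method names and the shortened message are hypothetical and are not taken from the Hadoop source.

```java
import java.io.IOException;

public class CheckpointTimestampSketch {

    // Hypothetical check: reject a checkpoint whose recorded edit log time
    // differs from the edit log time the namenode currently has.
    static void validateUpload(long namenodeEditsTime, long checkpointEditsTime)
            throws IOException {
        if (namenodeEditsTime != checkpointEditsTime) {
            throw new IOException("Namenode edit log timestamp " + namenodeEditsTime
                    + " does not match checkpoint edit log timestamp "
                    + checkpointEditsTime + ". Checkpoint Aborted.");
        }
    }

    // Convenience wrapper: true if the upload would be accepted.
    static boolean accepted(long namenodeEditsTime, long checkpointEditsTime) {
        try {
            validateUpload(namenodeEditsTime, checkpointEditsTime);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Same timestamp on both sides: accepted.
        System.out.println(accepted(1000L, 1000L));
        // Edit log changed after the checkpoint began: rejected.
        System.out.println(accepted(2000L, 1000L));
    }
}
```

In this model, anything that changes the namenode's edit log timestamp after the checkpoint has started makes the upload fail, which matches the symptom seen in step 5.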
The exact problem is in the loadFSEdits() API in FSImage.java.
If loadFSEdits() returns 0, the condition below evaluates to false:
  if (fsImage.recoverTransitionRead(dataDirs, editsDirs, startOpt)) {
    fsImage.saveNamespace(true);
  }
and so saveNamespace() is not invoked.
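The control flow described here can be modelled in isolation. The sketch below is a simplified stand-in, not the real FSImage class: ImageModel and startUp are hypothetical names, and it assumes that recoverTransitionRead() reports "save needed" only when loadFSEdits() replayed at least one edit. The real method may take other conditions into account.

```java
public class SaveNamespaceSketch {

    /** Hypothetical stand-in for FSImage; models only the save decision. */
    static class ImageModel {
        private final int editsLoaded;
        int saveCount = 0;

        ImageModel(int editsLoaded) {
            this.editsLoaded = editsLoaded;
        }

        // Stand-in for loadFSEdits(): number of edits replayed at startup.
        int loadFSEdits() {
            return editsLoaded;
        }

        // Assumption: a save is requested only if some edits were replayed.
        boolean recoverTransitionRead() {
            return loadFSEdits() > 0;
        }

        // Stand-in for saveNamespace(); only counts the invocations.
        void saveNamespace(boolean renewCheckpointTime) {
            saveCount++;
        }
    }

    // Mirrors the snippet quoted in the comment.
    static int startUp(ImageModel image) {
        if (image.recoverTransitionRead()) {
            image.saveNamespace(true);
        }
        return image.saveCount;
    }

    public static void main(String[] args) {
        // No edits replayed (for example an empty edits.new): no save.
        System.out.println(startUp(new ImageModel(0)));
        // Some edits replayed: the namespace is saved once.
        System.out.println(startUp(new ImageModel(3)));
    }
}
```

Under this assumption, a restart that finds nothing to replay skips saveNamespace() entirely.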
Kindly correct me if you find any problems in this.
> When namenode goes down while checkpointing and if is started again
> subsequent Checkpointing is always failing
> --------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-1981
> URL: https://issues.apache.org/jira/browse/HDFS-1981
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.0
> Environment: Linux
> Reporter: ramkrishna.s.vasudevan
> Fix For: 0.23.0
>
>
> This scenario applies to the NN and BNN case.
> When the namenode goes down after creating edits.new, then on the subsequent
> restart divertFileStreams does not divert to edits.new, because the edits.new
> file is already present and its size is zero.
> So an exception occurs on trying to saveCheckPoint:
> 2011-05-23 16:38:57,476 WARN org.mortbay.log: /getimage: java.io.IOException:
> GetImage failed. java.io.IOException: Namenode has an edit log with timestamp
> of 2011-05-23 16:38:56 but new checkpoint was created using editlog with
> timestamp 2011-05-23 16:37:30. Checkpoint Aborted.
> Is this a bug, or is this the expected behaviour?