[ https://issues.apache.org/jira/browse/HDFS-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron T. Myers updated HDFS-5159:
---------------------------------
Attachment: HDFS-5159.patch
Here's a patch that addresses the issue by changing the 2NN to decide whether
to load the fsimage from disk based on its _in-memory_ state rather than its
_on-disk_ state.
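
To illustrate the difference between the two rules, here is a minimal,
self-contained sketch. It is not the patch itself and does not use the real
2NN classes: the class name, fields, and methods are all hypothetical, and the
fsimage is reduced to a single transaction ID. It simulates a first checkpoint
attempt that fails between download and load, followed by a retry.

```java
public class CheckpointSketch {
    // Hypothetical stand-ins for 2NN state; these are not real Hadoop fields.
    static long onDiskTxId;    // version of the fsimage file in local storage
    static long inMemoryTxId;  // version of the fsimage currently loaded in memory

    /** One checkpoint attempt; returns true if the 2NN ends up with the NN's image loaded. */
    static boolean attempt(long nnTxId, boolean useInMemoryState, boolean failBeforeLoad) {
        // Download only when the local copy differs from the NN's version.
        boolean downloaded = onDiskTxId != nnTxId;
        if (downloaded) {
            onDiskTxId = nnTxId;
        }
        if (failBeforeLoad) {
            return false; // simulated error between download and load
        }
        // Assumed old rule: reload only if the on-disk image changed in this attempt.
        // Assumed new rule: reload whenever the in-memory image is stale.
        boolean reload = useInMemoryState ? inMemoryTxId != nnTxId : downloaded;
        if (reload) {
            inMemoryTxId = onDiskTxId;
        }
        return inMemoryTxId == nnTxId;
    }

    /** First attempt fails after the download; reports whether the retry succeeds. */
    static boolean secondAttemptSucceeds(boolean useInMemoryState) {
        onDiskTxId = -1;
        inMemoryTxId = -1;
        attempt(100L, useInMemoryState, true);
        return attempt(100L, useInMemoryState, false);
    }

    public static void main(String[] args) {
        System.out.println("on-disk rule:   " + secondAttemptSucceeds(false));
        System.out.println("in-memory rule: " + secondAttemptSucceeds(true));
    }
}
```

Under these assumptions the retry never loads the image with the on-disk
rule, because the local file already matches the NN, while the in-memory rule
notices that nothing has been loaded yet and recovers.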
> Secondary NameNode fails to checkpoint if error occurs downloading edits on
> first checkpoint
> --------------------------------------------------------------------------------------------
>
> Key: HDFS-5159
> URL: https://issues.apache.org/jira/browse/HDFS-5159
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.1.0-beta
> Reporter: Aaron T. Myers
> Assignee: Aaron T. Myers
> Attachments: HDFS-5159.patch
>
>
> The 2NN will avoid downloading/loading a new fsimage if its local copy of
> fsimage is the same as the version on the NN. However, the decision to *load*
> the fsimage from disk into memory is based only on the on-disk fsimage
> version. If an error occurs between downloading and loading the fsimage on
> the first checkpoint attempt, the fsimage is on disk but is never loaded into
> memory. On subsequent checkpoint attempts the on-disk version already matches
> the NN, so the 2NN skips the load again and thus never checkpoints
> successfully.
> An example error message is in the first comment of this ticket.