[ https://issues.apache.org/jira/browse/HDFS-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810331#comment-13810331 ]
Aaron T. Myers commented on HDFS-5433:
--------------------------------------
Thanks for wrapping this up for me, Andrew. Much appreciated.
Thanks also to Todd and Vinay for the reviews, and to Stephen Chu for finding
this bug.
> When reloading fsimage during checkpointing, we should clear existing
> snapshottable directories
> -----------------------------------------------------------------------------------------------
>
> Key: HDFS-5433
> URL: https://issues.apache.org/jira/browse/HDFS-5433
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.2.0
> Reporter: Aaron T. Myers
> Assignee: Aaron T. Myers
> Priority: Critical
> Fix For: 2.2.1
>
> Attachments: HDFS-5433-2.patch, HDFS-5433.patch
>
>
> The complete set of snapshottable directories is referenced both via the
> file system tree and in the SnapshotManager class. It's possible that when
> the 2NN performs a checkpoint, it will reload its in-memory state based on a
> new fsimage from the NN, but will not clear the set of snapshottable
> directories referenced by the SnapshotManager. In this case, the 2NN will
> write out an fsimage that cannot be loaded, since the integer written to the
> fsimage indicating the number of snapshottable directories will be out of
> sync with the actual number of snapshottable directories serialized to the
> fsimage.
> This is basically the same as HDFS-3835, but for snapshottable directories
> instead of delegation tokens.
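
The mismatch described above can be illustrated with a small, self-contained
sketch. This is not Hadoop code: the class CheckpointSketch and all of its
methods are hypothetical stand-ins, and it assumes only what the description
states, namely that the count written to the image comes from a separate
registry (the role SnapshotManager plays) while the serialized entries come
from the file system tree.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a checkpointing node; names are illustrative only. */
class CheckpointSketch {
    // Snapshottable directories found in the reloaded file system tree.
    private final Set<String> tree = new LinkedHashSet<>();
    // Separate registry, standing in for the set held by SnapshotManager.
    private final List<String> registry = new ArrayList<>();

    /** Reloads in-memory state from a new image; optionally clears the registry first. */
    void reload(List<String> dirsInImage, boolean clearRegistry) {
        tree.clear(); // the tree is always rebuilt from the new image
        if (clearRegistry) {
            registry.clear(); // the step the fix adds, per the issue title
        }
        for (String dir : dirsInImage) {
            tree.add(dir);
            registry.add(dir);
        }
    }

    /** The integer that would be written as the number of snapshottable directories. */
    int countHeader() {
        return registry.size();
    }

    /** The number of snapshottable directories actually serialized from the tree. */
    int serializedEntries() {
        return tree.size();
    }

    /** An image is loadable only if the written count matches the serialized entries. */
    boolean imageIsLoadable() {
        return countHeader() == serializedEntries();
    }
}
```

In this sketch, two reloads without clearing leave stale entries in the
registry, so countHeader() exceeds serializedEntries() and the resulting image
would be rejected on load. Passing clearRegistry = true keeps the two in sync.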
--
This message was sent by Atlassian JIRA
(v6.1#6144)