[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106231#comment-14106231 ]
Juan Yu commented on HDFS-6908:
-------------------------------
Thanks [~jingzhao].
Because the directory is deleted, any file created between the prior snapshot
and the snapshot being deleted must have been deleted as well, so each of those
files has a create/delete pair of operations. The file diff processing adds
such a file to the removedINodes list. When I debugged the fix, I saw that the
inode for the file was deleted correctly, with no leak. The intermediate
create/delete file change is also cleaned up once the diff is combined with the
prior one.
{code}
} else if (topNode.isFile() && topNode.asFile().isWithSnapshot()) {
  INodeFile file = topNode.asFile();
  counts.add(file.getDiffs().deleteSnapshotDiff(post, prior, file,
      collectedBlocks, removedINodes, countDiffChange));
{code}
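To make the create/delete pair argument concrete, here is a toy sketch of folding the deleted snapshot's diff into the prior one. It is not the HDFS implementation: the class SnapshotDiffSketch, its Diff type, the combinePosterior method, and the way the scenario is mapped onto name-keyed created/deleted sets are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: hypothetical names, not the HDFS DirectoryDiff code. */
public class SnapshotDiffSketch {

    /** Children created and deleted between one snapshot and the next. */
    static final class Diff {
        final Set<String> created = new LinkedHashSet<>();
        final Set<String> deleted = new LinkedHashSet<>();
    }

    /**
     * Folds the diff of the snapshot being deleted (posterior) into the diff
     * of the snapshot before it (prior). Returns the names that no remaining
     * snapshot can reach, i.e. the candidates for a removedINodes-style list.
     */
    static List<String> combinePosterior(Diff prior, Diff posterior) {
        List<String> removed = new ArrayList<>();
        for (String name : posterior.deleted) {
            if (prior.created.remove(name)) {
                // Created after the prior snapshot and deleted again:
                // the create/delete pair cancels out and the file is gone.
                removed.add(name);
            } else {
                // Existed at the prior snapshot, so it must stay reachable.
                prior.deleted.add(name);
            }
        }
        // Creations after the deleted snapshot now belong to the prior diff.
        prior.created.addAll(posterior.created);
        return removed;
    }

    public static void main(String[] args) {
        Diff s1 = new Diff();          // assumed changes after the first snapshot
        s1.created.add("file2");
        Diff s2 = new Diff();          // assumed changes after the second snapshot
        s2.deleted.add("file1");
        s2.deleted.add("file2");

        List<String> removed = combinePosterior(s1, s2);
        System.out.println("removed=" + removed
            + " created=" + s1.created + " deleted=" + s1.deleted);
        // prints: removed=[file2] created=[] deleted=[file1]
    }
}
```

In this model the file created between the two snapshots ends up only in the removed list, and the combined diff keeps no trace of the intermediate create/delete.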
> incorrect snapshot directory diff generated by snapshot deletion
> ----------------------------------------------------------------
>
> Key: HDFS-6908
> URL: https://issues.apache.org/jira/browse/HDFS-6908
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: snapshots
> Reporter: Juan Yu
> Assignee: Juan Yu
> Priority: Critical
> Attachments: HDFS-6908.001.patch
>
>
> In the following scenario, deleting a snapshot can generate an incorrect
> snapshot directory diff and a corrupted fsimage. If you restart the NN after
> that, you will get a NullPointerException.
> 1. Create a directory and create a file under it.
> 2. Take a snapshot.
> 3. Create another file under that directory.
> 4. Take a second snapshot.
> 5. Delete both files and the directory.
> 6. Delete the second snapshot.
> An incorrect directory diff will be generated.
> Restarting the NN will then throw an NPE:
> {code}
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554)
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)