[
https://issues.apache.org/jira/browse/HDFS-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896814#comment-16896814
]
Shashikant Banerjee commented on HDFS-13101:
--------------------------------------------
I have attached a patch containing a test case that reproduces the possible
fsImage corruption. The corruption occurs while deleting a snapshot and is
detected when the "saveNamespace" command is executed afterwards:
[^HDFS-13101.corruption_repro.patch]
The test fails with the following signature when the saveNamespace command is
executed:
{code:java}
2019-07-31 12:04:07,204 [IPC Server handler 0 on default port 55561] INFO namenode.FSImage (FSImage.java:saveNamespace(1143)) - Save namespace ...
2019-07-31 12:04:07,215 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-2 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImageFormatPBINode.java:serializeINodeDirectorySection(544)) - FSImageFormatPBINode#serializeINodeDirectorySection: Dangling child pointer found. Missing INode in inodeMap: id=16394; path=file1; parent=null
2019-07-31 12:04:07,215 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-1 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImageFormatPBINode.java:serializeINodeDirectorySection(544)) - FSImageFormatPBINode#serializeINodeDirectorySection: Dangling child pointer found. Missing INode in inodeMap: id=16394; path=file1; parent=null
2019-07-31 12:04:07,232 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-1 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImage.java:saveFSImage(993)) - Detected 1 errors while saving FsImage /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-1/current/fsimage_0000000000000000027
2019-07-31 12:04:07,232 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-2 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImage.java:saveFSImage(993)) - Detected 1 errors while saving FsImage /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-2/current/fsimage_0000000000000000027
2019-07-31 12:04:07,244 [IPC Server handler 0 on default port 55561] ERROR namenode.FSImage (FSImage.java:saveNamespace(1180)) - NameNode process will exit now... The saved FsImage IMAGE is potentially corrupted.
{code}
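For readers unfamiliar with this error: the "Dangling child pointer" message appears to indicate that, while the directory section was being serialized, a directory still listed a child whose inode id was no longer present in the inode map. The following is a minimal, self-contained sketch of that kind of consistency check. It is not the FSImageFormatPBINode code; the class DanglingChildCheck, the Node type, and the method findDanglingChildren are hypothetical names used only for illustration, and the ids in main are arbitrary apart from 16394, which is taken from the log.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DanglingChildCheck {
    /** Hypothetical, simplified stand-in for an inode; not the HDFS INode class. */
    static final class Node {
        final long id;
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /**
     * Walks the tree from root and returns one message per child that a
     * directory references but that is missing from the id-to-node map.
     */
    static List<String> findDanglingChildren(Node root, Map<Long, Node> inodeMap) {
        List<String> errors = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node dir = stack.pop();
            for (Node child : dir.children) {
                if (!inodeMap.containsKey(child.id)) {
                    // The child is reachable from its parent but unknown to the map.
                    errors.add("Dangling child pointer found. Missing INode in inodeMap: id="
                        + child.id + "; path=" + child.name);
                } else {
                    stack.push(child);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Node root = new Node(1L, "");
        Node dir = new Node(16386L, "dir");
        Node file1 = new Node(16394L, "file1");
        root.children.add(dir);
        dir.children.add(file1);

        Map<Long, Node> inodeMap = new HashMap<>();
        inodeMap.put(root.id, root);
        inodeMap.put(dir.id, dir);
        // file1 is deliberately left out of the map to simulate the inconsistency.

        for (String error : findDanglingChildren(root, inodeMap)) {
            System.out.println(error);
        }
    }
}
```

In this sketch the inconsistency is created by hand. In the reported bug it would presumably arise from the snapshot deletion path removing the inode from the map while a directory still references it, which is what the analysis needs to confirm.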
I will add more details after further analysis.
> Yet another fsimage corruption related to snapshot
> --------------------------------------------------
>
> Key: HDFS-13101
> URL: https://issues.apache.org/jira/browse/HDFS-13101
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Yongjun Zhang
> Assignee: Siyao Meng
> Priority: Major
> Attachments: HDFS-13101.001.patch, HDFS-13101.corruption_repro.patch
>
>
> Lately we saw a case similar to HDFS-9406, even though the HDFS-9406 fix is
> present, so it is likely another case not covered by that fix. We are
> currently trying to collect a good fsimage and edit logs to replay, in order
> to reproduce and investigate it.