[ 
https://issues.apache.org/jira/browse/HDFS-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896814#comment-16896814
 ] 

Shashikant Banerjee commented on HDFS-13101:
--------------------------------------------

Attached a patch containing a test case that reproduces the possible fsImage 
corruption. The corruption is introduced while deleting a snapshot and is 
detected when the "saveNamespace" command is executed afterwards:
 [^HDFS-13101.corruption_repro.patch]

The test fails with the following signature when the saveNamespace command is 
executed:
{code:java}
2019-07-31 12:04:07,204 [IPC Server handler 0 on default port 55561] INFO namenode.FSImage (FSImage.java:saveNamespace(1143)) - Save namespace ...
2019-07-31 12:04:07,215 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-2 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImageFormatPBINode.java:serializeINodeDirectorySection(544)) - FSImageFormatPBINode#serializeINodeDirectorySection: Dangling child pointer found. Missing INode in inodeMap: id=16394; path=file1; parent=null
2019-07-31 12:04:07,215 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-1 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImageFormatPBINode.java:serializeINodeDirectorySection(544)) - FSImageFormatPBINode#serializeINodeDirectorySection: Dangling child pointer found. Missing INode in inodeMap: id=16394; path=file1; parent=null
2019-07-31 12:04:07,232 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-1 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImage.java:saveFSImage(993)) - Detected 1 errors while saving FsImage /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-1/current/fsimage_0000000000000000027
2019-07-31 12:04:07,232 [FSImageSaver for /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-2 of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImage.java:saveFSImage(993)) - Detected 1 errors while saving FsImage /Users/sbanerjee/hadoop_commit/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/name-0-2/current/fsimage_0000000000000000027
2019-07-31 12:04:07,244 [IPC Server handler 0 on default port 55561] ERROR namenode.FSImage (FSImage.java:saveNamespace(1180)) - NameNode process will exit now... The saved FsImage IMAGE is potentially corrupted.
{code}
I will add more details after further analysis.
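
As a rough illustration of what the "Dangling child pointer" error in the log 
indicates, here is a minimal, self-contained sketch. It is a toy model, not the 
actual FSImageFormatPBINode code: the class ToyINode, the method 
findDanglingChildren, and the directory id 16393 are hypothetical names made up 
for this example. Only the inode id 16394 and the name file1 come from the log. 
The idea is that a directory still lists a child whose id is no longer present 
in the inode map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DanglingChildCheck {
    /** Toy stand-in for an inode (hypothetical, not the HDFS INode class). */
    static final class ToyINode {
        final long id;
        final String name;
        // Ids of children; only meaningful when this inode models a directory.
        final List<Long> childIds = new ArrayList<>();

        ToyINode(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Returns ids that some inode lists as a child but that are absent from inodeMap. */
    static List<Long> findDanglingChildren(Map<Long, ToyINode> inodeMap) {
        List<Long> dangling = new ArrayList<>();
        for (ToyINode dir : inodeMap.values()) {
            for (long childId : dir.childIds) {
                if (!inodeMap.containsKey(childId)) {
                    dangling.add(childId);
                }
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        Map<Long, ToyINode> inodeMap = new HashMap<>();
        ToyINode dir = new ToyINode(16393L, "dir1");   // hypothetical parent directory
        ToyINode file = new ToyINode(16394L, "file1"); // id and name taken from the log
        dir.childIds.add(file.id);
        inodeMap.put(dir.id, dir);
        inodeMap.put(file.id, file);

        // Consistent state: every listed child is in the map.
        System.out.println(findDanglingChildren(inodeMap));

        // Inconsistent state: the child is dropped from the map
        // while the directory still references it.
        inodeMap.remove(file.id);
        System.out.println(findDanglingChildren(inodeMap));
    }
}
```

In this toy model the second call reports id 16394, which mirrors the 
"Missing INode in inodeMap: id=16394" message. How the real namespace reaches 
that state during snapshot deletion is not shown here.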

> Yet another fsimage corruption related to snapshot
> --------------------------------------------------
>
>                 Key: HDFS-13101
>                 URL: https://issues.apache.org/jira/browse/HDFS-13101
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>            Assignee: Siyao Meng
>            Priority: Major
>         Attachments: HDFS-13101.001.patch, HDFS-13101.corruption_repro.patch
>
>
> Lately we saw a case similar to HDFS-9406, even though the HDFS-9406 fix is 
> present, so it is likely another case not covered by that fix. We are 
> currently trying to collect a good fsimage + editlogs to replay, so that we 
> can reproduce and investigate it. 



