[jira] [Commented] (HDFS-5428) under construction files deletion after snapshot+checkpoint+nn restart leads nn safemode

Vinay (JIRA) Thu, 07 Nov 2013 02:27:07 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815821#comment-13815821
 ]


Vinay commented on HDFS-5428:
-----------------------------

bq. We will replace the whole Inode if it is in normal path. 
Here we replace the whole INode only if it is under construction. What if the 
same file has been closed and is present in some other path?
bq.  Another option here is that we replace the inode for all the cases. To 
cover the challenge that we cannot get the full snapshot path, we can use the 
inode id to get the inode first, then scan the diff list of its parent to do 
the replacement. This will be inefficient but might be ok in case that we do 
not have a lot of snapshots and inodeUC.
To what level can we scan? And how can we find all the previous locations of 
the inode? The same INode might have been renamed to different locations in 
snapshots.
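
To make the concern concrete, here is a minimal sketch in Java. It is a toy 
model, not Hadoop code: ToyFile, ToyDir and DiffScan are hypothetical names, 
and each snapshot diff is assumed to be a plain map from child name to a saved 
copy. It only shows that scanning the diff list of the current parent can miss 
a copy saved under an earlier location.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not Hadoop's INode types.
class ToyFile {
    final long id;
    final boolean underConstruction;

    ToyFile(long id, boolean underConstruction) {
        this.id = id;
        this.underConstruction = underConstruction;
    }
}

class ToyDir {
    final String name;
    // One map per snapshot diff: child name -> copy saved for that snapshot.
    final List<Map<String, ToyFile>> diffs = new ArrayList<>();

    ToyDir(String name) {
        this.name = name;
    }
}

class DiffScan {
    // Replaces every saved copy with the given id in the diff list of ONE
    // directory and returns how many copies were replaced.
    static int replaceInDiffs(ToyDir dir, long inodeId, ToyFile replacement) {
        int replaced = 0;
        for (Map<String, ToyFile> diff : dir.diffs) {
            for (Map.Entry<String, ToyFile> e : diff.entrySet()) {
                if (e.getValue().id == inodeId) {
                    e.setValue(replacement);
                    replaced++;
                }
            }
        }
        return replaced;
    }
}

class DiffScanDemo {
    public static void main(String[] args) {
        ToyDir oldParent = new ToyDir("a");      // location before the rename
        ToyDir currentParent = new ToyDir("b");  // location after the rename

        Map<String, ToyFile> s1 = new HashMap<>();
        s1.put("bar", new ToyFile(42L, true));
        oldParent.diffs.add(s1);

        Map<String, ToyFile> s2 = new HashMap<>();
        s2.put("baz", new ToyFile(42L, true));
        currentParent.diffs.add(s2);

        // Only the current parent is scanned, as in the proposal.
        int replaced = DiffScan.replaceInDiffs(
                currentParent, 42L, new ToyFile(42L, false));

        System.out.println("replaced under current parent: " + replaced);
        System.out.println("old location still under construction: "
                + s1.get("bar").underConstruction);
    }
}
```

In this toy setup the copy under the old parent stays under construction, 
because nothing tells the scan where the inode used to live.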

bq. For rename, we will only have one INode here, which is referenced by two 
INodeReference instances stored in s1 and s2. And since we only record inode id 
in snapshotUCMap, this scenario might be fine?
I am not sure about this. As far as I have seen while debugging, if any 
modification is made (such as adding one more block) to a snapshotted node, a 
new inode instance is saved inside the snapshot diffs, not the INodeReference. 
An INodeReference is used only if there is no difference between the two 
inodes' attributes other than the name. 
I raise this point because I have already faced these problems while 
preparing my patch. 
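
The behaviour described above can be sketched with a small Java toy model. 
The classes ToyInode, ToySnapshot and CopyOnModify are hypothetical and do not 
mirror Hadoop's implementation; only the name snapshotUCMap is taken from the 
discussion. The assumption modelled here is the one from my debugging: a 
rename leaves the snapshot pointing at the same object, while a modification 
saves a separate copy into the snapshot first.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical classes, not Hadoop's INodeFile/INodeReference.
class ToyInode {
    final long id;
    int blockCount;
    boolean underConstruction;

    ToyInode(long id, int blockCount, boolean underConstruction) {
        this.id = id;
        this.blockCount = blockCount;
        this.underConstruction = underConstruction;
    }

    ToyInode copy() {
        return new ToyInode(id, blockCount, underConstruction);
    }
}

class ToySnapshot {
    // Name inside the snapshot -> inode object visible through the snapshot.
    final Map<String, ToyInode> view = new HashMap<>();
}

class CopyOnModify {
    // Before the first modification, the snapshot shares the live object.
    // Modifying the file saves the old state as a separate copy first.
    static void addBlock(ToyInode live, ToySnapshot latest, String nameInSnapshot) {
        if (latest.view.get(nameInSnapshot) == live) {
            latest.view.put(nameInSnapshot, live.copy());
        }
        live.blockCount++;
    }
}

class CopyOnModifyDemo {
    public static void main(String[] args) {
        ToyInode live = new ToyInode(7L, 1, true);
        ToySnapshot s1 = new ToySnapshot();
        s1.view.put("bar", live);

        // Map keyed by inode id only, as in the quoted proposal.
        Map<Long, ToyInode> snapshotUCMap = new HashMap<>();
        snapshotUCMap.put(live.id, live);

        // Rename only: the snapshot and the live tree share one object.
        System.out.println("shared after rename: " + (s1.view.get("bar") == live));

        // Modification: the snapshot now holds its own copy.
        CopyOnModify.addBlock(live, s1, "bar");
        System.out.println("shared after addBlock: " + (s1.view.get("bar") == live));

        // Closing the file through the id lookup updates the live object only.
        snapshotUCMap.get(7L).underConstruction = false;
        System.out.println("snapshot copy still under construction: "
                + s1.view.get("bar").underConstruction);
    }
}
```

Under these assumptions the id recorded in snapshotUCMap leads to the live 
inode, but not to the copy saved in the snapshot diff.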

> under construction files deletion after snapshot+checkpoint+nn restart leads 
> nn safemode
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-5428
>                 URL: https://issues.apache.org/jira/browse/HDFS-5428
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Vinay
>            Assignee: Vinay
>         Attachments: HDFS-5428-v2.patch, HDFS-5428.000.patch, 
> HDFS-5428.001.patch, HDFS-5428.patch
>
>
> 1. allow snapshots under dir /foo
> 2. create a file /foo/test/bar and start writing to it
> 3. create a snapshot s1 under /foo after block is allocated and some data has 
> been written to it
> 4. Delete the directory /foo/test
> 5. wait till checkpoint or do saveNameSpace
> 6. restart NN.
> NN enters safemode.
> Analysis:
> Snapshot nodes loaded from fsimage are always complete and all blocks will be 
> in COMPLETE state. 
> So when the DataNode reports RBW blocks, those will not be updated in the 
> blocksmap.
> Some of the FINALIZED blocks will be marked as corrupt due to length mismatch.
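
The analysis above can be read as the following simplified Java sketch. It is 
a toy model of the description only, not the NameNode's block report 
processing: all class, enum and method names are hypothetical, and the rules 
are just the two outcomes stated in the analysis for a block that was loaded 
as COMPLETE.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical names, not Hadoop's BlockManager or BlocksMap.
enum StoredState { COMPLETE, UNDER_CONSTRUCTION }

enum ReplicaState { FINALIZED, RBW }

enum Outcome { ACCEPTED, IGNORED, CORRUPT }

class StoredBlock {
    final long id;
    final long length;
    final StoredState state;

    StoredBlock(long id, long length, StoredState state) {
        this.id = id;
        this.length = length;
        this.state = state;
    }
}

class ToyBlocksMap {
    private final Map<Long, StoredBlock> blocks = new HashMap<>();

    void add(StoredBlock block) {
        blocks.put(block.id, block);
    }

    // Applies the two rules from the analysis to a reported replica.
    Outcome processReport(long blockId, ReplicaState reported, long reportedLength) {
        StoredBlock stored = blocks.get(blockId);
        if (stored == null) {
            return Outcome.IGNORED;
        }
        if (stored.state == StoredState.COMPLETE) {
            if (reported == ReplicaState.RBW) {
                // An RBW replica is not recorded for a COMPLETE block.
                return Outcome.IGNORED;
            }
            if (reportedLength != stored.length) {
                // A FINALIZED replica with a different length is corrupt.
                return Outcome.CORRUPT;
            }
        }
        return Outcome.ACCEPTED;
    }
}

class BlockReportDemo {
    public static void main(String[] args) {
        ToyBlocksMap map = new ToyBlocksMap();
        // Block of the snapshotted file, loaded from the fsimage as COMPLETE
        // with the length recorded at checkpoint time.
        map.add(new StoredBlock(1L, 1024L, StoredState.COMPLETE));

        System.out.println(map.processReport(1L, ReplicaState.RBW, 2048L));
        System.out.println(map.processReport(1L, ReplicaState.FINALIZED, 2048L));
    }
}
```

In this sketch neither report yields a usable replica for the block, which is 
consistent with the NN staying in safemode after the restart.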



--
This message was sent by Atlassian JIRA
(v6.1#6144)
