[ 
https://issues.apache.org/jira/browse/HDFS-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815766#comment-13815766
 ] 

Vinay commented on HDFS-5428:
-----------------------------

bq. So here my question is whether it's possible that we just replace the last 
block of the snapshot INode with a BlockInfoUC (but without replacing the 
INodeFile with an INodeFileUC)?
If we replace it, the problem is that the same INode may also be referenced as 
a completed file in the normal path (for example, after a rename and lease 
recovery), so replacing the last block in this INode may not be correct.

One more problem here is that the snapshotUCMap will not always contain the 
latest snapshot INode, which will be written to the fsimage as an 
under-construction file.
For example:
    1. While the file is being written, after block b1 is allocated, take 
snapshot "s1".
    2. The file is renamed.
    3. The file is then closed by lease recovery and appended to again with one 
more block, b2. Before it is closed, one more snapshot, "s2", is taken.
    4. Finally, the file is deleted.
    5. While writing the inode tree to the fsimage, the INode in s2 comes first 
and then the one in s1, so only the INode in s1 will be marked as under 
construction. But the INode that is actually under construction is the one in 
the s2 snapshot.
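
The ordering problem in step 5 can be shown with a small toy model. This is 
only a sketch and not code from the patch: it assumes the map keeps one entry 
per inode id and that a copy visited later replaces one visited earlier. The 
class SnapshotUcMapSketch, the SnapshotCopy type, the collect() method and the 
inode id 1001 are all hypothetical names made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotUcMapSketch {

    /** Hypothetical stand-in for one snapshot copy of a file INode. */
    static final class SnapshotCopy {
        final long inodeId;
        final String snapshot;
        final boolean lastBlockOpen; // true if the last block is still being written

        SnapshotCopy(long inodeId, String snapshot, boolean lastBlockOpen) {
            this.inodeId = inodeId;
            this.snapshot = snapshot;
            this.lastBlockOpen = lastBlockOpen;
        }
    }

    /**
     * Assumed behaviour (not taken from the patch): one entry per inode id,
     * and a copy visited later replaces a copy visited earlier.
     */
    static Map<Long, SnapshotCopy> collect(List<SnapshotCopy> visitOrder) {
        Map<Long, SnapshotCopy> ucMap = new HashMap<>();
        for (SnapshotCopy copy : visitOrder) {
            ucMap.put(copy.inodeId, copy);
        }
        return ucMap;
    }

    public static void main(String[] args) {
        // b1 was completed by lease recovery, so the s1 copy has no open block.
        SnapshotCopy s1 = new SnapshotCopy(1001L, "s1", false);
        // b2 was still being written when s2 was taken.
        SnapshotCopy s2 = new SnapshotCopy(1001L, "s2", true);

        // Step 5: the copy in s2 is visited first, then the copy in s1.
        SnapshotCopy marked = collect(Arrays.asList(s2, s1)).get(1001L);
        System.out.println("marked as under construction: " + marked.snapshot);
        System.out.println("last block really open there: " + marked.lastBlockOpen);
    }
}
```

Under these assumptions, visiting s2 before s1 leaves the s1 copy in the map, 
even though the copy whose last block is still open is the one in s2. 
Reversing the visit order would leave the s2 copy instead, so the result 
depends on traversal order and not on which copy is really under construction.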

> under construction files deletion after snapshot+checkpoint+nn restart leads 
> nn safemode
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-5428
>                 URL: https://issues.apache.org/jira/browse/HDFS-5428
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Vinay
>            Assignee: Vinay
>         Attachments: HDFS-5428-v2.patch, HDFS-5428.000.patch, 
> HDFS-5428.001.patch, HDFS-5428.patch
>
>
> 1. allow snapshots under dir /foo
> 2. create a file /foo/test/bar and start writing to it
> 3. create a snapshot s1 under /foo after block is allocated and some data has 
> been written to it
> 4. Delete the directory /foo/test
> 5. wait till checkpoint or do saveNameSpace
> 6. restart NN.
> NN enters safemode.
> Analysis:
> Snapshot nodes loaded from fsimage are always complete and all blocks will be 
> in COMPLETE state. 
> So when the Datanode reports RBW blocks those will not be updated in 
> blocksmap.
> Some of the FINALIZED blocks will be marked as corrupt due to length mismatch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
