[jira] [Updated] (HDFS-5428) under construction files deletion after snapshot+checkpoint+nn restart leads nn safemode

Vinay (JIRA) Mon, 04 Nov 2013 01:22:45 -0800

     [ https://issues.apache.org/jira/browse/HDFS-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Vinay updated HDFS-5428:
------------------------

    Attachment: HDFS-5428-v2.patch

Attaching a patch that handles the following scenarios.

*Scenario 1:*
Store the complete snapshot path in the leases whenever a dir/file that has a 
snapshot is deleted.
ex:
1. /foo/bar is a dir containing /foo/bar/f1 and /foo/bar/f2, and it is covered 
by the snapshot /foo/.snapshot/s1.
2. Now if /foo/bar is deleted, two leases (/foo/.snapshot/s1/bar/f1 and 
/foo/.snapshot/s1/bar/f2) will be added to leaseManager with the holder 
"HDFS_snapshot". These leases will be present until the snapshot is deleted, 
and they will not be considered for lease recovery.
3. On checkpoint, these leases will also be stored as under-construction 
files, with their snapshot paths.
4. On reload, these INodes will be loaded as under-construction files, and the 
last block of each will be replaced with an under-construction block.
5. These under-construction blocks will be excluded when computing the 
namenode safemode threshold.
6. NameNode startup will succeed.
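
As a rough illustration of step 2, here is a minimal, self-contained sketch of 
how a live path could be mapped to its snapshot path when a directory is 
deleted. It is not the code in the patch: the class SnapshotLeaseSketch, its 
methods, and the openFiles parameter are hypothetical names used only for this 
example, and a plain map stands in for the real leaseManager.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the HDFS implementation.
public class SnapshotLeaseSketch {
    // Holder name taken from the description above.
    static final String SNAPSHOT_HOLDER = "HDFS_snapshot";

    // Maps a live path under a snapshottable root to the same file's path
    // inside the named snapshot, e.g. /foo/bar/f1 -> /foo/.snapshot/s1/bar/f1.
    static String toSnapshotPath(String root, String snapshot, String livePath) {
        if (!livePath.startsWith(root + "/")) {
            throw new IllegalArgumentException(livePath + " is not under " + root);
        }
        String relative = livePath.substring(root.length()); // keeps leading "/"
        return root + "/.snapshot/" + snapshot + relative;
    }

    // Returns the leases (snapshot path -> holder) that would be added when
    // deletedDir is removed. openFiles is assumed to list the files that are
    // still open for write.
    static Map<String, String> leasesAfterDelete(
            String root, String snapshot, String deletedDir, List<String> openFiles) {
        Map<String, String> leases = new TreeMap<>();
        for (String file : openFiles) {
            if (file.startsWith(deletedDir + "/")) {
                leases.put(toSnapshotPath(root, snapshot, file), SNAPSHOT_HOLDER);
            }
        }
        return leases;
    }

    public static void main(String[] args) {
        List<String> open = Arrays.asList("/foo/bar/f1", "/foo/bar/f2");
        System.out.println(leasesAfterDelete("/foo", "s1", "/foo/bar", open));
    }
}
```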

*Scenario 2:*
Renames of a file/dir that is in a snapshot will also be tracked using leases.
ex:
1. /foo/bar is a dir containing /foo/bar/f1 and /foo/bar/f2, and it is covered 
by the snapshot /foo/.snapshot/s1.
2. Now /foo/bar is renamed to /foo/bar-renamed.
3. Two leases will then be added with the snapshot paths.
4. Again, these will be written during checkpointing.
5. While counting blocks for the namenode safemode threshold, if there are two 
leases for the same file, only the original file's lease will be considered, 
so the threshold will be correct.
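
The counting rule in step 5 can be sketched as follows. This is only an 
illustration under stated assumptions, not the logic in the patch: the class 
SafeBlockCountSketch and its Lease type are hypothetical, each 
under-construction file is assumed to contribute exactly one 
under-construction block (its last one), and leases are matched to a file by a 
made-up inodeId field.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the HDFS implementation.
public class SafeBlockCountSketch {
    // A lease path together with the id of the file it refers to.
    static final class Lease {
        final String path;
        final long inodeId;

        Lease(String path, long inodeId) {
            this.path = path;
            this.inodeId = inodeId;
        }
    }

    // Number of blocks the safemode threshold should be computed against:
    // the total minus one under-construction block per distinct open file.
    // Two leases pointing at the same file are counted only once.
    static long expectedSafeBlocks(long totalBlocks, List<Lease> leases) {
        Set<Long> openFiles = new HashSet<>();
        for (Lease lease : leases) {
            openFiles.add(lease.inodeId);
        }
        return totalBlocks - openFiles.size();
    }

    public static void main(String[] args) {
        List<Lease> leases = Arrays.asList(
            new Lease("/foo/bar-renamed/f1", 1L),
            new Lease("/foo/.snapshot/s1/bar/f1", 1L),
            new Lease("/foo/bar-renamed/f2", 2L),
            new Lease("/foo/.snapshot/s1/bar/f2", 2L));
        System.out.println(expectedSafeBlocks(10, leases));
    }
}
```

Without the de-duplication, the four leases above would subtract four blocks 
instead of two, and the threshold would be computed against too small a total.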

*Scenario 3:*
Deleting a snapshot that contains a file which is present in multiple 
snapshots.
ex:
1. /foo/bar is a dir containing /foo/bar/f1 and /foo/bar/f2, and it is covered 
by two snapshots, /foo/.snapshot/s1 and /foo/.snapshot/s2.
2. Now if /foo/bar is deleted, leases will be created with the latest 
snapshot paths (/foo/.snapshot/s2/bar/f1 and /foo/.snapshot/s2/bar/f2).
3. If the latest snapshot (s2) is deleted after this, the leases will be 
replaced with the prior snapshot's paths for those files that are present in 
both snapshots.
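
Step 3 can be sketched as a rewrite of lease paths from the deleted snapshot 
to the prior one. This is a minimal sketch, not the code in the patch: the 
class SnapshotDeleteLeaseSketch, its method, and the filesInPrior parameter 
are hypothetical. Dropping the lease when the file is absent from the prior 
snapshot is an assumption of this example; the description above only covers 
files present in both snapshots.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch, not the HDFS implementation.
public class SnapshotDeleteLeaseSketch {
    // Returns a new lease map (path -> holder) after snapshot "deleted" under
    // "root" is removed. filesInPrior holds the paths, relative to the
    // snapshot root (e.g. "/bar/f1"), that also exist in snapshot "prior".
    static Map<String, String> onSnapshotDeleted(
            Map<String, String> leases, String root, String deleted,
            String prior, Set<String> filesInPrior) {
        String deletedPrefix = root + "/.snapshot/" + deleted;
        Map<String, String> updated = new TreeMap<>();
        for (Map.Entry<String, String> entry : leases.entrySet()) {
            String path = entry.getKey();
            if (!path.startsWith(deletedPrefix + "/")) {
                updated.put(path, entry.getValue()); // unrelated lease, keep
                continue;
            }
            String relative = path.substring(deletedPrefix.length());
            if (prior != null && filesInPrior.contains(relative)) {
                // File also exists in the prior snapshot: re-point the lease.
                updated.put(root + "/.snapshot/" + prior + relative, entry.getValue());
            }
            // Otherwise the lease is dropped (assumption of this sketch).
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, String> leases = new TreeMap<>();
        leases.put("/foo/.snapshot/s2/bar/f1", "HDFS_snapshot");
        leases.put("/foo/.snapshot/s2/bar/f2", "HDFS_snapshot");
        Set<String> inS1 = new HashSet<>(Arrays.asList("/bar/f1", "/bar/f2"));
        System.out.println(onSnapshotDeleted(leases, "/foo", "s2", "s1", inS1));
    }
}
```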


Please review

> under construction files deletion after snapshot+checkpoint+nn restart leads 
> nn safemode
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-5428
>                 URL: https://issues.apache.org/jira/browse/HDFS-5428
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Vinay
>            Assignee: Vinay
>         Attachments: HDFS-5428-v2.patch, HDFS-5428.patch
>
>
> 1. allow snapshots under dir /foo
> 2. create a file /foo/test/bar and start writing to it
> 3. create a snapshot s1 under /foo after block is allocated and some data has 
> been written to it
> 4. Delete the directory /foo/test
> 5. wait till checkpoint or do saveNamespace
> 6. restart NN.
> NN enters safemode.
> Analysis:
> Snapshot nodes loaded from fsimage are always complete and all blocks will be 
> in COMPLETE state. 
> So when the Datanode reports RBW blocks those will not be updated in 
> blocksmap.
> Some of the FINALIZED blocks will be marked as corrupt due to length mismatch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
