[
https://issues.apache.org/jira/browse/HDFS-6186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959227#comment-13959227
]
Uma Maheswara Rao G commented on HDFS-6186:
-------------------------------------------
{quote}
Another jira I was planning to create is, when trash and snapshots are not
enabled, when a large directory (say X number of blocks) is deleted, instead of
deleting the directory, moving it to another directory (say .trash or
/.reserved/.pending_delete). The directory will be deleted after a configured
timeout. What do you think?
{quote}
Sounds like a good idea. The basic difference from the actual Trash would be
that Trash applies globally to all files in the namespace, whereas this is
specific to directories with a large number of blocks under them. Here that X
would also be configurable.
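A minimal sketch of the proposal above: if deleting a directory would remove
more than a configurable number of blocks (the "X" discussed here), defer the
deletion by renaming into a pending-delete area instead. The names
blockDeletionThreshold and PENDING_DELETE_DIR are illustrative only, not
actual HDFS configuration keys or paths.

```java
// Illustrative sketch only; not the actual HDFS implementation.
public class GuardedDelete {
    // Hypothetical pending-delete area, per the proposal above.
    static final String PENDING_DELETE_DIR = "/.reserved/.pending_delete";

    private final long blockDeletionThreshold;  // the configurable "X"

    public GuardedDelete(long blockDeletionThreshold) {
        this.blockDeletionThreshold = blockDeletionThreshold;
    }

    /**
     * Returns the planned action for deleting a directory that owns
     * blockCount blocks: an immediate delete for small directories, or a
     * rename into the pending-delete area (to be purged later by a
     * background task after a configured timeout) for large ones.
     */
    public String planDelete(String path, long blockCount) {
        if (blockCount > blockDeletionThreshold) {
            return "rename " + path + " -> " + PENDING_DELETE_DIR + path;
        }
        return "delete " + path;
    }
}
```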
{quote}
It will add the block to the invalidate list for the following block report
(which will be one hour later). So I think what we can first do is to show the
# of blocks that NN cannot recognize in the first block report to the WebUI?
{quote}
You are right. We will not invalidate the blocks in the first block report. We
should include in the webUI the details of blocks from that first block report
which are not associated with any file.
{quote}
In the meanwhile, in case that we just restart NN while DNs are still running
(e.g., we restart SBN while ANN is still running and then we trigger the
failover), currently NN may process an IBR before a full block report. Then the
first full block report sent to NN after its restarting can trigger the block
deletion immediately.
{quote}
This is a good case. Currently we make the decision based on whether the
number of stored blocks at the NN is 0. But the case where the NN processes an
IBR first breaks this check, and the actual full report will then be processed
normally. So, we should base the decision on the actual API call to determine
whether the first full block report has arrived. When processing an IBR, if
storedBlocks is 0, we can mark the full BR as not yet received. When a full BR
is received, processReport can check whether storedBlocks is 0 or
fullBRNotReceived is true; in either case we should treat that call as the
first BR.
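The check described above can be sketched as follows. This is a simplified,
hypothetical model of the per-storage state, not the actual BlockManager API;
the class and method names are assumptions for illustration.

```java
// Sketch: detect the first full block report even when an incremental
// block report (IBR) arrives before it, per the discussion above.
public class StorageReportState {
    private long storedBlocks = 0;
    private boolean fullBlockReportReceived = false;

    /** Called when an incremental block report (IBR) arrives. */
    public void processIncrementalReport(int addedBlocks) {
        if (storedBlocks == 0) {
            // IBR arrived before any full BR: remember that the full BR
            // is still pending, since storedBlocks will now be non-zero.
            fullBlockReportReceived = false;
        }
        storedBlocks += addedBlocks;
    }

    /**
     * Called when a full block report arrives. Returns true when it must
     * be treated as the first full report (so no immediate deletions are
     * scheduled from it): either no blocks were stored yet, or an IBR was
     * processed first and the full BR flag is still unset.
     */
    public boolean processFullReport(long reportedBlocks) {
        boolean firstFullReport = (storedBlocks == 0) || !fullBlockReportReceived;
        storedBlocks = reportedBlocks;
        fullBlockReportReceived = true;
        return firstFullReport;
    }
}
```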
> Pause deletion of blocks when the namenode starts up
> ----------------------------------------------------
>
> Key: HDFS-6186
> URL: https://issues.apache.org/jira/browse/HDFS-6186
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Suresh Srinivas
>
> HDFS namenode can delete blocks very quickly, given the deletion happens as a
> parallel operation spread across many datanodes. One of the frequent
> anxieties I see is that a lot of data can be deleted very quickly, when a
> cluster is brought up, especially when one of the storage directories has
> failed and namenode metadata was copied from another storage. Copying wrong
> metadata would result in some of the newer files (if old metadata was
> copied) being deleted along with their blocks.
> HDFS-5986 now captures the number of pending deletion blocks on the namenode
> webUI and JMX. I propose pausing deletion of blocks for a configured period
> of time (default 1 hour?) after the namenode comes out of safemode. This
> will give the administrator enough time to notice a large number of pending
> deletion blocks and take corrective action.
> Thoughts?
--
This message was sent by Atlassian JIRA
(v6.2#6252)