[
https://issues.apache.org/jira/browse/HDFS-6186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959227#comment-13959227
]
Uma Maheswara Rao G commented on HDFS-6186:
-------------------------------------------
{quote}
Another jira I was planning to create is, when trash and snapshots are not
enabled, when a large directory (say X number of blocks) is deleted, instead of
deleting the directory, moving it to another directory (say .trash or
/.reserved/.pending_delete). The directory will be deleted after a configured
timeout. What do you think?
{quote}
Sounds like a good idea. The basic difference from the actual Trash would be
that Trash applies globally to all files in the namespace, whereas this is
specific to directories with a large number of blocks under them. Here that X
would also be configurable.
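A minimal sketch of the proposal above: if deleting a directory would remove
more than a configurable number of blocks (the "X" discussed here), defer the
deletion by renaming into a pending-delete area instead. The names
blockDeletionThreshold and PENDING_DELETE_DIR are illustrative only, not
actual HDFS configuration keys or paths.

```java
// Illustrative sketch only; not the actual HDFS implementation.
public class GuardedDelete {
    // Hypothetical pending-delete area, per the proposal above.
    static final String PENDING_DELETE_DIR = "/.reserved/.pending_delete";

    private final long blockDeletionThreshold;  // the configurable "X"

    public GuardedDelete(long blockDeletionThreshold) {
        this.blockDeletionThreshold = blockDeletionThreshold;
    }

    /**
     * Returns the planned action for deleting a directory that owns
     * blockCount blocks: an immediate delete for small directories, or a
     * rename into the pending-delete area (to be purged later by a
     * background task after a configured timeout) for large ones.
     */
    public String planDelete(String path, long blockCount) {
        if (blockCount > blockDeletionThreshold) {
            return "rename " + path + " -> " + PENDING_DELETE_DIR + path;
        }
        return "delete " + path;
    }
}
```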
{quote}
It will add the block to the invalidate list for the following block report
(which will be one hour later). So I think what we can first do is to show the
# of blocks that NN cannot recognize in the first block report to the WebUI?
{quote}
You are right. We will not invalidate the blocks in the first block report. We
should include in the webUI the details of blocks from that first block report
which are not associated with any file.
{quote}
In the meanwhile, in case that we just restart NN while DNs are still running
(e.g., we restart SBN while ANN is still running and then we trigger the
failover), currently NN may process an IBR before a full block report. Then the
first full block report sent to NN after its restarting can trigger the block
deletion immediately.
{quote}
This is a good case. Currently we make the decision based on whether the
number of stored blocks at the NN is 0. But the case where the NN processes an
IBR first breaks this check, and the actual full report will then be processed
normally. So, we should base the decision on the actual API call to determine
whether the first full block report has arrived. When processing an IBR, if
storedBlocks is 0, we can mark the full BR as not yet received. When a full BR
is received, processReport can check whether storedBlocks is 0 or
fullBRNotReceived is true; in either case we should treat that call as the
first BR.
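The check described above can be sketched as follows. This is a simplified,
hypothetical model of the per-storage state, not the actual BlockManager API;
the class and method names are assumptions for illustration.

```java
// Sketch: detect the first full block report even when an incremental
// block report (IBR) arrives before it, per the discussion above.
public class StorageReportState {
    private long storedBlocks = 0;
    private boolean fullBlockReportReceived = false;

    /** Called when an incremental block report (IBR) arrives. */
    public void processIncrementalReport(int addedBlocks) {
        if (storedBlocks == 0) {
            // IBR arrived before any full BR: remember that the full BR
            // is still pending, since storedBlocks will now be non-zero.
            fullBlockReportReceived = false;
        }
        storedBlocks += addedBlocks;
    }

    /**
     * Called when a full block report arrives. Returns true when it must
     * be treated as the first full report (so no immediate deletions are
     * scheduled from it): either no blocks were stored yet, or an IBR was
     * processed first and the full BR flag is still unset.
     */
    public boolean processFullReport(long reportedBlocks) {
        boolean firstFullReport = (storedBlocks == 0) || !fullBlockReportReceived;
        storedBlocks = reportedBlocks;
        fullBlockReportReceived = true;
        return firstFullReport;
    }
}
```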
> Pause deletion of blocks when the namenode starts up
> ----------------------------------------------------
>
> Key: HDFS-6186
> URL: https://issues.apache.org/jira/browse/HDFS-6186
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Suresh Srinivas
>
> HDFS namenode can delete blocks very quickly, given the deletion happens as a
> parallel operation spread across many datanodes. One of the frequent
> anxieties I see is that a lot of data can be deleted very quickly, when a
> cluster is brought up, especially when one of the storage directories has
> failed and namenode metadata was copied from another storage. Copying wrong
> metadata would result in some of the newer files (if old metadata was
> copied) being deleted along with their blocks.
> HDFS-5986 now captures the number of pending deletion blocks on the namenode
> webUI and JMX. I propose pausing deletion of blocks for a configured period
> of time (default 1 hour?) after the namenode comes out of safemode. This
> will give the administrator enough time to notice a large number of pending
> deletion blocks and take corrective action.
> Thoughts?
--
This message was sent by Atlassian JIRA
(v6.2#6252)