[ 
https://issues.apache.org/jira/browse/HDFS-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505336#comment-14505336
 ] 

Zhe Zhang commented on HDFS-8193:
---------------------------------

Thanks, Chris, for bringing up these questions.

bq. HDFS-6186 only applies at NameNode startup.  Is the new feature something 
that could be triggered at any time on a running NameNode, such as right before 
a manual HA failover?
The short answer is yes. One can think of it as a "trash" for block replicas,
fully controlled by the DN hosting them. For a period of time, this should
protect block replicas from most administrative mistakes and NN bugs (which are
more likely than DN bugs, given the NN's complexity).
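
To make the "trash" idea concrete, here is a minimal sketch of DN-side
bookkeeping that delays deletion for a retention period. It is only an
illustration: ReplicaTrash, markForDeletion, restore, and purgeExpired are
hypothetical names, not existing HDFS classes or methods, and a real
implementation would also have to move or rename the replica files on disk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical DN-side "trash" for block replicas; not an existing HDFS class. */
class ReplicaTrash {
    private final long retentionMillis;
    // blockId -> time (ms) at which the deletion request was received
    private final Map<Long, Long> pending = new LinkedHashMap<>();

    ReplicaTrash(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    /** Record a deletion request instead of removing the replica right away. */
    void markForDeletion(long blockId, long nowMillis) {
        pending.putIfAbsent(blockId, nowMillis);
    }

    /** Undo a pending deletion; returns true if the replica was still retained. */
    boolean restore(long blockId) {
        return pending.remove(blockId) != null;
    }

    /** Return and forget the replicas whose retention period has elapsed. */
    List<Long> purgeExpired(long nowMillis) {
        List<Long> expired = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> e = it.next();
            if (nowMillis - e.getValue() >= retentionMillis) {
                expired.add(e.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

In this sketch the caller passes the current time explicitly, which keeps the
retention logic easy to test; only the replicas returned by purgeExpired would
actually be removed from disk.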

To answer the question from [~sureshms] under HDFS-6186:
bq. One problem with not deleting the blocks for a deleted file is, how does 
one restore it? Can we address in this jira pausing deletion after startup and 
address the suggestion you have made, along with other changes that might be 
necessary, in another jira.
First, NN bugs could cause block replicas to be deleted without deleting the 
file. Second, it's rather easy to back up NN metadata before performing 
maintenance, but extremely difficult to back up actual DN data. This JIRA aims 
to address that deficiency / discrepancy.

As future work, we plan to investigate an even more radical retention policy,
where block replicas are not deleted until the DN is actually running out of
space. At that point, victims are selected among pending-deletion replicas
using a smart algorithm, and are overwritten by incoming replicas. We'll file a
separate JIRA for that, after this JIRA builds the basic DN-side replica
retention machinery.
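
As a rough sketch of what victim selection under space pressure could look
like: the selection algorithm has not been designed yet, so the code below
simply evicts the replicas that have been pending deletion the longest, as a
stand-in. PendingReplica and VictimSelector are hypothetical names used only
for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical record of a replica that is pending deletion on a DN. */
class PendingReplica {
    final long blockId;
    final long sizeBytes;
    final long markedAtMillis;

    PendingReplica(long blockId, long sizeBytes, long markedAtMillis) {
        this.blockId = blockId;
        this.sizeBytes = sizeBytes;
        this.markedAtMillis = markedAtMillis;
    }
}

/** Stand-in policy (assumption): evict oldest pending replicas first. */
class VictimSelector {
    static List<PendingReplica> selectVictims(List<PendingReplica> pending,
                                              long bytesNeeded) {
        // Sort a copy so the caller's list is left untouched.
        List<PendingReplica> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingLong((PendingReplica r) -> r.markedAtMillis));

        List<PendingReplica> victims = new ArrayList<>();
        long freed = 0;
        for (PendingReplica r : sorted) {
            if (freed >= bytesNeeded) {
                break; // enough space reclaimed for the incoming replica
            }
            victims.add(r);
            freed += r.sizeBytes;
        }
        return victims;
    }
}
```

If the pending-deletion replicas cannot free enough space, this sketch returns
all of them; the caller would then have to treat the volume as full.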

> Add the ability to delay replica deletion for a period of time
> --------------------------------------------------------------
>
>                 Key: HDFS-8193
>                 URL: https://issues.apache.org/jira/browse/HDFS-8193
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Aaron T. Myers
>            Assignee: Zhe Zhang
>
> When doing maintenance on an HDFS cluster, users may be concerned about the 
> possibility of administrative mistakes or software bugs deleting replicas of 
> blocks that cannot easily be restored. It would be handy if HDFS could be 
> made to optionally not delete any replicas for a configurable period of time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
