[
https://issues.apache.org/jira/browse/HDFS-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505336#comment-14505336
]
Zhe Zhang commented on HDFS-8193:
---------------------------------
Thanks Chris for bringing up the questions.
bq. HDFS-6186 only applies at NameNode startup. Is the new feature something
that could be triggered at any time on a running NameNode, such as right before
a manual HA failover?
The short answer is yes. One can think of it as a "trash" for block replicas,
fully controlled by the DN hosting them. For a period of time, this should
shelter block replicas from most admin mistakes and from NN bugs (which are
more likely than DN bugs, given the NN's complexity).
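To make the "trash" idea concrete, here is a minimal sketch of the DN-side
bookkeeping. It is only an illustration under assumptions: the class and method
names (ReplicaTrash, markForDeletion, restore, purgeExpired) are hypothetical
and do not come from HDFS or from any patch on this JIRA, and real replicas
would be files on disk rather than string IDs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names exist in HDFS.
class ReplicaTrash {
  private final long retentionMs;
  // Block ID -> time (ms) at which the deletion request was received.
  // LinkedHashMap keeps insertion order, so the oldest requests come first.
  private final Map<String, Long> pending = new LinkedHashMap<>();

  ReplicaTrash(long retentionMs) {
    this.retentionMs = retentionMs;
  }

  /** Record a deletion request instead of removing the replica right away. */
  void markForDeletion(String blockId, long nowMs) {
    pending.putIfAbsent(blockId, nowMs);
  }

  /** Undo a pending deletion; returns true if the replica was still held. */
  boolean restore(String blockId) {
    return pending.remove(blockId) != null;
  }

  /** Return, and stop tracking, replicas whose retention period has elapsed. */
  List<String> purgeExpired(long nowMs) {
    List<String> expired = new ArrayList<>();
    Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      if (nowMs - entry.getValue() >= retentionMs) {
        expired.add(entry.getKey());
        it.remove();
      }
    }
    return expired;
  }

  int pendingCount() {
    return pending.size();
  }
}
```

The point of the sketch is that the decision to physically remove a replica
stays with the DN: a deletion request only starts a timer, and until the timer
expires the replica can still be restored.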
To answer the question from [~sureshms] under HDFS-6186:
bq. One problem with not deleting the blocks for a deleted file is, how does
one restore it? Can we address in this jira pausing deletion after startup and
address the suggestion you have made, along with other changes that might be
necessary, in another jira.
First, NN bugs could cause block replicas to be deleted without deleting the
file. Second, it's rather easy to back up NN metadata before performing
maintenance, but extremely difficult to back up actual DN data. This JIRA aims
to address that deficiency / discrepancy.
As future work, we plan to investigate an even more radical retention policy,
where block replicas are never deleted until the DN is actually running out of
space. At that point, victims are selected from the pending-deletion replicas
using a smart algorithm, and are overwritten by incoming replicas. We'll file a
separate JIRA for that, after this JIRA builds the basic DN-side replica
retention machinery.
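As a rough illustration of that future policy, the sketch below evicts
pending-deletion replicas only when an incoming replica would not otherwise
fit. The victim selection here is plain oldest-first, which is just a stand-in
assumption for the "smart algorithm" that has not been designed yet; all names
(PendingReplica, SpacePressureEvictor, admit) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: none of these names exist in HDFS.
class PendingReplica {
  final String blockId;
  final long sizeBytes;

  PendingReplica(String blockId, long sizeBytes) {
    this.blockId = blockId;
    this.sizeBytes = sizeBytes;
  }
}

class SpacePressureEvictor {
  // Pending-deletion replicas, oldest request at the head of the queue.
  private final Deque<PendingReplica> pending = new ArrayDeque<>();
  private long freeBytes;

  SpacePressureEvictor(long freeBytes) {
    this.freeBytes = freeBytes;
  }

  /** The replica stays on disk, so free space does not change here. */
  void markForDeletion(String blockId, long sizeBytes) {
    pending.addLast(new PendingReplica(blockId, sizeBytes));
  }

  /**
   * Make room for an incoming replica, evicting the oldest pending-deletion
   * replicas only if free space is insufficient. Returns the evicted IDs.
   */
  List<String> admit(long incomingBytes) {
    long reclaimable = 0;
    for (PendingReplica r : pending) {
      reclaimable += r.sizeBytes;
    }
    if (freeBytes + reclaimable < incomingBytes) {
      throw new IllegalStateException("not enough space even after eviction");
    }
    List<String> evicted = new ArrayList<>();
    while (freeBytes < incomingBytes) {
      PendingReplica victim = pending.removeFirst();
      freeBytes += victim.sizeBytes;
      evicted.add(victim.blockId);
    }
    freeBytes -= incomingBytes;
    return evicted;
  }

  long freeBytes() {
    return freeBytes;
  }
}
```

With this shape, replicas are retained for as long as the disk allows, and the
selection policy can be swapped out later without touching the bookkeeping.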
> Add the ability to delay replica deletion for a period of time
> --------------------------------------------------------------
>
> Key: HDFS-8193
> URL: https://issues.apache.org/jira/browse/HDFS-8193
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: namenode
> Affects Versions: 2.7.0
> Reporter: Aaron T. Myers
> Assignee: Zhe Zhang
>
> When doing maintenance on an HDFS cluster, users may be concerned about the
> possibility of administrative mistakes or software bugs deleting replicas of
> blocks that cannot easily be restored. It would be handy if HDFS could be
> made to optionally not delete any replicas for a configurable period of time.