[
https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165444#comment-16165444
]
Allen Wittenauer commented on HDFS-12420:
-----------------------------------------
bq. cluster owner, who was visibly distressed.
Well sure. They screwed up. They can either own up to the fact that they made a
mistake and learn from it, or try to push the blame onto someone or something
else, like their vendor. Besides, who *doesn't* make a copy of the fsimage
data on a regular basis? That's Hadoop Ops 101.
That said: there comes a point where it becomes impossible to protect every
admin from every mistake they may possibly make.
-format is the functional equivalent of newfs. The argument here is the same
as saying "newfs should fail if it detects a partition table; you'll need to dd
onto the raw disk to wipe it out first." Ask any experienced admin, and nine
out of ten will tell you that makes zero sense.
The same thing applies here. The code specifically warns the user that they are
about to delete live data. Could the messaging be improved? Sure, and that's
probably what should happen if users are confused enough to file this drastic
overreaction. But the warning is there all the same. It is up to the user to
act on that information and determine whether or not it is safe to continue
with the operation. If they blindly -force it, well, that's on them. Users
might remove data they need by always passing -skipTrash, so we should remove
that option too, right? Of course not.
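For context, here is a sketch of the command-line flow being discussed: -format already asks for confirmation when a storage directory is non-empty, -force explicitly bypasses that prompt, and -skipTrash is the analogous bypass on the delete side. The example path below is made up for illustration; the exact prompt wording varies by version.

```shell
# Format the NameNode interactively: when a storage directory already
# contains data, HDFS prompts for confirmation before wiping it.
hdfs namenode -format

# Bypass the confirmation prompt -- the admin explicitly accepts the risk.
hdfs namenode -format -force

# The analogous delete-side bypass: skip the trash safety net entirely.
# (/user/alice/dataset is a hypothetical example path.)
hdfs dfs -rm -r -skipTrash /user/alice/dataset
```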
One of the key principles of operations is that admins have enough rope to hang
themselves. This is exactly the same case, and in this instance the admin did
exactly that: hung themselves because they weren't careful.
bq. How can you manually delete the shared edits dir on the journal nodes?
I'm really glad you asked that question because it's a key one. It's sort of
ridiculous to have admins go hunt down where Hadoop might be stuffing metadata.
Add in the complexity of HA and it is even more ludicrous.
bq. That said, if you have examples of automated deployments that will be
broken by this change and that we haven't thought of, we can abandon the idea.
I have clients that do this regularly: they roll out small, short-term
clusters to external groups. Yes, this change will break them horribly.
> Disable Namenode format when data already exists
> ------------------------------------------------
>
> Key: HDFS-12420
> URL: https://issues.apache.org/jira/browse/HDFS-12420
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ajay Kumar
> Assignee: Ajay Kumar
> Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch
>
>
> Disable NameNode format to avoid accidental formatting of Namenode in
> production cluster. If someone really wants to delete the complete fsImage,
> they can first delete the metadata dir and then run {code} hdfs namenode
> -format{code} manually.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]