[
https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335523#comment-14335523
]
Allen Wittenauer commented on HDFS-7537:
----------------------------------------
bq. When numUnderMinimalReplicatedBlocks > 0 and there are no missing/corrupted
blocks, every under-minimal-replicated block has at least one good replica, so
it can be re-replicated and there is no data loss. It makes sense to
consider the file system healthy.
Exactly this.
I made a prototype to play with. One of the things I did was surround the number
of blocks that didn't meet the replication minimum with the same asterisks that
the corrupted-block output uses. This made it absolutely crystal clear why the
NN wasn't coming out of safemode.
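A minimal sketch of the idea above (not the actual HDFS-7537 patch; the class and method names {{FsckHealthSketch}}, {{isHealthy}}, and {{report}} are hypothetical): the filesystem can be reported HEALTHY when the only problem is under-minimal-replicated blocks, while the count is framed in the same asterisk banner fsck uses for corrupt output so the safemode cause is obvious.

```java
// Hypothetical illustration only -- not code from the HDFS-7537 patch.
public class FsckHealthSketch {

    /**
     * A block below the configured minimum (dfs.namenode.replication.min)
     * that still has at least one live replica can be re-replicated, so it
     * is not data loss. Only corrupt or missing blocks should make the
     * filesystem unhealthy.
     */
    public static boolean isHealthy(long corruptBlocks, long missingBlocks,
                                    long underMinReplicatedBlocks) {
        return corruptBlocks == 0 && missingBlocks == 0;
    }

    /** Frame the under-min count with the asterisk banner, corrupt-output style. */
    public static String report(long underMinReplicatedBlocks) {
        if (underMinReplicatedBlocks == 0) {
            return "Status: HEALTHY";
        }
        return "****************************************\n"
             + " UNDER MIN REPLICATED BLOCKS: " + underMinReplicatedBlocks + "\n"
             + " (below dfs.namenode.replication.min; the NameNode\n"
             + "  stays in safemode until they are re-replicated)\n"
             + "****************************************\n"
             + "Status: HEALTHY (no data loss)";
    }

    public static void main(String[] args) {
        System.out.println(report(3));
    }
}
```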
> fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas
> && NN restart
> -----------------------------------------------------------------------------------------
>
> Key: HDFS-7537
> URL: https://issues.apache.org/jira/browse/HDFS-7537
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Allen Wittenauer
> Assignee: GAO Rui
> Attachments: HDFS-7537.1.patch, dfs-min-2-fsck.png, dfs-min-2.png
>
>
> If minimum replication is set to 2 or higher, some replicas are missing, and
> the namenode restarts, it isn't always obvious that the missing replicas are
> the reason the namenode isn't leaving safemode. We should improve the output
> of fsck and the web UI to make it obvious that the missing blocks are due to
> unmet minimum replication rather than being completely missing.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)