[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252383#comment-14252383 ]
Allen Wittenauer commented on HDFS-7537:
----------------------------------------
Mock-up of fsck output that alerts when minimum replication (dfs.namenode.replication.min) hasn't actually been met:
{code}
Status: HEALTHY
Total size: 236 B
Total dirs: 1
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 236 B)
********************************
UNDER MIN REPL'D BLOCKS: 1 (100.0 %)
********************************
Minimally replicated blocks: 0 (0.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 1 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 2 (66.666664 %)
Number of data-nodes: 1
Number of racks: 1
{code}
With all datanodes down (and therefore triggering corrupt/missing blocks):
{code}
Status: CORRUPT
Total size: 236 B
Total dirs: 1
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 236 B)
********************************
UNDER MIN REPL'D BLOCKS: 1 (100.0 %)
CORRUPT FILES: 1
MISSING BLOCKS: 1
MISSING SIZE: 236 B
CORRUPT BLOCKS: 1
********************************
Minimally replicated blocks: 0 (0.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 0.0
Corrupt blocks: 1
Missing replicas: 0
Number of data-nodes: 0
Number of racks: 0
FSCK ended at Thu Dec 18 14:08:25 PST 2014 in 13 milliseconds
{code}
> dfs.namenode.replication.min > 1 && missing replicas && NN restart is confusing
> -------------------------------------------------------------------------------
>
> Key: HDFS-7537
> URL: https://issues.apache.org/jira/browse/HDFS-7537
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Allen Wittenauer
> Attachments: dfs-min-2-fsck.png, dfs-min-2.png
>
>
> If minimum replication is set to 2 or higher, some of those replicas are
> missing, and the namenode restarts, it isn't always obvious that the missing
> replicas are the reason the namenode isn't leaving safemode. We should
> improve the output of fsck and the web UI to make it obvious when blocks are
> reported missing because minimum replication is unmet, as opposed to blocks
> that have no replicas at all.