[ https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654647#comment-14654647 ]
Aaron T. Myers commented on HDFS-8849:
--------------------------------------
Allen, I've seen plenty of users who at some point in the past have run
TeraSort on their cluster, and for that job the default output replication is
1. If a DN that holds some TeraSort output then goes offline, those blocks
appear missing, and users get concerned because they see missing blocks on the
NN web UI and via dfsadmin -report/fsck, but it's not obvious that those
blocks were in fact set to replication factor 1. In my experience, this is
really quite common, so it definitely seems worth addressing to me. How we go
about addressing this should certainly be discussed, and it could be that
including this information in fsck doesn't make sense, but let's try to come up
with something that does address this issue.
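To make the scenario concrete, here is a minimal Python sketch of the kind of
summary being asked for: out of all missing blocks, how many belonged to files
written with replication factor 1. It does not call any Hadoop API.
MissingBlock, summarize_missing, and the sample block IDs and paths are
hypothetical names made up for illustration, and whatever fsck ends up
reporting may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MissingBlock:
    """Hypothetical record for one missing block (not a Hadoop type)."""
    block_id: str
    path: str
    replication: int  # replication factor configured for the owning file


def summarize_missing(blocks):
    """Count all missing blocks and those whose file had replication 1."""
    repl_one = sum(1 for b in blocks if b.replication == 1)
    return {
        "missing": len(blocks),
        "missing_with_replication_1": repl_one,
    }


# Made-up sample: two TeraSort-style outputs written with replication 1,
# plus one ordinary file written with replication 3.
sample = [
    MissingBlock("blk_1001", "/terasort/out/part-00000", 1),
    MissingBlock("blk_1002", "/terasort/out/part-00001", 1),
    MissingBlock("blk_2001", "/user/data/events.log", 3),
]

summary = summarize_missing(sample)
print(summary)
```

With a breakdown like this, a user could tell at a glance that two of the
three missing blocks were only ever stored once, so losing a single DN is
enough to explain them.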
Separately, using phrases like "Meanwhile, back in real life" and calling a
proposed improvement a "useless feature" are not appropriate ways to
communicate in this forum. Let's please try to keep the communication
constructive, not unnecessarily hostile. Comments like those contribute to the
perception that our community is difficult to contribute to.
> fsck should report number of missing blocks with replication factor 1
> ---------------------------------------------------------------------
>
> Key: HDFS-8849
> URL: https://issues.apache.org/jira/browse/HDFS-8849
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: tools
> Affects Versions: 2.7.1
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Priority: Minor
>
> HDFS-7165 supports reporting the number of blocks with replication factor 1
> in {{dfsadmin}} and NN metrics, but it didn't extend {{fsck}} with the same
> support, which is the aim of this JIRA.