[
https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517228#comment-16517228
]
Kitti Nanasi commented on HDFS-13658:
-------------------------------------
Thanks for the comments, [~xiaochen]! I uploaded patch v005, which addresses
most of your comments:
* I added a "listonereplicablocks" flag to fsck
* I removed the isStriped check.
* I added test coverage in TestNameNodeMetrics.
* I verify the new metric in the existing test cases of
TestLowRedundancyBlockQueues, which covers the test cases you suggested.
However, I kept the TestOneReplicaBlocksAlert integration test as well, because
it checks that everything works together. Do you think I should keep or remove
the integration test?
* About using a set instead of a single long: with a long, incrementing the
metric still works, because I know the block's current replica count. When the
metric has to be decremented, however, I would need to know what the replica
count was at the time of the earlier increment. The decrement can be triggered
for various reasons, for example when the whole file is removed or when more
replicas are created for the block, and in some of those cases the previous
replica count is not available. But I agree with you that I shouldn't store the
block infos. Do you have any suggestions on how to fix that?
I can only think of creating another priority queue in LowRedundancyBlocks, but
that would probably break a bunch of other things, or of storing only the
block's id, for example, instead of the whole block info.
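To make the id-only option concrete, here is a minimal sketch of the idea, not
the code in the patch. OneReplicaBlockTracker and its methods are hypothetical
names made up for illustration; they are not existing HDFS classes. The point it
shows is that a set of block ids makes the decrement safe without knowing the
previous replica count: removing an id that is not in the set is a no-op.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of HDFS: remembers only the ids of blocks
// that currently have exactly one live replica.
class OneReplicaBlockTracker {
  private final Set<Long> oneReplicaBlockIds = new HashSet<>();

  // Call whenever the live replica count of a block is recomputed.
  // The previous count is not needed: the set membership encodes it.
  synchronized void update(long blockId, int liveReplicas) {
    if (liveReplicas == 1) {
      oneReplicaBlockIds.add(blockId);
    } else {
      oneReplicaBlockIds.remove(blockId);
    }
  }

  // Call when the block goes away entirely, e.g. its file is deleted.
  // Safe to call even if the block was never tracked.
  synchronized void remove(long blockId) {
    oneReplicaBlockIds.remove(blockId);
  }

  // Value the metric would report.
  synchronized long count() {
    return oneReplicaBlockIds.size();
  }
}
```

The trade-off is memory: one boxed long per one-replica block instead of a
single counter, which is still much less than holding the block infos.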
> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have
> 1 replica
> ---------------------------------------------------------------------------------------
>
> Key: HDFS-13658
> URL: https://issues.apache.org/jira/browse/HDFS-13658
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.1.0
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Major
> Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch,
> HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch
>
>
> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have
> 1 replica. We have had many cases opened in which a customer lost a disk or a
> DN and, as a result, lost files/blocks because they had blocks with only 1
> replica. We need to make customers more aware of this situation and that
> they should take action.