[ 
https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517228#comment-16517228
 ] 

Kitti Nanasi commented on HDFS-13658:
-------------------------------------

Thanks for the comments, [~xiaochen]! I uploaded patch v005, which addresses 
most of them:
 * I added a "listonereplicablocks" flag to fsck.
 * I removed the isStriped check.
 * I added test coverage in TestNameNodeMetrics.
 * I verify the new metric in the existing test cases of 
TestLowRedundancyBlockQueues, which covers the test cases you suggested. 
However, I kept the TestOneReplicaBlocksAlert integration test as well, because 
it checks that everything works together. Do you think I should keep or remove 
the integration test?
 * About using a set instead of a single long: with a long, the metric 
increment still works fine, because I know the current number of replicas. 
For the decrement, however, I would need to know what the replica count was 
when the previous increment happened. The decrement can happen for various 
reasons, for example when the whole file is removed or when more replicas are 
created for the block, and in some cases the previous replica count is not 
available. But I agree with you that I shouldn't store the block infos. Do you 
have any suggestions on how to fix that?
I can only think of creating another priority queue in LowRedundancyBlocks, 
which would probably break a bunch of other things, or of storing just the 
block's id, for example, instead of the whole block info.
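
To illustrate the last idea (storing only the block id), here is a minimal 
sketch. The class and method names are hypothetical and are not taken from the 
patch or from HDFS; it only shows why a set of ids makes the decrement safe 
when the previous replica count is unknown, which a plain long counter cannot 
guarantee.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper (not part of HDFS): remembers only the ids of blocks
 * that currently have exactly one replica, not the whole block info.
 */
final class OneReplicaBlockIds {
  private final Set<Long> ids = new HashSet<>();

  /** Called whenever the current replica count of a block is known. */
  synchronized void update(long blockId, int liveReplicas) {
    if (liveReplicas == 1) {
      ids.add(blockId);      // adding the same id twice does not double count
    } else {
      ids.remove(blockId);   // no-op if the block was never counted
    }
  }

  /**
   * Called when a block goes away (for example, its file is deleted) and the
   * previous replica count is not available. Membership in the set answers
   * the question "was this block counted before?".
   */
  synchronized void remove(long blockId) {
    ids.remove(blockId);
  }

  /** Value that the metric would report. */
  synchronized long count() {
    return ids.size();
  }
}
```

The trade-off is memory: one boxed long per one-replica block, instead of a 
single counter.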

> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 
> 1 replica
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-13658
>                 URL: https://issues.apache.org/jira/browse/HDFS-13658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.1.0
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Major
>         Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch, 
> HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch
>
>
> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 
> 1 replica. We have had many cases opened in which a customer lost a disk or a 
> DN and, as a result, lost files/blocks because they had blocks with only 1 
> replica. We need to make customers more aware of this situation and that 
> they should take action.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
