[ 
https://issues.apache.org/jira/browse/HDFS-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438209#comment-13438209
 ] 

Andy Isaacson commented on HDFS-2554:
-------------------------------------

bq. Yea, it's a bummer to do that much computation with the lock held. Have you 
looked at alternatives like keeping the stats in the CorruptReplicasMap?

I'll take another look at updating the metrics on the fly.

bq. Seems more logical if the interface is getMissingBlocks (all blocks), and 
getMissingBlocksWithRepl1 (the repl=1 count) and people who want the delta 
subtract (ditto with the metrics names, "MissingBlocks" and 
"MissingBlocksRepl1" instead of "R1" and "R2N").

Maybe I'm missing something, but this seems to be the same change you suggested 
in a comment above dated 07/Aug/12.  As I said in my response there, it seems 
much more natural to me to provide values A and B, which add to give C, than 
to provide A and C, which must be subtracted to give B.

Relatedly, the administrative actions recommended for dealing with 
missing/corrupt blocks are linked to the replication count: "Dear admin, you 
have unreplicated files with missing blocks, might want to delete them" versus 
"Dear admin, you have replicated files with missing blocks, please bring some 
DNs back online to allow file recovery".
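
As a rough illustration of the "A and B add to give C" shape, here is a 
minimal sketch. It is not the code in the attached patch: the class name, the 
method names, and the use of AtomicLong are all hypothetical, chosen only to 
show two independent counters with the total derived from them.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch only; none of these names come from HDFS.
 * Tracks missing blocks in two buckets keyed by desired replication.
 */
class MissingBlockCounts {
    // A: missing blocks whose desired replication is 1
    private final AtomicLong repl1 = new AtomicLong();
    // B: missing blocks whose desired replication is 2 or more
    private final AtomicLong replicated = new AtomicLong();

    /** Record that a block with the given desired replication went missing. */
    void blockMissing(int desiredReplication) {
        counterFor(desiredReplication).incrementAndGet();
    }

    /** Record that a previously missing block became available again. */
    void blockRecovered(int desiredReplication) {
        counterFor(desiredReplication).decrementAndGet();
    }

    private AtomicLong counterFor(int desiredReplication) {
        return desiredReplication <= 1 ? repl1 : replicated;
    }

    long getMissingRepl1() { return repl1.get(); }

    long getMissingReplicated() { return replicated.get(); }

    /** C = A + B; the total is derived rather than stored. */
    long getMissingTotal() { return repl1.get() + replicated.get(); }
}
```

With this shape each counter maps directly onto one of the two admin actions 
above, and anyone who wants the overall count adds the two values.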

bq. Nit: either pull out the metrics comment change to HDFS-3815 or update the 
javadoc comment in this change to match

Yep, thanks, I'll update the javadoc too.

bq. Needs a test

Inbound.
                
> Add separate metrics for missing blocks with desired replication level 1
> ------------------------------------------------------------------------
>
>                 Key: HDFS-2554
>                 URL: https://issues.apache.org/jira/browse/HDFS-2554
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Andy Isaacson
>            Priority: Minor
>         Attachments: hdfs-2554-1.txt, hdfs-2554.txt
>
>
> Some users use replication level set to 1 for datasets which are unimportant 
> and can be lost with no worry (eg the output of terasort tests). But other 
> data on the cluster is important and should not be lost. It would be useful 
> to separate the metric for missing blocks by the desired replication level of 
> those blocks, so that one could ignore missing blocks at repl 1 while still 
> alerting on missing blocks with higher desired replication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
