[ https://issues.apache.org/jira/browse/HDFS-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438209#comment-13438209 ]
Andy Isaacson commented on HDFS-2554:
-------------------------------------

bq. Yea, it's a bummer to do that much computation with the lock held. Have you looked at alternatives like keeping the stats in the CorruptReplicasMap?

I'll take another look at updating the metrics on the fly.

bq. Seems more logical if the interface is getMissingBlocks (all blocks), and getMissingBlocksWithRepl1 (the repl=1 count) and people who want the delta subtract (ditto with the metrics names, "MissingBlocks" and "MissingBlocksRepl1" instead of "R1" and "R2N")

Maybe I'm missing something, but this seems to be the same change you suggested in a comment above dated 07/Aug/12. I responded to it there: it seems much more natural to me to provide values A and B, which add to give C, than to provide A and C, which must be subtracted to give B.

Relatedly, the administrative actions recommended for dealing with missing/corrupt blocks are linked to the replication count: "Dear admin, you have unreplicated files with missing blocks, might want to delete them" and "Dear admin, you have replicated files with missing blocks, please bring some DNs back online to allow file recovery".

bq. Nit: either pull out the metrics comment change to HDFS-3815 or update the javadoc comment in this change to match

Yep, thanks, I'll update the javadoc too.

bq. Needs a test

Inbound.

> Add separate metrics for missing blocks with desired replication level 1
> ------------------------------------------------------------------------
>
>                 Key: HDFS-2554
>                 URL: https://issues.apache.org/jira/browse/HDFS-2554
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Andy Isaacson
>            Priority: Minor
>         Attachments: hdfs-2554-1.txt, hdfs-2554.txt
>
>
> Some users use replication level set to 1 for datasets which are unimportant
> and can be lost with no worry (eg the output of terasort tests). But other
> data on the cluster is important and should not be lost. It would be useful
> to separate the metric for missing blocks by the desired replication level of
> those blocks, so that one could ignore missing blocks at repl 1 while still
> alerting on missing blocks with higher desired replication.
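For illustration only, the "A and B which add to give C" scheme from the comment above might look roughly like the sketch below. This is not the attached patch: the class name `MissingBlockCounts`, its accessors, and the use of a plain `short[]` of desired replication levels are all hypothetical, and it assumes every desired replication value is at least 1.

```java
/**
 * Hypothetical sketch: keep two disjoint counters (repl=1 and repl>=2)
 * and derive the total by addition, rather than storing the total and
 * one counter and making callers subtract.
 */
public class MissingBlockCounts {
    private final long repl1;      // missing blocks whose desired replication is 1
    private final long replicated; // missing blocks whose desired replication is >= 2

    private MissingBlockCounts(long repl1, long replicated) {
        this.repl1 = repl1;
        this.replicated = replicated;
    }

    /**
     * Tally missing blocks by desired replication level.
     * Assumption: each entry is the desired replication (>= 1) of one missing block.
     */
    public static MissingBlockCounts tally(short[] desiredReplications) {
        long r1 = 0;
        long rn = 0;
        for (short r : desiredReplications) {
            if (r == 1) {
                r1++;
            } else {
                rn++;
            }
        }
        return new MissingBlockCounts(r1, rn);
    }

    public long getRepl1() {
        return repl1;
    }

    public long getReplicated() {
        return replicated;
    }

    /** The total is derived, so the two counters always add up to it. */
    public long getTotal() {
        return repl1 + replicated;
    }

    public static void main(String[] args) {
        // Four missing blocks: two at repl 1, one at repl 3, one at repl 2.
        MissingBlockCounts c = tally(new short[] {1, 3, 1, 2});
        System.out.println("repl1=" + c.getRepl1()
            + " replicated=" + c.getReplicated()
            + " total=" + c.getTotal());
    }
}
```

With this shape, an alert aimed at important data would watch only the replicated counter, while the repl=1 counter could be ignored or reported at a lower severity.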