[ https://issues.apache.org/jira/browse/HDFS-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435571#comment-13435571 ]
Andy Isaacson commented on HDFS-2554:
-------------------------------------

Eli,

You make a good point that c can be greater than r. Taking that into account, my definitions become:
# MissingBlocksR1: r=1, n=0, c=0 (Eli's #5)
# CorruptBlocksR1: r=1, n=0, c>0 (Eli's #2)
# CorruptBlocksRN: r>1, n=0, c>0 (Eli's #3, except c>0 rather than c>=r)
# MissingBlocksRN: r>1, n=0, c=0 (Eli's #6)

You also suggest two additional metrics, #1 and #4, which are derived from the above. Your suggested #1 is equal to CorruptBlocksRN+CorruptBlocksR1, and #4 is equal to MissingBlocksRN+MissingBlocksR1.

Given an equation in 3 variables x+y=z, I find it most natural to specify x and y and let z be the derived value, rather than specifying x and z and computing y=z-x. Therefore I think that the above definitions are the most reasonable ones, rather than providing #1 and #2 with #3 as the derived value.

Regarding CorruptBlocksRN: should the definition be c>0 or c>=r? If we have a block with replication count 3 and only 2 replicas, both of which are corrupt, I claim it should be counted in CorruptBlocksRN. Since r=3 and c=2, the predicate should be c>0 rather than c>=r.

Thoughts?

> Add separate metrics for missing blocks with desired replication level 1
> ------------------------------------------------------------------------
>
> Key: HDFS-2554
> URL: https://issues.apache.org/jira/browse/HDFS-2554
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: name-node
> Affects Versions: 2.0.0-alpha
> Reporter: Todd Lipcon
> Assignee: Andy Isaacson
> Priority: Minor
>
> Some users use replication level set to 1 for datasets which are unimportant
> and can be lost with no worry (e.g. the output of terasort tests). But other
> data on the cluster is important and should not be lost.
> It would be useful to separate the metric for missing blocks by the desired
> replication level of those blocks, so that one could ignore missing blocks
> at repl 1 while still alerting on missing blocks with higher desired
> replication.
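
The four definitions proposed in the comment can be read as a single classification over (r, n, c). The following is only a sketch of that reading, not the NameNode implementation: the function name `classify_block` is hypothetical, and the meanings of the variables are assumptions inferred from the comment (r as the desired replication level, n as the number of good replicas, c as the number of corrupt replicas).

```python
from typing import Optional


def classify_block(r: int, n: int, c: int) -> Optional[str]:
    """Return the proposed metric name for a block, or None if none applies.

    Assumed meanings (not stated explicitly in the comment):
      r: desired replication level
      n: number of good (uncorrupted) replicas
      c: number of corrupt replicas
    """
    if r < 1 or n < 0 or c < 0:
        raise ValueError("expected r >= 1, n >= 0, c >= 0")
    if n > 0:
        # All four definitions require n=0, so a block with a good
        # replica falls outside them.
        return None
    # c>0 (not c>=r) marks the block as corrupt rather than missing.
    kind = "CorruptBlocks" if c > 0 else "MissingBlocks"
    # r=1 is tracked separately from r>1.
    suffix = "R1" if r == 1 else "RN"
    return kind + suffix


# The case from the comment: replication count 3, two replicas, both corrupt.
print(classify_block(r=3, n=0, c=2))
```

Under the c>0 predicate the r=3, c=2 block is counted in CorruptBlocksRN; with c>=r it would not match any of the four definitions.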