[ 
https://issues.apache.org/jira/browse/HDFS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654899#comment-14654899
 ] 

Akira AJISAKA commented on HDFS-6682:
-------------------------------------

Thanks Allen, Andrew, and Yi for the discussion.

bq. whole queue is super busy
As Andrew suggested, recording the rate of additions to and removals from 
UnderReplicatedBlocks looks useful and straightforward to me.

bq. old ones never cleared
I agree with Yi that recording the number of pending replication blocks that 
have timed out is useful for gauging cluster health.
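
To make the two ideas above concrete, here is a minimal sketch of the counters 
I have in mind. It is plain Java and not tied to the NameNode code: the class 
name UnderReplicationCounters and all of its methods are hypothetical, and where 
they would be called from (e.g. wherever blocks enter or leave 
UnderReplicatedBlocks, or where a pending replication times out) is an 
assumption, not something a patch does today.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical counters for illustration only; not part of Hadoop.
 * Totals only ever increase, so a rate can be derived from two samples.
 */
final class UnderReplicationCounters {
    private final LongAdder added = new LongAdder();
    private final LongAdder removed = new LongAdder();
    private final LongAdder pendingTimedOut = new LongAdder();

    // Assumed call sites: a block enters / leaves the under-replicated queue.
    void blockAdded() { added.increment(); }
    void blockRemoved() { removed.increment(); }

    // Assumed call site: a pending replication exceeds its timeout.
    void pendingReplicationTimedOut() { pendingTimedOut.increment(); }

    long addedTotal() { return added.sum(); }
    long removedTotal() { return removed.sum(); }
    long pendingTimedOutTotal() { return pendingTimedOut.sum(); }

    /** Per-second rate between two samples of an increasing total. */
    static double ratePerSecond(long earlierTotal, long laterTotal,
                                long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return (laterTotal - earlierTotal) * 1000.0 / elapsedMillis;
    }
}
```

An addition rate that stays above the removal rate would indicate the "whole 
queue is super busy" case, while a growing timed-out total would indicate the 
"old ones never cleared" case.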

> Add a metric to expose the timestamp of the oldest under-replicated block
> -------------------------------------------------------------------------
>
>                 Key: HDFS-6682
>                 URL: https://issues.apache.org/jira/browse/HDFS-6682
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Akira AJISAKA
>            Assignee: Akira AJISAKA
>              Labels: metrics
>         Attachments: HDFS-6682.002.patch, HDFS-6682.003.patch, 
> HDFS-6682.004.patch, HDFS-6682.005.patch, HDFS-6682.006.patch, HDFS-6682.patch
>
>
> In the following case, data in HDFS is lost and the client needs to put 
> the same file again.
> # A client puts a file to HDFS
> # A DataNode crashes before replicating a block of the file to other DataNodes
> I propose a metric to expose the timestamp of the oldest 
> under-replicated/corrupt block. That way, the client can know which file to 
> retain for the retry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
