[ 
https://issues.apache.org/jira/browse/HDFS-12131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109935#comment-16109935
 ] 

Erik Krogen edited comment on HDFS-12131 at 8/1/17 10:47 PM:
-------------------------------------------------------------

Hey [~andrew.wang], thanks for continuing to help with this. In the v005 patch 
I added tests for {{VolumeFailuresTotal}}, {{EstimatedCapacityLostTotal}}, and 
{{DecommissioningDataNodes}}.

I did not write tests for {{NumInMaintenance(Live|Dead)DataNodes}}, 
{{NumEnteringMaintenanceDataNodes}}, or {{NumStaleStorages}}. The code that 
forces the maintenance state transitions in {{TestMaintenanceState}} relies 
heavily on functions from {{AdminStatesBaseTest}}, which I would rather not 
replicate just to test a metric output when the underlying value is already 
being tested. I also can't find existing test code that shows how to force a 
nonzero {{NumStaleStorages}}, and again I would rather not spend much effort 
testing just a metric value. I think the time would be better spent adding an 
actual test for stale storages, if one does not yet exist (I was unable to find 
one to use as an example). If you have pointers, let me know.


> Add some of the FSNamesystem JMX values as metrics
> --------------------------------------------------
>
>                 Key: HDFS-12131
>                 URL: https://issues.apache.org/jira/browse/HDFS-12131
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namenode
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>            Priority: Minor
>         Attachments: HDFS-12131.000.patch, HDFS-12131.001.patch, 
> HDFS-12131.002.patch, HDFS-12131.002.patch, HDFS-12131.003.patch, 
> HDFS-12131.004.patch, HDFS-12131.005.patch
>
>
> Several useful values are emitted via the FSNamesystem JMX, but not through 
> the metrics system. It would be useful to track these over time, e.g. to 
> alert on them via standard metrics systems or to view trends and rate 
> changes:
> * NumLiveDataNodes
> * NumDeadDataNodes
> * NumDecomLiveDataNodes
> * NumDecomDeadDataNodes
> * NumDecommissioningDataNodes
> * NumStaleStorages
> * VolumeFailuresTotal
> * EstimatedCapacityLostTotal
> * NumInMaintenanceLiveDataNodes
> * NumInMaintenanceDeadDataNodes
> * NumEnteringMaintenanceDataNodes
> This is a simple change that just requires annotating the JMX methods with 
> {{@Metric}}.
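
As a rough illustration of the idea in the description, the sketch below shows how annotating a getter can make its value visible to a collector. It does not use Hadoop: the {{Metric}} annotation, the {{FakeNamesystem}} class, the {{getUnexportedValue}} getter, the {{snapshot}} helper, and the description strings are all hypothetical stand-ins written so the example compiles on its own. In the real patch the annotation would be Hadoop's own {{@Metric}}, applied to the existing JMX getter methods.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class MetricAnnotationSketch {

  // Hypothetical stand-in for Hadoop's @Metric annotation (assumption:
  // value() holds {name, description}); defined here only so the sketch
  // is self-contained.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  public @interface Metric {
    String[] value();
  }

  // Hypothetical stand-in for a class whose getters are already exposed
  // over JMX. Only the annotated getters are picked up as metrics.
  public static class FakeNamesystem {
    private final int live;
    private final int dead;

    public FakeNamesystem(int live, int dead) {
      this.live = live;
      this.dead = dead;
    }

    @Metric({"NumLiveDataNodes", "Illustrative description: live datanodes"})
    public int getNumLiveDataNodes() {
      return live;
    }

    @Metric({"NumDeadDataNodes", "Illustrative description: dead datanodes"})
    public int getNumDeadDataNodes() {
      return dead;
    }

    // Not annotated, so the collector below ignores it.
    public int getUnexportedValue() {
      return 42;
    }
  }

  // Collects the current value of every public method carrying the
  // stand-in @Metric annotation, keyed by the metric name.
  static Map<String, Object> snapshot(Object source) {
    Map<String, Object> out = new TreeMap<>();
    for (Method m : source.getClass().getMethods()) {
      Metric metric = m.getAnnotation(Metric.class);
      if (metric == null) {
        continue;
      }
      try {
        out.put(metric.value()[0], m.invoke(source));
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Could not read " + m.getName(), e);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(snapshot(new FakeNamesystem(3, 1)));
  }
}
```

The point of the sketch is only that the getter bodies stay unchanged; the annotation is what makes a value eligible for collection, which is why the description calls the change simple.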



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

