[
https://issues.apache.org/jira/browse/HDFS-12131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109935#comment-16109935
]
Erik Krogen edited comment on HDFS-12131 at 8/1/17 10:47 PM:
-------------------------------------------------------------
Hey [~andrew.wang], thanks for continuing to help work on this. I added tests
for {{VolumeFailuresTotal}}, {{EstimatedCapacityLostTotal}}, and
{{DecommissioningDataNodes}} in the v005 patch.
I did not write tests for {{NumInMaintenance(Live|Dead)DataNodes}},
{{NumEnteringMaintenanceDataNodes}}, or {{NumStaleStorages}}. The code that
forces the maintenance state transitions in {{TestMaintenanceState}} relies
heavily on helpers from {{AdminStatesBaseTest}}, which I would rather not
duplicate just to test a metric output (when the underlying value is already
being tested). I also can't find existing test code showing how to force a
nonzero {{NumStaleStorages}}, and again would rather not spend much effort
testing just a metric value. I think the time would be better spent adding an
actual test for stale storages if one does not yet exist (I was unable to find
one to use as an example). If you have pointers, let me know.
> Add some of the FSNamesystem JMX values as metrics
> --------------------------------------------------
>
> Key: HDFS-12131
> URL: https://issues.apache.org/jira/browse/HDFS-12131
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, namenode
> Reporter: Erik Krogen
> Assignee: Erik Krogen
> Priority: Minor
> Attachments: HDFS-12131.000.patch, HDFS-12131.001.patch,
> HDFS-12131.002.patch, HDFS-12131.002.patch, HDFS-12131.003.patch,
> HDFS-12131.004.patch, HDFS-12131.005.patch
>
>
> Several useful values are exposed via the FSNamesystem JMX but not through
> the metrics system. It would be useful to track these over time, e.g. to
> alert on them via standard metrics systems or to view trends and rates of
> change:
> * NumLiveDataNodes
> * NumDeadDataNodes
> * NumDecomLiveDataNodes
> * NumDecomDeadDataNodes
> * NumDecommissioningDataNodes
> * NumStaleStorages
> * VolumeFailuresTotal
> * EstimatedCapacityLostTotal
> * NumInMaintenanceLiveDataNodes
> * NumInMaintenanceDeadDataNodes
> * NumEnteringMaintenanceDataNodes
> This is a simple change that just requires annotating the JMX methods with
> {{@Metric}}.
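To illustrate the idea of exposing an existing getter as a metric by annotating it, here is a minimal, self-contained sketch. It does not use Hadoop: the nested {{Metric}} annotation is a hypothetical stand-in for Hadoop's {{@Metric}}, and {{FakeNamesystem}} and {{snapshot}} are names invented for this example. The getter names are borrowed from the list above and the returned values are made up.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class MetricAnnotationSketch {

  /** Hypothetical stand-in for Hadoop's @Metric annotation; not the real API. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  public @interface Metric {
    String value() default "";
  }

  /** Hypothetical bean exposing a few of the getters named in the issue. */
  public static class FakeNamesystem {
    @Metric("Number of live DataNodes")
    public int getNumLiveDataNodes() { return 3; }

    @Metric("Number of dead DataNodes")
    public int getNumDeadDataNodes() { return 1; }

    // Not annotated, so the collector below does not report it.
    public int getNumStaleStorages() { return 0; }
  }

  /** Collects name -> value for every public no-arg method annotated with @Metric. */
  public static Map<String, Object> snapshot(Object source) throws Exception {
    Map<String, Object> out = new TreeMap<>();
    for (Method m : source.getClass().getMethods()) {
      if (m.isAnnotationPresent(Metric.class) && m.getParameterCount() == 0) {
        String name = m.getName();
        // Strip the "get" prefix so the metric is named after the property.
        if (name.startsWith("get")) {
          name = name.substring(3);
        }
        out.put(name, m.invoke(source));
      }
    }
    return out;
  }

  public static void main(String[] args) throws Exception {
    // Prints {NumDeadDataNodes=1, NumLiveDataNodes=3}
    System.out.println(snapshot(new FakeNamesystem()));
  }
}
```

Only the annotated getters appear in the snapshot, which mirrors why the change described above is small: the values are already computed, and the annotation just marks them for collection.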