[ 
https://issues.apache.org/jira/browse/HDFS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-2510:
---------------------------------

    Attachment: HDFS-2510.HDFS-1623.patch

Here's a patch which addresses the issue. In addition to the provided test, I 
also tested this manually on a cluster by hitting the /jmx URL and observing 
the values shown there for the new metrics.

I implemented all the metrics above, except for the following:

bq. The difference between highest generation stamp seen from the shared edit 
log and the highest generation stamp seen from any DN

I couldn't think of any legitimate use for this. It seems to serve only as a 
proxy for the size of the pending DN message queues.

bq. It would probably also be useful to have a DN metric which somehow 
describes which active/standby NNs it's talking to, e.g. "times since last 
communicated with standby/active NNs."

Similarly, I couldn't think of anything useful an operator could get from this. 
It also doesn't help that all DN metrics are currently per-DN-daemon, not 
per-BP offer service, so it's not obvious how to get meaningful DN-side 
metrics for just a single namespace.

I'm certainly open to suggestions for other metrics that people think might be 
useful.
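
For anyone who wants a concrete picture of the three NN metrics that were 
implemented (active/standby state, pending DN message queue size, and the time 
the standby last read from the shared edit log), here is a minimal sketch 
using only the JDK's standard JMX API. It is not the code from the patch: the 
class, interface, attribute, and object names below are all hypothetical, and 
the patch itself may be wired up quite differently.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class HaMetricsSketch {

    // Hypothetical management interface; names are illustrative only
    // and are not taken from the HDFS-2510 patch.
    public interface HaStatusMXBean {
        String getHaState();                  // "active" or "standby"
        int getPendingDataNodeMessageCount(); // size of the pending DN message queues
        long getLastEditLogReadTimeMillis();  // when the standby last read shared edits
    }

    public static final class HaStatus implements HaStatusMXBean {
        private volatile boolean active = false;
        private final AtomicInteger pendingMessages = new AtomicInteger();
        private final AtomicLong lastEditLogRead = new AtomicLong();

        public void transitionToActive() { active = true; }
        public void transitionToStandby() { active = false; }
        public void setPendingDataNodeMessageCount(int n) { pendingMessages.set(n); }
        public void markEditLogRead(long timeMillis) { lastEditLogRead.set(timeMillis); }

        @Override public String getHaState() { return active ? "active" : "standby"; }
        @Override public int getPendingDataNodeMessageCount() { return pendingMessages.get(); }
        @Override public long getLastEditLogReadTimeMillis() { return lastEditLogRead.get(); }
    }

    public static void main(String[] args) throws Exception {
        // Register the bean with the platform MBean server so that any
        // JMX client can read the attributes by name.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("sketch:type=HaStatus"); // hypothetical name
        HaStatus status = new HaStatus();
        server.registerMBean(status, name);

        status.setPendingDataNodeMessageCount(3);
        status.markEditLogRead(System.currentTimeMillis());

        // Read the values back through JMX rather than through the object.
        System.out.println("HaState = " + server.getAttribute(name, "HaState"));
        System.out.println("PendingDataNodeMessageCount = "
            + server.getAttribute(name, "PendingDataNodeMessageCount"));
        System.out.println("LastEditLogReadTimeMillis = "
            + server.getAttribute(name, "LastEditLogReadTimeMillis"));
    }
}
```

Exposing the last-read time as a raw timestamp, rather than as an "age" 
computed on the NN, leaves it to the monitoring side to decide how stale is 
too stale.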
                
> Add HA-related metrics
> ----------------------
>
>                 Key: HDFS-2510
>                 URL: https://issues.apache.org/jira/browse/HDFS-2510
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-2510.HDFS-1623.patch
>
>
> Off the top of my head, I can think of:
> NN metrics:
> * A binary metric for active or standby
> * The size of the pending DN message queues
> * A timestamp for when the standby NN last read from shared edit log
> * The difference between highest generation stamp seen from the shared edit 
> log and the highest generation stamp seen from any DN
> It would probably also be useful to have a DN metric which somehow describes 
> which active/standby NNs it's talking to, e.g. "times since last communicated 
> with standby/active NNs."
> I'm sure there are others as well. Comments strongly encouraged.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
