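[jira] [Commented] (HDFS-14045) Use different metrics in DataNode to better measure latency of heartbeat/blockReports/incrementalBlockReports of Active/Standby NN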

Jiandan Yang (JIRA) Tue, 13 Nov 2018 22:36:32 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686134#comment-16686134
 ]


Jiandan Yang  commented on HDFS-14045:
--------------------------------------

Thanks [~elgoiri] for your comments.
{quote}
TestDataNodeMetrics#testNNRpcMetricsWithFederationAndHA(), 
testNNRpcMetricsWithFederation() and testNNRpcMetricsWithHA(), no need to 
extract the suffix.
{quote}
I've removed the suffix in [^HDFS-14045.009.patch].
{quote}
I'm not sure about the Unknown-Unknown behavior, if we cannot determine the 
id, we may want to just leave it as it was?
{quote}
Do you mean we should not emit the metrics when the suffix is Unknown-Unknown? 
I do not understand what you mean.
{quote}
Which unit test makes sure that HeartbeatsNumOps and HeartbeatsAvgTime are 
still showing the old values? It looks good but just to verify.
{quote}
Good suggestion. I've added verification of HeartbeatsNumOps in 
[^HDFS-14045.009.patch].

> Use different metrics in DataNode to better measure latency of 
> heartbeat/blockReports/incrementalBlockReports of Active/Standby NN
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14045
>                 URL: https://issues.apache.org/jira/browse/HDFS-14045
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Jiandan Yang 
>            Assignee: Jiandan Yang 
>            Priority: Major
>         Attachments: HDFS-14045.001.patch, HDFS-14045.002.patch, 
> HDFS-14045.003.patch, HDFS-14045.004.patch, HDFS-14045.005.patch, 
> HDFS-14045.006.patch, HDFS-14045.007.patch, HDFS-14045.008.patch, 
> HDFS-14045.009.patch
>
>
> Currently the DataNode uses the same metrics to measure the RPC latency of 
> every NameNode, but the Active and Standby usually perform differently at 
> the same time, especially in large clusters. For example, the RPC latency of 
> the Standby is very high while it is catching up on the edit log, so we may 
> misjudge the state of HDFS. Using different metrics for the Active and 
> Standby helps us obtain more precise metric data.
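
To illustrate the idea in the description, here is a minimal sketch in plain Java. It is not the code from the attached patches and does not use the Hadoop metrics classes. The class name HeartbeatMetricsSketch, the "HeartbeatsFor" prefix and the "<nameserviceId>-<namenodeId>" suffix format are assumptions made for illustration only. Each heartbeat latency is recorded twice: once under the existing aggregate name, so that HeartbeatsNumOps and HeartbeatsAvgTime keep their old meaning, and once under a per-NameNode name. A missing id is replaced by "Unknown", which yields the Unknown-Unknown suffix discussed above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; NOT the DataNodeMetrics class touched by the patch. */
public class HeartbeatMetricsSketch {

  /** Running operation count and total latency for one metric name. */
  static final class Rate {
    private long numOps;
    private long totalMillis;

    synchronized void add(long millis) {
      numOps++;
      totalMillis += millis;
    }

    synchronized long numOps() {
      return numOps;
    }

    synchronized double avgTime() {
      return numOps == 0 ? 0.0 : (double) totalMillis / numOps;
    }
  }

  private final Map<String, Rate> rates = new ConcurrentHashMap<>();

  /** Assumed suffix format: "<nameserviceId>-<namenodeId>", "Unknown" for a missing id. */
  static String suffix(String nsId, String nnId) {
    return (nsId == null ? "Unknown" : nsId) + "-" + (nnId == null ? "Unknown" : nnId);
  }

  /** Records one heartbeat under the aggregate name and under a per-NameNode name. */
  void addHeartbeat(long millis, String nsId, String nnId) {
    rate("Heartbeats").add(millis);                          // aggregate, unchanged meaning
    rate("HeartbeatsFor" + suffix(nsId, nnId)).add(millis);  // hypothetical per-NN name
  }

  /** Returns the rate registered under the given name, creating it on first use. */
  Rate rate(String name) {
    return rates.computeIfAbsent(name, k -> new Rate());
  }

  public static void main(String[] args) {
    HeartbeatMetricsSketch m = new HeartbeatMetricsSketch();
    m.addHeartbeat(5, "ns1", "nn1");    // e.g. a fast reply from the Active
    m.addHeartbeat(200, "ns1", "nn2");  // e.g. a slow reply from a Standby catching up
    m.addHeartbeat(7, null, null);      // ids could not be determined
    for (String name : new String[] {"Heartbeats", "HeartbeatsForns1-nn1",
        "HeartbeatsForns1-nn2", "HeartbeatsForUnknown-Unknown"}) {
      Rate r = m.rate(name);
      System.out.println(name + " NumOps=" + r.numOps() + " AvgTime=" + r.avgTime());
    }
  }
}
```

With this layout a slow Standby raises only its own per-NameNode average, while the aggregate values still count every heartbeat, which is the property the unit test question above is about.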



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

