[ 
https://issues.apache.org/jira/browse/HDFS-12787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250168#comment-16250168
 ] 

Xiaoyu Yao commented on HDFS-12787:
-----------------------------------

Thanks [~linyiqun] for working on this. The patch looks good to me overall. 
Here are a few comments:

*TestSCMMetrics.java*

Line 49: can you add a timeout annotation for the test case?

Line 117-148: can we add 2-3 non-zero container reports to validate that the 
aggregation feature works as expected?
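
A minimal sketch of what such a check might compute, assuming each datanode's 
latest report is summed field by field. ReportStat and ReportAggregation are 
hypothetical stand-ins for illustration, not the classes in the patch:

```java
import java.util.Map;

// Hypothetical stand-in for a per-datanode container report (not the patch's class).
final class ReportStat {
  final long readBytes;
  final long writeBytes;

  ReportStat(long readBytes, long writeBytes) {
    this.readBytes = readBytes;
    this.writeBytes = writeBytes;
  }

  // Returns a new stat holding the field-wise sum of this stat and the other one.
  ReportStat add(ReportStat other) {
    return new ReportStat(readBytes + other.readBytes, writeBytes + other.writeBytes);
  }
}

// Hypothetical helper showing the aggregation a test could verify.
final class ReportAggregation {
  // Sums the latest report of every datanode into one cluster-wide stat.
  static ReportStat aggregate(Map<String, ReportStat> latestByDatanode) {
    ReportStat total = new ReportStat(0, 0);
    for (ReportStat stat : latestByDatanode.values()) {
      total = total.add(stat);
    }
    return total;
  }
}
```

With three non-zero reports, the test would then assert that each aggregated 
field equals the sum of the three inputs rather than only the last one.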

*StorageContainerManager.java*

Line 215: in addition to the aggregated metrics, can we expose the 
containerReportCache through an API and/or JSON/JMX for the per-datanode 
container IO stats? That would be very useful for cluster monitoring.

Line 318-323: Should we remove the entry only when the node is moved to 
stale/dead in the NodeManager? Expiring the entry after 2 * the container report 
interval may remove the container stats before the node is stale/dead.
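
One way to read this suggestion, sketched under assumptions: keep the cached 
per-datanode stat until a node-state callback reports the node as stale or 
dead, instead of dropping it on a timer. NodeState, ContainerStatCache, and 
onNodeStateChange are hypothetical names; the actual NodeManager hooks may 
look different:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical node states; the real NodeManager may model these differently.
enum NodeState { HEALTHY, STALE, DEAD }

// Hypothetical cache of the latest report per datanode (one counter, for brevity).
final class ContainerStatCache {
  private final Map<String, Long> latest = new ConcurrentHashMap<>();

  // Stores the newest report for a datanode, replacing any earlier one.
  void onReport(String datanodeId, long value) {
    latest.put(datanodeId, value);
  }

  // Drops the entry only when the node becomes stale or dead, not on a timer.
  void onNodeStateChange(String datanodeId, NodeState state) {
    if (state == NodeState.STALE || state == NodeState.DEAD) {
      latest.remove(datanodeId);
    }
  }

  // Reports whether a stat is still cached for the given datanode.
  boolean contains(String datanodeId) {
    return latest.containsKey(datanodeId);
  }
}
```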

Line 332-337: the logic can be simplified without the extra variable deltaStat.

Line 339: NIT “+” is not needed

Line 974: Agreed. To scale to large clusters, we have to process container 
reports asynchronously.
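
As a rough sketch of the asynchronous direction, assuming reports are handed 
to a single background thread so the RPC handler can return immediately. 
AsyncReportProcessor and its methods are hypothetical names, not part of the 
patch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical processor that moves report handling off the caller's thread.
final class AsyncReportProcessor {
  // A single worker thread keeps reports processed in submission order.
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  private final AtomicLong processedContainers = new AtomicLong();

  // Called on the handler thread: enqueues the work and returns immediately.
  void submitReport(int containerCount) {
    executor.execute(() -> processedContainers.addAndGet(containerCount));
  }

  // Stops accepting reports and waits for the queued ones to finish.
  // Returns false if the wait timed out or the thread was interrupted.
  boolean shutdownAndWait(long timeoutMillis) {
    executor.shutdown();
    try {
      return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  // Total number of containers seen in reports processed so far.
  long processedContainers() {
    return processedContainers.get();
  }
}
```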

*OzoneMetrics.md*

Line 113-119: It would be helpful to include when and where the last container 
report came from, to give more context. Otherwise, the last container 
report number won't be very useful.

> Ozone: SCM: Aggregate the metrics from all the container reports
> ----------------------------------------------------------------
>
>                 Key: HDFS-12787
>                 URL: https://issues.apache.org/jira/browse/HDFS-12787
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: metrics, ozone
>    Affects Versions: HDFS-7240
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-12787-HDFS-7240.001.patch, 
> HDFS-12787-HDFS-7240.002.patch, HDFS-12787-HDFS-7240.003.patch
>
>
> We should aggregate the metrics from all the reports of different datanodes 
> in addition to the last report. This way, we can get a global view of the 
> container I/Os over the ozone cluster. This is follow-up work to HDFS-11468.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
