[
https://issues.apache.org/jira/browse/HADOOP-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108831#comment-13108831
]
Luke Lu commented on HADOOP-7630:
---------------------------------
bq. Single simon aggregator has problem to handle aggregation load at this
scale.
That's why we use multiple aggregators for different groups/contexts of metrics.
Hadoop metrics are always sent at a 5 or 10 second period with no problems at
scale.
bq. No I mean the simon plugin. we want the gauge like metrics to be in sync at
the source (MetricsContext) as well as the plugins
Please look at the title of the jira. This is for metrics2. There is no
MetricsContext. A metrics2 plugin is a MetricsSink implementation and it only
pushes out metrics to aggregators. It doesn't do addition or average, unless I
misunderstood your sentence: "The Simon plugin is only doing add and average of
samples".
bq. This configuration has been verified to be working at 40 nodes scale. I am
sure that it would not cause any harm but reduce the potential breaking point.
A 10 second period has been verified to work at 4000 node scale. With the
current change, you're relying on zero UDP packet loss, which is OK for small
clusters. To give an example of why this is a problem: derived throughput
metrics are calculated as (counter-current - counter-last)/period, so if
you miss a few packets, you will see zero throughput in 60 second
windows, which is clearly wrong for many metrics.
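To make the arithmetic concrete, here is a minimal Python sketch of the derived
throughput calculation. It assumes the aggregator carries the last seen counter
value forward when a sample is lost; that behavior and the function name
derived_throughput are assumptions for illustration, not the actual aggregator
code.

```python
def derived_throughput(samples, period):
    """Compute (counter-current - counter-last) / period for each window.

    samples: counter values, one per reporting period; None marks a lost packet.
    period:  reporting period in seconds.
    Assumption: a lost sample is treated as "no change" (last value carried over).
    """
    rates = []
    last = samples[0]
    for sample in samples[1:]:
        current = last if sample is None else sample
        rates.append((current - last) / period)
        last = current
    return rates


# A counter that grows by 600 per 60 second period, with one lost packet.
rates = derived_throughput([0, 600, None, 1800], period=60)
print(rates)  # [10.0, 0.0, 20.0]
```

Under that assumption, the true rate is a steady 10 per second, but the lost
packet shows up as a full 60 second window of zero throughput followed by a
window that reports double the real rate. With a 10 second period, the same
single loss would distort only a 10 second window.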
There is simply no need to change the period.
In any case, make sure Rajiv knows about this (I just added Rajiv to the
watchers).
> hadoop-metrics2.properties should have a property *.period set to a default
> value for metrics
> ---------------------------------------------------------------------------------------------
>
> Key: HADOOP-7630
> URL: https://issues.apache.org/jira/browse/HADOOP-7630
> Project: Hadoop Common
> Issue Type: Bug
> Components: conf
> Reporter: Arpit Gupta
> Assignee: Eric Yang
> Fix For: 0.20.205.0, 0.23.0
>
> Attachments: HADOOP-7630-trunk.patch, HADOOP-7630.patch
>
>
> Currently the hadoop-metrics2.properties file does not have a value set for
> *.period.
> This property is useful for metrics to determine when the property will
> refresh. We should set it to a default of 60.