> On March 31, 2016, 10:36 a.m., Dmytro Sen wrote:
> > ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/TimelineMetricReadHelper.java,
> >  line 75
> > <https://reviews.apache.org/r/45518/diff/1/?file=1320249#file1320249line75>
> >
> >     sum divided by count is an average value, but not sum?

DS, that was an intentional change in this patch.

While performing aggregation from METRIC_AGGREGATE to METRIC_AGGREGATE_MINUTE, 
we simply sum all the METRIC_SUM column values in the METRIC_AGGREGATE table, 
which has no real meaning to the user for most metrics.

Consider a cluster with HBase regionservers on 2 hosts, holding 1 and 2 regions
respectively. When the user requests something like "Sum of the regionserver
count metric across hosts for the last 3 hours", we read it from the
METRIC_AGGREGATE_MINUTE table. With plain summing we would get a value of 30:
the cluster-wide value of 3 multiplied by the 10 30-second slices that
METRIC_AGGREGATE contributes to each 5-minute window. What we want instead is
the same "3", which is the sum of the regions in the cluster. So there are 2
levels of aggregation: one across timeseries (hosts) and the other across time
(downsampling). We have decided to use AVG as the default downsampling method,
and plan to make it configurable in the future.
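
To make the arithmetic above concrete, here is a minimal sketch. It is not the
code from the patch: the class and method names are hypothetical, and it
assumes every 30-second slice in the 5-minute window already holds the
cross-host sum of 3.

```java
import java.util.Arrays;

// Illustrative only: names are hypothetical and not taken from the patch.
public class DownsampleSketch {

    // Old behaviour as described above: add up the per-slice cluster sums.
    static double downsampleBySum(double[] sliceSums) {
        double total = 0;
        for (double s : sliceSums) {
            total += s;
        }
        return total;
    }

    // New default as described above: average the per-slice cluster sums.
    // An empty window yields NaN (0.0 / 0), which a real caller would guard.
    static double downsampleByAvg(double[] sliceSums) {
        return downsampleBySum(sliceSums) / sliceSums.length;
    }

    public static void main(String[] args) {
        // Level 1: aggregate across hosts (1 region + 2 regions = 3).
        double[] regionsPerHost = {1, 2};
        double clusterSum = downsampleBySum(regionsPerHost);

        // Level 2: downsample 10 thirty-second slices into one 5-minute row.
        double[] sliceSums = new double[10];
        Arrays.fill(sliceSums, clusterSum);

        System.out.println("SUM downsampling: " + downsampleBySum(sliceSums));
        System.out.println("AVG downsampling: " + downsampleByAvg(sliceSums));
    }
}
```

With these inputs the SUM variant reports 30.0 and the AVG variant reports 3.0,
which is the number the user expects for a "sum across hosts" query.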

Similarly, for the METRIC_RECORD table, we have made the same change in the
read path.


- Aravindan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45518/#review126290
-----------------------------------------------------------


On March 31, 2016, 1:36 a.m., Aravindan Vijayan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45518/
> -----------------------------------------------------------
> 
> (Updated March 31, 2016, 1:36 a.m.)
> 
> 
> Review request for Ambari, Dmytro Sen, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15638
>     https://issues.apache.org/jira/browse/AMBARI-15638
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Sum Calculation is incorrect when the time range is more than 2 hrs.
> This issue affects all metrics that are queried with the "sum" aggregator 
> (with or without hostname being specified) for more than a 2hr time-range.
> 
> 
> Diffs
> -----
> 
>   
> ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/aggregators/TimelineMetricReadHelper.java
>  846ae92 
>   
> ambari-metrics/ambari-metrics-timelineservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/metrics/timeline/query/PhoenixTransactSQL.java
>  e67a5b8 
> 
> Diff: https://reviews.apache.org/r/45518/diff/
> 
> 
> Testing
> -------
> 
> Manually tested. Unit tests pass.
> 
> 
> Thanks,
> 
> Aravindan Vijayan
> 
>
