[ https://issues.apache.org/jira/browse/KUDU-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marton Greber updated KUDU-3566:
--------------------------------
Parent: KUDU-3691
Issue Type: Sub-task (was: Bug)
> Incorrect semantics for Prometheus-style histogram metrics
> ----------------------------------------------------------
>
> Key: KUDU-3566
> URL: https://issues.apache.org/jira/browse/KUDU-3566
> Project: Kudu
> Issue Type: Sub-task
> Components: master, tserver
> Affects Versions: 1.17.0
> Reporter: Alexey Serbin
> Priority: Major
> Labels: metrics, observability
> Fix For: 1.18.0
>
>
> The original KUDU-3375 implementation incorrectly exposes the data collected
> by the corresponding HDR histograms as [histogram-type Prometheus
> metrics|https://prometheus.io/docs/concepts/metric_types/#histogram],
> although the data consists of pre-computed percentiles and therefore should
> have been reported as [summary-type
> metrics|https://prometheus.io/docs/concepts/metric_types/#summary] instead.
> For example, below are snippets from {{/metrics}} and
> {{/metrics_prometheus}} with statistics on the ListMasters RPC. In the
> Prometheus-style output, the {{le}} labels carry quantile ranks and the
> bucket values carry the latency at each quantile rather than cumulative
> observation counts.
> JSON-style:
> {noformat}
> {
> "name": "handler_latency_kudu_master_MasterService_ListMasters",
> "total_count": 26,
> "min": 152,
> "mean": 301.2692307692308,
> "percentile_75": 324,
> "percentile_95": 468,
> "percentile_99": 844,
> "percentile_99_9": 844,
> "percentile_99_99": 844,
> "max": 844,
> "total_sum": 7833
> }
> {noformat}
> Prometheus-style counterpart:
> {noformat}
> # HELP kudu_master_handler_latency_kudu_master_MasterService_ListMasters Microseconds spent handling kudu.master.MasterService.ListMasters RPC requests
> # TYPE kudu_master_handler_latency_kudu_master_MasterService_ListMasters histogram
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.75"} 324
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.95"} 468
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.99"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.999"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.9999"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="+Inf"} 26
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_sum{unit_type="microseconds"} 7833
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_count{unit_type="microseconds"} 26
> {noformat}