[
https://issues.apache.org/jira/browse/HDDS-2300?focusedWorklogId=339883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-339883
]
ASF GitHub Bot logged work on HDDS-2300:
----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Nov/19 11:15
Start Date: 07/Nov/19 11:15
Worklog Time Spent: 10m
Work Description: elek commented on pull request #127: HDDS-2300. Publish
normalized Ratis metrics via the prometheus endpoint
URL: https://github.com/apache/hadoop-ozone/pull/127
## What changes were proposed in this pull request?
The latest Ratis exposes very good metrics about the status of the Ratis ring.
After RATIS-702 it will be possible to adjust the reporter of the
Dropwizard-based Ratis metrics and export them directly to the /prom HTTP
endpoint (used by ozone insight and ratis).
Unfortunately the Dropwizard metric model is very simple and has no tag
support, so all of the instance-specific strings are part of the metric name.
For example:
```
"ratis_grpc.log_appender.72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67@group"
+ "-72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"
+ ".grpc_log_appender_follower_75fa730a-59f0-4547"
+ "-bd68-216162c263eb_latency",
```
In this patch I use a simple method: during the export of the Dropwizard
metrics, the well-known format of the Ratis metric names is used to convert
them to proper Prometheus metrics, where the instance information is
included as tags:
```
ratis_grpc.log_appender.grpc_log_appender_follower_latency{instance="72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"}
```
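As a rough illustration of the conversion described above, here is a minimal sketch that reproduces the example output from the example input. It is not the code of this patch: the class name `RatisMetricNameSketch`, the method `normalize`, and both regular expressions are hypothetical, and it assumes that the first UUID in the `<id>@group-<id>` segment is the instance and that any remaining embedded id (such as the follower id) is simply dropped, as in the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the implementation in the pull request.
class RatisMetricNameSketch {

  // Lowercase UUID, as it appears in the example metric name.
  private static final String UUID =
      "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";

  // "<instance-uuid>@group-<group-uuid>." segment embedded in the name
  // (assumed format, taken from the single example above).
  private static final Pattern INSTANCE_SEGMENT =
      Pattern.compile("(" + UUID + ")@group-" + UUID + "\\.");

  // Any remaining "_<uuid>" fragment, e.g. the follower id.
  private static final Pattern EMBEDDED_ID = Pattern.compile("_" + UUID);

  /** Moves the instance id out of the metric name and into a tag. */
  static String normalize(String dropwizardName) {
    Matcher m = INSTANCE_SEGMENT.matcher(dropwizardName);
    if (!m.find()) {
      // Not in the expected format: leave the name untouched.
      return dropwizardName;
    }
    String instance = m.group(1);
    // Remove the instance segment, then any other embedded id.
    String name = m.replaceFirst("");
    name = EMBEDDED_ID.matcher(name).replaceAll("");
    return name + "{instance=\"" + instance + "\"}";
  }

  public static void main(String[] args) {
    String raw = "ratis_grpc.log_appender.72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67@group"
        + "-72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"
        + ".grpc_log_appender_follower_75fa730a-59f0-4547"
        + "-bd68-216162c263eb_latency";
    System.out.println(normalize(raw));
  }
}
```

Names that do not contain the instance segment are returned unchanged, so non-Ratis metrics would pass through such a step unaffected.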
With this approach we can:
1. Easily monitor all the Ratis pipelines with one simple query
2. Use the metrics for ozone insight, which will show the health state of the
Ratis pipeline
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-2300
## How was this patch tested?
1. Start an ozoneperf docker-compose cluster.
2. Scale up to 3 datanodes.
3. Generate some load (e.g. freon or key upload).
4. Check the Prometheus endpoint at http://localhost:9090.
5. Check the metrics with the 'ratis_' prefix. There shouldn't be any unique ID
in the metric names.
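The last check can also be done mechanically. The sketch below is only an assumption about how one might do it: given Prometheus text output pasted in as a string (for example, copied from the /prom endpoint), it lists the `ratis_` metric names that still embed a UUID. The class `RatisMetricNameCheck` and the method `offendingNames` are hypothetical and not part of this patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for the manual check above, not part of the patch.
class RatisMetricNameCheck {

  private static final Pattern UUID = Pattern.compile(
      "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

  /** Returns the ratis_ metric names that still contain a UUID. */
  static List<String> offendingNames(String expositionText) {
    List<String> bad = new ArrayList<>();
    for (String rawLine : expositionText.split("\n")) {
      String line = rawLine.trim();
      // Skip blank lines, comment lines and non-Ratis metrics.
      if (line.isEmpty() || line.startsWith("#") || !line.startsWith("ratis_")) {
        continue;
      }
      // The metric name ends at the first '{' (tags) or ' ' (value).
      int end = line.length();
      int brace = line.indexOf('{');
      int space = line.indexOf(' ');
      if (brace >= 0) {
        end = Math.min(end, brace);
      }
      if (space >= 0) {
        end = Math.min(end, space);
      }
      String name = line.substring(0, end);
      if (UUID.matcher(name).find()) {
        bad.add(name);
      }
    }
    return bad;
  }
}
```

A UUID inside the tag part of a line is ignored on purpose, since only the metric name itself is expected to be free of unique ids.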
Issue Time Tracking
-------------------
Worklog Id: (was: 339883)
Remaining Estimate: 0h
Time Spent: 10m
> Publish normalized Ratis metrics via the prometheus endpoint
> ------------------------------------------------------------
>
> Key: HDDS-2300
> URL: https://issues.apache.org/jira/browse/HDDS-2300
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Marton Elek
> Assignee: Marton Elek
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Latest Ratis contains very good metrics about the status of the ratis ring.
> After RATIS-702 it will be possible to adjust the reporter of the Dropwizard
> based ratis metrics and export them directly to the /prom http endpoint (used
> by ozone insight and ratis).
> Unfortunately Dropwizard is very simple, there is no tag support. All of the
> instance specific strings are part of the metric name. For example:
> {code:java}
> "ratis_grpc.log_appender.72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67@group"
> + "-72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"
> + ".grpc_log_appender_follower_75fa730a-59f0-4547"
> + "-bd68-216162c263eb_latency", {code}
> In this patch I will use a simple method: during the export of the dropwizard
> metrics based on the well known format of the ratis metrics, they are
> converted to proper prometheus metrics where the instance information is
> included as tags:
> {code:java}
> ratis_grpc.log_appender.grpc_log_appender_follower_latency{instance="72caaf3a-fb1c-4da4-9cc0-a2ce21bb8e67"}
> {code}
> With this approach we can:
> 1. monitor easily all the Ratis pipelines with one simple query
> 2. Use the metrics for ozone insight which will show health state of the
> Ratis pipeline