[ 
https://issues.apache.org/jira/browse/HDDS-1811?focusedWorklogId=278765&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278765
 ]

ASF GitHub Bot logged work on HDDS-1811:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jul/19 06:40
            Start Date: 18/Jul/19 06:40
    Worklog Time Spent: 10m 
      Work Description: adoroszlai commented on pull request #1118: HDDS-1811. Prometheus metrics are broken
URL: https://github.com/apache/hadoop/pull/1118
 
 
   ## What changes were proposed in this pull request?
   
   Fix invalid metric type errors:
   
   ```
   target=http://192.168.69.76:9882/prom err="invalid metric type \"apache.hadoop.ozone.container.common.transport.server.ratis._csm_metrics_delete_container_avg_time gauge\""
   ```
   
   and
   
   ```
   target=http://scm:9876/prom err="invalid metric type \"_rati_s-_thre_e-d7116831-ac55-4bf2-a259-d85cfba0572d counter\""
   ```
   
    1. Datanode: avoid `.` in the record name by using the simple class name.
    2. SCM: replace `-` with `_`. Also convert `ALL_CAPS` names properly, e.g. `RATIS_THREE` becomes `ratis_three` instead of `_rati_s-_thre_e`.
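
The two naming rules above can be illustrated with a minimal sketch. This is not the code from the pull request: the class `MetricNameSketch`, its method names, and the sample inputs (`org.example.CSMMetrics`, `DeleteContainerAvgTime`) are hypothetical and chosen only for illustration. It relies on the Prometheus rule that metric names must match `[a-zA-Z_:][a-zA-Z0-9_:]*`, so `.` and `-` are not allowed.

```java
import java.util.Locale;

// Hypothetical helper, not the actual Ozone implementation.
final class MetricNameSketch {

  /** Keeps only the part after the last '.', e.g. a simple class name. */
  static String simpleName(String recordName) {
    return recordName.substring(recordName.lastIndexOf('.') + 1);
  }

  /** Converts CamelCase or ALL_CAPS text into lower snake_case. */
  static String normalize(String name) {
    // Replace characters that Prometheus rejects (such as '-' and '.').
    String s = name.replaceAll("[^a-zA-Z0-9_]", "_");
    // Split an acronym from the following word: "CSMMetrics" -> "CSM_Metrics".
    s = s.replaceAll("([A-Z]+)([A-Z][a-z])", "$1_$2");
    // Split at lower-case/digit to upper-case boundaries: "AvgTime" -> "Avg_Time".
    s = s.replaceAll("([a-z0-9])([A-Z])", "$1_$2");
    // Collapse repeated underscores and lower-case the result.
    return s.replaceAll("_+", "_").toLowerCase(Locale.ROOT);
  }

  /** Joins an (assumed) record name and metric name into one metric name. */
  static String prometheusName(String recordName, String metricName) {
    return normalize(simpleName(recordName) + "_" + metricName);
  }

  public static void main(String[] args) {
    // prints "csm_metrics_delete_container_avg_time"
    System.out.println(
        prometheusName("org.example.CSMMetrics", "DeleteContainerAvgTime"));
    // prints "ratis_three_d7116831_ac55_4bf2_a259_d85cfba0572d"
    System.out.println(
        normalize("RATIS_THREE-d7116831-ac55-4bf2-a259-d85cfba0572d"));
  }
}
```

Splitting only at case boundaries, rather than before every upper-case letter, is what keeps `RATIS_THREE` from turning into something like `_rati_s-_thre_e`.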
   
   https://issues.apache.org/jira/browse/HDDS-1811
   
   ## How was this patch tested?
   
   Updated unit test.
   
   Checked metrics in `ozoneperf` pseudo-cluster.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 278765)
            Time Spent: 10m
    Remaining Estimate: 0h

> Prometheus metrics are broken for datanodes due to an invalid metric
> --------------------------------------------------------------------
>
>                 Key: HDDS-1811
>                 URL: https://issues.apache.org/jira/browse/HDDS-1811
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Elek, Marton
>            Assignee: Doroszlai, Attila
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Datanodes can no longer be monitored with Prometheus:
> {code}
> level=warn ts=2019-07-16T16:29:55.876Z caller=scrape.go:937 component="scrape manager" scrape_pool=pods target=http://192.168.69.76:9882/prom msg="append failed" err="invalid metric type \"apache.hadoop.ozone.container.common.transport.server.ratis._csm_metrics_delete_container_avg_time gauge\""
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
