tonysun83 opened a new issue #3377:
URL: https://github.com/apache/couchdb/issues/3377
---
name: Formal RFC
about: Submit a formal Request For Comments for consideration by the team.
title: Support for native Prometheus Endpoints
labels: rfc, discussion
assignees: @tonysun83
---
[NOTE]: # ( ^^ Provide a general summary of the RFC in the title above. ^^ )
# Introduction
This is a formal proposal to add a `/_node/{node-name}/_metrics` endpoint
that outputs metrics data in [Prometheus](https://prometheus.io/) format.
@garrensmith and @jiangphcn began this discussion on the mailing list, and this
proposal consolidates the list of options for this new endpoint.
## Abstract
Currently, CouchDB's metrics and diagnostic information can be obtained via the
node-specific `_stats`, `_active_tasks`, and `_system` endpoints. Prometheus has
become a prevalent, widely adopted approach for exposing metrics. Demand for
exposing CouchDB metrics in Prometheus format is demonstrated by the existing
[CouchDB Prometheus
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter).
One solution is to bundle the [CouchDB Prometheus
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter) as part of
CouchDB. This requires bundling Go as part of the build, and the exporter does
not include `_system` info.
The proposed solution is to add a native module or app that receives the
`/_node/{node-name}/_metrics` call and consolidates the `_stats`,
`_active_tasks`, and `_system` calls into one response. The default format will
be JSON and an option will be provided for Prometheus.
## Requirements Language
[NOTE]: # ( Do not alter the section below. Follow its instructions. )
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in
[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
## Terminology
TBD - To Be Determined
[TIP]: # ( Provide a list of any unique terms or acronyms, and their
definitions here.)
---
# Detailed Description
1) A new module, or perhaps a new app, will be written to consolidate the
results of our `_stats`, `_active_tasks`, and `_system` internal function calls.
The exact form is TBD. The JSON format should be straightforward since it is
just an aggregation of existing calls. However, the Prometheus format requires a
more detailed implementation for conversion. See the [HTTP API](#HTTPAPI) for
return format proposals.
2) The endpoint will be node-specific and not for the entire cluster. It is
up to an external monitoring tool to aggregate and present the entire cluster's
data. This is the current design choice and is open for discussion. The exact
endpoint is:
`/_node/{node-name}/_metrics`. The endpoint will accept an optional `Accept`
header that determines whether JSON or Prometheus output is returned.
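The aggregation described in item 1 can be pictured with a small Python sketch. It is illustrative only: CouchDB itself is written in Erlang, the helper name `consolidate_metrics` is hypothetical, and the array-of-sections shape simply mirrors the JSON response example in this RFC, whose exact form is still TBD.

```python
import json


def consolidate_metrics(stats, active_tasks, system):
    """Combine the three per-node results into one JSON-serializable list.

    Hypothetical helper: the stats object comes first, then one entry per
    active task, then the system info object.
    """
    return [stats, *active_tasks, system]


# Placeholder inputs standing in for the internal _stats, _active_tasks
# and _system calls; the real payloads are much larger.
stats = {"value": {"min": 0, "max": 0}}
active_tasks = [{"type": "indexer", "database": "testdb", "changes_done": 199}]
system = {"uptime": 259}

body = json.dumps(consolidate_metrics(stats, active_tasks, system))
```

Because the result is built per node, a cluster-wide view would still require an external tool to call the endpoint on each node and merge the results.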
# Advantages and Disadvantages
**Advantages**
- Native functionality without relying on external converters.
- No need to bundle Go as part of our release (we would need to bundle Go if
we simply included [CouchDB Prometheus
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter)).
- Ability to see all metrics info via one endpoint
**Disadvantages**
- Prometheus scraping would require issuing a `_metrics` endpoint call for
every node.
- Re-implementation of functionality (`_stats`, `_active_tasks`) already
available with [CouchDB Prometheus
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter).
- Prometheus could become an obsolete standard in the future.
# Key Changes
A new node specific endpoint will be added to the API.
## Applications and Modules affected
1) chttpd
- chttpd_node
- chttpd_prometheus (new module)
2) <couchdb-prometheus> (new app perhaps)
<a name="HTTPAPI"></a>
## HTTP API additions
**GET** `/_node/{node-name}/_metrics` HTTP/1.1
Returns consolidated metrics info (`_stats`, `_active_tasks`, `_system`) in JSON
or Prometheus format.
**Request Headers:**
If no `Accept` header is provided, the default is JSON. `Accept: application/json`
will return JSON, while `Accept: prometheus` will return Prometheus formatting.
Accept:
- application/json
- prometheus
**Response Headers:**
Content-Type:
- application/json
- text/plain; charset=utf-8
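The header rules above can be sketched as a small selection function. This is a minimal Python illustration, not the proposed implementation: the name `choose_format` is hypothetical, q-values are ignored (the first recognized value wins), and falling back to JSON for unrecognized `Accept` values is an assumption that the RFC does not specify.

```python
JSON_TYPE = "application/json"
PROMETHEUS_TYPE = "text/plain; charset=utf-8"


def choose_format(accept_header=None):
    """Return (format_name, response_content_type) for an Accept header value."""
    if not accept_header:
        # No header provided: JSON is the default.
        return "json", JSON_TYPE
    # Accept may carry several comma-separated values with parameters (;q=...).
    for item in accept_header.split(","):
        token = item.split(";")[0].strip().lower()
        if token == "prometheus":
            return "prometheus", PROMETHEUS_TYPE
        if token == "application/json":
            return "json", JSON_TYPE
    # Assumption: unrecognized values fall back to the JSON default.
    return "json", JSON_TYPE
```

A request carrying `Accept: prometheus` would then be answered with `Content-Type: text/plain; charset=utf-8`, matching the response headers listed above.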
Valid Status Codes
200 OK - Request completed successfully
JSON Response Example
```
[
  {
    "value": {
      "min": 0,
      "max": 0,
      "arithmetic_mean": 0,
      "geometric_mean": 0,
      "harmonic_mean": 0,
      "median": 0,
      "variance": 0,
      "standard_deviation": 0,
      "percentile": [
        [50, 0],
        [75, 0],
        [90, 0],
        [95, 0],
        [99, 0],
        [999, 0]
      ],
      "histogram": [
        [0, 0]
      ]
    }
  },
  {
    "node": "[email protected]",
    "pid": "<0.622.0>",
    "changes_done": 199,
    "current_version_stamp": "8131141649532-0-198",
    "database": "testdb",
    "db_version_stamp": "8131141649532-0-999",
    "design_document": "_design/example",
    "started_on": 1594703583,
    "type": "indexer",
    "updated_on": 1594703586
  },
  {
    "uptime": 259,
    "memory": {
      ...
    }
  }
]
```
Sample Prometheus Output:
```
# TYPE couchdb_uptime_seconds counter
couchdb_uptime_seconds 1
# TYPE couchdb_erlang_memory_bytes gauge
couchdb_erlang_memory_bytes{memory_type="total"} 71237784
couchdb_erlang_memory_bytes{memory_type="processes"} 12248504
couchdb_erlang_memory_bytes{memory_type="processes_used"} 12235928
couchdb_erlang_memory_bytes{memory_type="system"} 58989280
couchdb_erlang_memory_bytes{memory_type="atom"} 1172689
couchdb_erlang_memory_bytes{memory_type="atom_used"} 1156575
couchdb_erlang_memory_bytes{memory_type="binary"} 182568
couchdb_erlang_memory_bytes{memory_type="code"} 27819083
couchdb_erlang_memory_bytes{memory_type="ets"} 3143536
# TYPE couchdb_erlang_gc_collections_total counter
couchdb_erlang_gc_collections_total 13417
# TYPE couchdb_erlang_gc_words_reclaimed_total counter
couchdb_erlang_gc_words_reclaimed_total 71296018
# TYPE couchdb_erlang_context_switches_total counter
couchdb_erlang_context_switches_total 358276
# TYPE couchdb_erlang_reductions_total counter
couchdb_erlang_reductions_total 46527253
# TYPE couchdb_erlang_processes gauge
couchdb_erlang_processes 528
# TYPE couchdb_erlang_process_limit gauge
couchdb_erlang_process_limit 262144
# TYPE couchdb_erlang_io_recv_bytes_total counter
couchdb_erlang_io_recv_bytes_total 23291839
# TYPE couchdb_erlang_io_sent_bytes_total counter
couchdb_erlang_io_sent_bytes_total 8915261
# TYPE couchdb_erlang_message_queues gauge
couchdb_erlang_message_queues 0
# TYPE couchdb_erlang_message_queue_min gauge
couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_read"} 4524
couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_written"} 4524
couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="missing_revisions_found"} 4524
```
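To illustrate the conversion step, the following Python sketch renders one metric family in the Prometheus text exposition format, producing lines like the memory gauges in the sample above. The helper names `escape_label_value` and `render_metric` are hypothetical and the input dictionary is a trimmed placeholder; the actual CouchDB implementation would presumably be written in Erlang.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline, as the text format requires."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, metric_type, samples):
    """Render one metric family as Prometheus text.

    samples is a list of (labels_dict, value) pairs; an empty dict means
    the sample has no labels.
    """
    lines = [f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"' for key, val in labels.items()
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Placeholder data standing in for part of the _system memory section.
memory = {"total": 71237784, "processes": 12248504}
text = render_metric(
    "couchdb_erlang_memory_bytes",
    "gauge",
    [({"memory_type": kind}, size) for kind, size in memory.items()],
)
```

Escaping label values matters for entries such as replication targets, whose URLs or names may contain quotes or backslashes.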
## HTTP API deprecations
None
# Security Considerations
N/A
# References
http://couchdb-development.1959287.n2.nabble.com/DISCUSS-Prometheus-endpoint-in-CouchDB-4-x-td7607648.html
# Acknowledgements
Summary and implementation ideas mostly from the mailing list discussion
responses.
@jiangphcn @garrensmith @janl @wohali @davisp @willholley