tonysun83 opened a new issue #3377:
URL: https://github.com/apache/couchdb/issues/3377


   ---
   name: Formal RFC
   about: Submit a formal Request For Comments for consideration by the team.
   title: Support for native Prometheus Endpoints
   labels: rfc, discussion
   assignees: @tonysun83 
   ---
   
   [NOTE]: # ( ^^ Provide a general summary of the RFC in the title above. ^^ )
   
   # Introduction
   
   This is a formal proposal to add a `/_node/{node-name}/_metrics` endpoint 
that outputs https://prometheus.io/ metrics data. @garrensmith and @jiangphcn 
began this discussion in the mailing list and this proposal consolidates the 
list of options for this new endpoint. 
   
   
   ## Abstract
   
   Currently, CouchDB's metrics and diagnostic information can be obtained via 
the node-specific `_stats`, `_active_tasks`, and `_system` endpoints. Prometheus 
has become a prevalent, widely adopted approach for exposing metrics. Demand 
for CouchDB metrics in Prometheus format is demonstrated by the existence of 
the [CouchDB Prometheus 
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter). 
   
   One solution is to bundle the [CouchDB Prometheus 
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter) as part of 
CouchDB. This would require bundling Go as part of the build, and the exporter 
does not include `_system` info.
   
   The proposed solution is to add a native module or app that receives the 
`/_node/{node-name}/_metrics` call and consolidates the `_stats`, 
`_active_tasks`, and `_system` calls into one response. The default format will 
be JSON and an option will be provided for Prometheus. 
   
   ## Requirements Language
   
   [NOTE]: # ( Do not alter the section below. Follow its instructions. )
   
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in
   [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
   
   ## Terminology
   TBD - To Be Determined
   
   [TIP]:  # ( Provide a list of any unique terms or acronyms, and their 
definitions here.)
   
   ---
   
   # Detailed Description
   
   1) A new module, or perhaps a new app, will be written to consolidate the 
results of our `_stats`, `_active_tasks`, and `_system` internal function calls. 
The exact form is TBD. The JSON format should be straightforward since it is 
just an aggregation of existing calls. The Prometheus format, however, requires 
a more detailed conversion step. See the [HTTP API](#HTTPAPI) section for 
return format proposals.
   
   2) The endpoint will be node-specific and not for the entire cluster. It's 
up to an external monitoring tool to aggregate and present the entire cluster's 
data. This is the current design choice and is open for discussion. The exact 
endpoint is:
   `/_node/{node-name}/_metrics`. The endpoint will accept an optional `Accept` 
header that determines whether JSON or Prometheus output is returned.
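
As a rough illustration of the aggregation in point 1, here is a minimal Python sketch. The three `fetch_*` functions are hypothetical stand-ins for the node-local calls behind `_stats`, `_active_tasks`, and `_system`, and the keyed layout of the result is an assumption, since the exact form is still TBD.

```python
import json

# Hypothetical stand-ins for the internal calls; values are illustrative only.
def fetch_stats():
    return {"couchdb": {"httpd": {"requests": {"value": 0, "type": "counter"}}}}

def fetch_active_tasks():
    return [{"type": "indexer", "database": "testdb", "changes_done": 199}]

def fetch_system():
    return {"uptime": 259, "memory": {"total": 71237784}}

def consolidate_metrics():
    """Merge the three per-node sources into one JSON-serializable structure."""
    return {
        "stats": fetch_stats(),
        "active_tasks": fetch_active_tasks(),
        "system": fetch_system(),
    }

if __name__ == "__main__":
    print(json.dumps(consolidate_metrics(), indent=2))
```

Keeping the three sources under separate keys is only one option; it makes it easy for a later Prometheus conversion step to treat each source differently.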
   
   # Advantages and Disadvantages
   
   **Advantages**
   
   - Native functionality without relying on external converters.
   - No need to bundle Go as part of our release (we would need to bundle Go if 
we simply included [CouchDB Prometheus 
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter))
   - Ability to see all metrics info via one endpoint
   
   **Disadvantages** 
   - Prometheus scraping would require issuing a `_metrics` endpoint call for 
every node.
   - Re-implementation of functionality (_stats, _active_tasks) already 
available with [CouchDB Prometheus 
Exporter](https://github.com/gesellix/couchdb-prometheus-exporter)
   - Prometheus may become an obsolete standard in the future.
   
   # Key Changes
   
   A new node-specific endpoint will be added to the API.
   
   ## Applications and Modules affected
   
   1) chttpd
      - chttpd_node
      - chttpd_prometheus (new module)
   
   2) <couchdb-prometheus> (new app perhaps)
   
   <a name="HTTPAPI"></a> 
   ## HTTP API additions
   
   **GET** `/_node/{node-name}/_metrics` HTTP/1.1
   
   Returns consolidated metrics info (_stats, _active_tasks, _system) in JSON 
or Prometheus format.
   
   **Request Headers:**
   If no `Accept` header is provided, the default is JSON. An `Accept` value of 
`application/json` will return JSON, while `prometheus` will return Prometheus 
formatting.
   
   Accept:
   - application/json
   - prometheus
   
   **Response Headers:**
   
   Content-Type:
   - application/json
   - text/plain; charset=utf-8
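
The `Accept`-based selection listed above could be sketched as follows. This is a minimal Python illustration, not the proposed Erlang implementation. The function name `choose_format` is hypothetical, and falling back to JSON for unrecognized values is an assumption the RFC does not state.

```python
JSON_TYPE = "application/json"
PROMETHEUS_TYPE = "text/plain; charset=utf-8"

def choose_format(accept_header):
    """Return (format name, response Content-Type) for a request.

    accept_header is the raw Accept value, or None when the header is absent.
    """
    if accept_header is None:
        return "json", JSON_TYPE
    # Check each comma-separated entry, ignoring parameters such as q=0.5.
    for media_range in accept_header.split(","):
        token = media_range.split(";")[0].strip().lower()
        if token == "prometheus":
            return "prometheus", PROMETHEUS_TYPE
        if token == "application/json":
            return "json", JSON_TYPE
    # Assumption: anything unrecognized falls back to the JSON default.
    return "json", JSON_TYPE
```

For example, `choose_format("prometheus")` selects the Prometheus output with a `text/plain; charset=utf-8` content type, while a missing header yields JSON.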
   
   Valid Status Codes
   
   200 OK - Request completed successfully
   
   JSON Response Example
   ```
   [{
                "value": {
                        "min": 0,
                        "max": 0,
                        "arithmetic_mean": 0,
                        "geometric_mean": 0,
                        "harmonic_mean": 0,
                        "median": 0,
                        "variance": 0,
                        "standard_deviation": 0,
                        "percentile": [
                                [
                                        50,
                                        0
                                ],
                                [
                                        75,
                                        0
                                ],
                                [
                                        90,
                                        0
                                ],
                                [
                                        95,
                                        0
                                ],
                                [
                                        99,
                                        0
                                ],
                                [
                                        999,
                                        0
                                ]
                        ],
                        "histogram": [
                                [
                                        0,
                                        0
                                ]
                        ]
                }
        },
        {
                "node": "[email protected]",
                "pid": "<0.622.0>",
                "changes_done": 199,
                "current_version_stamp": "8131141649532-0-198",
                "database": "testdb",
                "db_version_stamp": "8131141649532-0-999",
                "design_document": "_design/example",
                "started_on": 1594703583,
                "type": "indexer",
                "updated_on": 1594703586
        },
        {
                "uptime": 259,
                "memory": {
                        ...
                }
        }
   ]
   ```
   
   Sample Prometheus Output:
   
   ```
   # TYPE couchdb_uptime_seconds counter
   couchdb_uptime_seconds 1
   # TYPE couchdb_erlang_memory_bytes gauge
   couchdb_erlang_memory_bytes{memory_type="total"} 71237784
   couchdb_erlang_memory_bytes{memory_type="processes"} 12248504
   couchdb_erlang_memory_bytes{memory_type="processes_used"} 12235928
   couchdb_erlang_memory_bytes{memory_type="system"} 58989280
   couchdb_erlang_memory_bytes{memory_type="atom"} 1172689
   couchdb_erlang_memory_bytes{memory_type="atom_used"} 1156575
   couchdb_erlang_memory_bytes{memory_type="binary"} 182568
   couchdb_erlang_memory_bytes{memory_type="code"} 27819083
   couchdb_erlang_memory_bytes{memory_type="ets"} 3143536
   # TYPE couchdb_erlang_gc_collections_total counter
   couchdb_erlang_gc_collections_total 13417
   # TYPE couchdb_erlang_gc_words_reclaimed_total counter
   couchdb_erlang_gc_words_reclaimed_total 71296018
   # TYPE couchdb_erlang_context_switches_total counter
   couchdb_erlang_context_switches_total 358276
   # TYPE couchdb_erlang_reductions_total counter
   couchdb_erlang_reductions_total 46527253
   # TYPE couchdb_erlang_processes gauge
   couchdb_erlang_processes 528
   # TYPE couchdb_erlang_process_limit gauge
   couchdb_erlang_process_limit 262144
   # TYPE couchdb_erlang_io_recv_bytes_total counter
   couchdb_erlang_io_recv_bytes_total 23291839
   # TYPE couchdb_erlang_io_sent_bytes_total counter
   couchdb_erlang_io_sent_bytes_total 8915261
   # TYPE couchdb_erlang_message_queues gauge
   couchdb_erlang_message_queues 0
   # TYPE couchdb_erlang_message_queue_min gauge
   couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_read"} 4524
   couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_written"} 4524
   couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="missing_revisions_found"} 4524
   ```
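
To make the conversion step more concrete, the sketch below renders one metric family in the Prometheus text format used in the sample: a `# TYPE` line followed by one line per sample, with label values escaped for backslash, double quote, and newline. It is written in Python for illustration only. The helper names `escape_label_value` and `render_metric` are hypothetical and not part of any existing CouchDB module.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline inside a label value."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )

def render_metric(name, metric_type, samples):
    """Render one metric family as Prometheus text.

    samples is a list of (labels_dict, value) pairs; labels_dict may be empty.
    """
    lines = [f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"' for key, val in labels.items()
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Illustrative values taken from the sample output above.
    memory = {"total": 71237784, "processes": 12248504}
    samples = [({"memory_type": kind}, size) for kind, size in memory.items()]
    print(render_metric("couchdb_erlang_memory_bytes", "gauge", samples), end="")
```

Escaping matters most for active task labels such as `target`, which carry user-supplied URLs and database names.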
   
   ## HTTP API deprecations
   
   None
   
   # Security Considerations
   
   N/A
   
   # References
   
   
http://couchdb-development.1959287.n2.nabble.com/DISCUSS-Prometheus-endpoint-in-CouchDB-4-x-td7607648.html
   
   # Acknowledgements
   Summary and implementation ideas mostly from the mailing list discussion 
responses.
   
   @jiangphcn @garrensmith @janl @wohali @davisp @willholley 

