> so we should absolutely make this info available in JSON

This sounds like a good idea to me.

> we could fall back to a ?accept=prometheus option

I'm opposed to adding endpoints that supply different content-type responses via non-standard means. The CouchDB API has some examples of this through its history, and they can make using those endpoints with standard tooling somewhat painful.

A bit of quick searching suggests that the format has its own project, https://openmetrics.io/, which declares its text representation by linking back to https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format, which in turn declares a Content-Type of "text/plain; version=0.0.4". So defaulting to that, but following Joan's suggestion and switching to JSON for a supplied "Accept: application/json" in the standard way, seems like a good choice to me.

Rich

From: Jan Lehnardt <j...@apache.org>
To: dev@couchdb.apache.org
Cc: "Gesellchen, Tobias" <tobias.gesellc...@europace.de>
Date: 23/09/2020 16:42
Subject: [EXTERNAL] Re: [DISCUSS] Prometheus endpoint in CouchDB 4.x

Hi all,

a few things to consider:

1. The idea of unifying our “get runtime info about CouchDB” endpoints into one is solid, as it is always weird to make sure you know which info you get where. We see this specifically in support engagements, where it is always awkward to ask for the results of multiple endpoints.

2. This directly leads to the question about what the endpoint should be called. I feel if it is a new endpoint, we should give it a new name. _info maybe, but feel free to bike shed away.

3. Next the question about per-node and per-cluster info/metrics/activity on the endpoint. It might be convenient to be able to ask any one node about what is going on in the entire cluster, rather than just on that node, but some stats only make sense in the context of a single node. Maybe the result includes everything separated by node somehow.

4. Then the format: if this wasn’t about Prometheus and its custom format, we wouldn’t discuss any of this and would just use JSON.
Since we *do* want to target Prometheus with this, we have to talk about the format. Any of the above is useful for non-Prometheus consumers, so we should absolutely make this info available in JSON. And we can *also* send it in the Prometheus format. The “correct” HTTP way of doing this would be to use the Accept header on the new endpoint, as Joan points out, but that’s often not an option, so we could fall back to a ?accept=prometheus option. This would also leave us open to adding more formats in the future, as new standards arise.

5. That leads us to whether we want to do this. Every five or so years, new standards for these types of systems arise, and sometimes it is worth incorporating them (like we finally do with the systemd-compatible log formatter) and sometimes it is not and folks write tools to convert from our HTTP/JSON standard to whatever they need ( https://github.com/gesellix/couchdb-prometheus-exporter ).

6. We could also just bundle this exporter (although it is written in Go, which we currently don’t have as a dependency).

* * *

Personally, I think the Prometheus format is widely enough used to warrant inclusion, as long as we do it tastefully. I think a new endpoint with an additional ?accept= or similar URL-level override for the format would be a pragmatic, if not entirely *neat*, approach.

If we can build this all in Erlang, all the better; if we wanna shortcut dev time and bundle the Go project, I might be more hesitant.

On the per-node-or-per-cluster question, I don’t know enough about the Prometheus format and whether it allows us to send the equivalent of {nodes: { "node1": {…}, "node2": {…}, "node3": {…} }}, or whether it demands per-node output, in which case _active_tasks might get a bit awkward.

Best
Jan
-- 
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/

24/7 Observation for your CouchDB Instances:
https://opservatory.app

> On 22. Sep 2020, at 14:55, jiangph <jiangpeng...@hotmail.com> wrote:
>
> Hey all,
>
> We would like to add a Prometheus metrics endpoint for CouchDB and wanted to see if the community would be interested in us contributing this to CouchDB 4.x.
>
> Prometheus is a CNCF open-source project and the Prometheus metrics endpoint format is supported by many monitoring tools. Its data model is based around having a metric name which then contains a label name and a label value:
>
> <metric name>{<label name>=<label value>, ...}
>
> And it supports the Counter, Gauge, Histogram, and Summary metric types.
>
> The idea for the new Prometheus endpoint, /_metrics, would be that the endpoint is a consolidation of the _stats [1], _system [2], and _active_tasks [3] endpoints.
>
> For _stats and _system, the conversion from JSON to Prometheus-based format seems to be straightforward.
>
> JSON format:
> {
>   "value": {
>     "min": 0,
>     "max": 0,
>     "arithmetic_mean": 0,
>     "geometric_mean": 0,
>     "harmonic_mean": 0,
>     "median": 0,
>     "variance": 0,
>     "standard_deviation": 0,
>     ...
>     "percentile": [
>       [50, 0],
>       [75, 0],
>       [90, 0],
>       [95, 0],
>       [99, 0],
>       [999, 0]
>     ],
>     "histogram": [
>       [0, 0]
>     ]
>   }
> }
>
> Prometheus-based format:
>
> couchdb_stats{value="min"} 0
> couchdb_stats{value="max"} 0
> couchdb_stats{value="percentile50"} 0
> couchdb_stats{value="percentile75"} 0
> couchdb_stats{value="percentile95"} 0
>
> For _active_tasks, the change will be a bit more complicated, and some fields will be added to labels and tags.
>
> JSON format:
>
> {
>   "checkpointed_source_seq": 68585,
>   "continuous": false,
>   "doc_id": null,
>   "doc_write_failures": 0,
>   "docs_read": 4524,
>   "docs_written": 4524,
>   "missing_revisions_found": 4524,
>   "pid": "<0.1538.5>",
>   "progress": 44,
>   "replication_id": "9bc1727d74d49d9e157e260bb8bbd1d5",
>   "revisions_checked": 4524,
>   "source": "mailbox",
>   "source_seq": 154419,
>   "started_on": 1376116644,
>   "target": "http://mailsrv:5984/mailbox",
>   "type": "replication",
>   "updated_on": 1376116651
> }
>
> Prometheus-based format would look something like:
>
> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_read"} 4524
> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_written"} 4524
> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="missing_revisions_found"} 4524
>
>
> Best regards,
> Garren Smith
> Peng Hui Jiang
>
> [1] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-stats
> [2] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-system
> [3] https://docs.couchdb.org/en/latest/api/server/common.html#active-tasks
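
For illustration only, here is a rough Python sketch of the two ideas discussed in this thread: flattening a _stats-style value object into Prometheus text lines, and choosing the representation from the Accept header, defaulting to "text/plain; version=0.0.4". The function names (to_prometheus, wants_json, render_metrics) and the couchdb_stats metric name with a value label are assumptions taken from or modelled on the examples in the quoted proposal, not an actual CouchDB API, and the Accept parsing is deliberately simplistic (it ignores q-values and wildcards).

```python
import json

# Content-Type declared for the Prometheus text exposition format.
PROMETHEUS_CONTENT_TYPE = "text/plain; version=0.0.4"


def to_prometheus(stats, metric="couchdb_stats"):
    """Flatten one _stats-style value object into Prometheus text lines.

    The metric name and the "value" label are hypothetical, modelled on
    the example in the proposal.
    """
    lines = []
    for key, val in stats.items():
        if key == "percentile":
            # Percentiles arrive as [[50, 0], [75, 0], ...] pairs.
            for pct, pct_val in val:
                lines.append(f'{metric}{{value="percentile{pct}"}} {pct_val}')
        elif isinstance(val, (int, float)) and not isinstance(val, bool):
            lines.append(f'{metric}{{value="{key}"}} {val}')
        # Other nested fields (e.g. "histogram") are skipped in this sketch.
    return "\n".join(lines) + "\n"


def wants_json(accept_header):
    """Simplified Accept check: ignores q-values and wildcards."""
    media_types = [
        part.split(";")[0].strip().lower()
        for part in (accept_header or "").split(",")
    ]
    return "application/json" in media_types


def render_metrics(stats, accept_header=None):
    """Return (content_type, body) for the requested representation."""
    if wants_json(accept_header):
        return "application/json", json.dumps(stats)
    # Default to the Prometheus text format when JSON is not requested.
    return PROMETHEUS_CONTENT_TYPE, to_prometheus(stats)


sample = {
    "min": 0,
    "max": 0,
    "percentile": [[50, 0], [75, 0]],
    "histogram": [[0, 0]],
}
content_type, body = render_metrics(sample)
print(content_type)
print(body, end="")
```

Calling render_metrics(sample, "application/json") would return the same data as JSON instead. A real implementation would need proper Accept negotiation and label escaping, which this sketch leaves out.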