Hi, chiming in as maintainer of the already mentioned https://github.com/gesellix/couchdb-prometheus-exporter.
My suggestion would be to consolidate the existing endpoints first (maybe at /_metrics, because /_info sounds too informal), which would make high-frequency scrapes more efficient. The current approach of the couchdb-prometheus-exporter doesn’t feel right, because each of the stats/system/active_tasks endpoints has to be queried on every node for each scrape.

The consolidated endpoint should certainly provide JSON by default, which alone would help a lot to improve the existing Prometheus exporter. Content negotiation via the Accept header would be a nice way to serve the Prometheus-specific format. I would rather avoid the workaround with the request parameter, though.

Without much knowledge of CouchDB internals, I’d suggest yet another endpoint, /_prometheus, which would provide text/plain (Prometheus-formatted) content by default. That endpoint could internally “delegate” to the /_metrics endpoint.

I might be biased, and I know that other frameworks/tools already provide Prometheus stats out of the box, but I personally tend to keep things separated. From an operational perspective it would be great to _not_ have to co-locate CouchDB with a sidecar exporter; on the other hand, it would also be great to be able to perform upgrades or configuration changes separately.

Best
Tobias

> On 23. Sep 2020, at 22:43, Robert Samuel Newson <rnew...@apache.org> wrote:
>
> Hi,
>
> I don't see why this can't be a new endpoint (emitting the normal Prometheus
> format) that couchdb administrators can choose to enable (and leave it
> disabled by default, returning a 404).
>
> I agree with the general view that content type negotiation doesn't really
> work well in practice, and I don't much like the suggested ?accept= hack.
>
> I am old and world-weary and have seen these sorts of things come and go many
> times. Prometheus seems a fine option for now, and perhaps for a while, but
> it feels like a plugin, not core, to me.
>
> B.
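
To illustrate the delegation idea from the reply above, here is a rough Python sketch; it is not CouchDB code. It assumes a consolidated /_metrics endpoint that serves JSON by default and a separate /_prometheus endpoint that renders the same data as text. The function names, the `couchdb_` prefix on flat metrics, and the `open_databases` metric are hypothetical, and the Accept check is deliberately naive (no q-values). The content type is the one quoted further down in this thread.

```python
import json

# Content type quoted elsewhere in this thread for the Prometheus text format.
PROMETHEUS_CONTENT_TYPE = "text/plain; version=0.0.4"


def to_prometheus(metrics):
    """Render a flat {name: number} mapping as Prometheus text lines."""
    lines = [f"couchdb_{name} {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines) + "\n"


def respond(path, accept, metrics):
    """Return (status, content_type, body) for a hypothetical request.

    /_prometheus always answers in the text format and simply reuses the
    data behind /_metrics; /_metrics answers JSON unless the client asks
    for text/plain only.
    """
    if path == "/_prometheus":
        return 200, PROMETHEUS_CONTENT_TYPE, to_prometheus(metrics)
    if path == "/_metrics":
        # Naive negotiation: an assumption for this sketch, not a full
        # Accept-header parser.
        if "text/plain" in accept and "application/json" not in accept:
            return 200, PROMETHEUS_CONTENT_TYPE, to_prometheus(metrics)
        return 200, "application/json", json.dumps(metrics, sort_keys=True)
    return 404, "text/plain", "not found\n"
```

With this split, a scraper would only ever need to know /_prometheus, while other consumers keep using the JSON from /_metrics.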
>
>> On 23 Sep 2020, at 17:25, Richard Ellis <ricel...@uk.ibm.com> wrote:
>>
>>> so we should absolutely make this info available in JSON
>>
>> This sounds like a good idea to me
>>
>>> we could fall back to a ?accept=prometheus option
>>
>> I'm opposed to adding endpoints that supply different content-type
>> responses via non-standard means. The CouchDB API has some examples of
>> this through history and it can make using those endpoints with standard
>> tooling somewhat painful.
>>
>> A bit of quick searching seems to suggest that the format has its own
>> project https://openmetrics.io/ - and this declares its text
>> representation linking back to
>> https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
>> which declares a Content-Type of "text/plain; version=0.0.4" - so
>> defaulting to that, but following Joan's suggestion and switching to JSON
>> for a supplied Accept:application/json in the standard way seems like a
>> good choice to me.
>>
>> Rich
>>
>>
>>
>> From: Jan Lehnardt <j...@apache.org>
>> To: dev@couchdb.apache.org
>> Cc: "Gesellchen, Tobias" <tobias.gesellc...@europace.de>
>> Date: 23/09/2020 16:42
>> Subject: [EXTERNAL] Re: [DISCUSS] Prometheus endpoint in CouchDB 4.x
>>
>>
>>
>> Hi all,
>>
>> a few things to consider:
>>
>> 1. The idea of unifying our “get runtime info about CouchDB” endpoints
>> into one is solid, as it is always weird to make sure you know which info
>> you get where. We see this specifically in support engagements, where it
>> is always awkward to ask for the results of multiple endpoints.
>>
>> 2. This directly leads to the question about what the endpoint should be
>> called. I feel if it is a new endpoint, we should give it a new name.
>> _info maybe, but feel free to bike shed away.
>>
>> 3. Next the question about per-node and per-cluster info/metrics/activity
>> on the endpoint.
>> It might be convenient to be able to ask any one node
>> about what is going on in the entire cluster, rather than any one node,
>> but some stats only make sense in the context of a single node. Maybe the
>> result includes everything separated by node somehow.
>>
>> 4. Then the format: if this wasn’t about Prometheus and its custom format,
>> we wouldn’t discuss any of this and just use JSON. Since we *do* want to
>> target Prometheus with this, we have to talk about the format. Any of the
>> above is useful for non-Prometheus consumers, so we should absolutely make
>> this info available in JSON. And we can *also* send it in the Prometheus
>> format. The “correct” HTTP-way of doing this would be to use the Accept
>> header on the new endpoint, as Joan points out, but that’s often not an
>> option, so we could fall back to a ?accept=prometheus option. This would
>> also leave us open to add more formats in the future, as new standards
>> arise.
>>
>> 5. That leads us to whether we want to do this. Every five or so years,
>> new standards for these types of systems arise, and sometimes it is worth
>> incorporating them (like we finally do with the SystemD compatible log
>> formatter) and sometimes it is not and folks write tools to convert from
>> our HTTP/JSON standard to whatever they need
>> (https://github.com/gesellix/couchdb-prometheus-exporter)
>>
>> 6. We could also just bundle this exporter (although it is written in Go,
>> which we currently don’t have as a dependency).
>>
>> * * *
>>
>> Personally, I think the Prometheus format is widely enough used to warrant
>> inclusion, as long as we do it tastefully. I think a new endpoint with an
>> additional ?accept= or similar URL-level override for the format would be
>> a pragmatic, if not entirely *neat* approach. If we can build this all in
>> Erlang, the better, if we wanna shortcut dev time and bundle the Go
>> project, I might be more hesitant.
>> On the per-node-or-per-cluster
>> question, I don’t know enough about the Prometheus format and whether it
>> allows us to send the equivalent of {nodes: { “node1”: {…}, “node2”: {…},
>> “node3”: {…} }}, or whether it demands per-node output, in which case
>> _active_tasks might get a bit awkward.
>>
>> Best
>> Jan
>>
>> --
>> Professional Support for Apache CouchDB:
>> https://neighbourhood.ie/couchdb-support/
>>
>> 24/7 Observation for your CouchDB Instances:
>> https://opservatory.app
>>
>>
>>> On 22. Sep 2020, at 14:55, jiangph <jiangpeng...@hotmail.com> wrote:
>>>
>>> Hey all,
>>>
>>> We would like to add a Prometheus metrics endpoint for CouchDB and
>>> wanted to see if the community would be interested in us contributing
>>> this to CouchDB 4.x.
>>>
>>> Prometheus is a CNCF open-source project and the Prometheus metrics
>>> endpoint format is supported by many monitoring tools. Its data model is
>>> based around having a metric name which then contains a label name and a
>>> label value:
>>>
>>> <metric name>{<label name>=<label value>, ...}
>>>
>>> And it supports the Counter, Gauge, Histogram, and Summary metric types.
>>>
>>> The idea for the new Prometheus endpoint, /_metrics, would be that the
>>> endpoint is a consolidation of the _stats [1], _system [2], and
>>> _active_tasks [3] endpoints.
>>>
>>> For _stats and _system, the conversion from JSON to Prometheus-based
>>> format seems to be straightforward.
>>>
>>> JSON format:
>>> {
>>>   "value": {
>>>     "min": 0,
>>>     "max": 0,
>>>     "arithmetic_mean": 0,
>>>     "geometric_mean": 0,
>>>     "harmonic_mean": 0,
>>>     "median": 0,
>>>     "variance": 0,
>>>     "standard_deviation": 0,
>>>     ...
>>>     "percentile": [
>>>       [50, 0],
>>>       [75, 0],
>>>       [90, 0],
>>>       [95, 0],
>>>       [99, 0],
>>>       [999, 0]
>>>     ],
>>>     "histogram": [
>>>       [0, 0]
>>>     ]
>>>   }
>>> }
>>>
>>> Prometheus-based format:
>>>
>>> couchdb_stats{value="min"} 0
>>> couchdb_stats{value="max"} 0
>>> couchdb_stats{value="percentile50"} 0
>>> couchdb_stats{value="percentile75"} 0
>>> couchdb_stats{value="percentile95"} 0
>>>
>>> For _active_tasks, the change will be a bit more complicated, and some
>>> fields will be added to labels and tags.
>>>
>>> JSON format:
>>>
>>> {
>>>   "checkpointed_source_seq": 68585,
>>>   "continuous": false,
>>>   "doc_id": null,
>>>   "doc_write_failures": 0,
>>>   "docs_read": 4524,
>>>   "docs_written": 4524,
>>>   "missing_revisions_found": 4524,
>>>   "pid": "<0.1538.5>",
>>>   "progress": 44,
>>>   "replication_id": "9bc1727d74d49d9e157e260bb8bbd1d5",
>>>   "revisions_checked": 4524,
>>>   "source": "mailbox",
>>>   "source_seq": 154419,
>>>   "started_on": 1376116644,
>>>   "target": "http://mailsrv:5984/mailbox",
>>>   "type": "replication",
>>>   "updated_on": 1376116651
>>> }
>>>
>>> Prometheus-based format would look something like:
>>>
>>> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_read"} 4524
>>> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_written"} 4524
>>> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="missing_revisions_found"} 4524
>>>
>>>
>>> Best regards,
>>> Garren Smith
>>> Peng Hui Jiang
>>>
>>> [1] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-stats
>>> [2] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-system
>>> [3] https://docs.couchdb.org/en/latest/api/server/common.html#active-tasks
>>>
>>
>>
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>
>
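
One detail of the _active_tasks mapping proposed in the original message may be worth sketching: in the Prometheus text format, label values such as the target URL must have backslashes, double quotes, and line feeds escaped. Below is a minimal Python sketch, assuming the label set and the `docs_count` label from the example in the proposal. The function names are hypothetical and are not part of CouchDB or the exporter.

```python
def escape_label_value(value):
    """Escape a label value for the Prometheus text format.

    Backslash is replaced first so that the backslashes introduced by the
    other two replacements are not escaped again.
    """
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def active_task_lines(
    task,
    count_fields=("docs_read", "docs_written", "missing_revisions_found"),
):
    """Turn one replication task (a dict) into one sample line per counter."""
    base_labels = {
        "type": task["type"],
        "source": task["source"],
        "target": task["target"],
    }
    lines = []
    for field in count_fields:
        # The counter name becomes a label value, as in the proposal.
        labels = dict(base_labels, docs_count=field)
        rendered = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in labels.items()
        )
        lines.append(f"couchdb_active_task{{{rendered}}} {task[field]}")
    return lines
```

Feeding it the replication task from the JSON example yields three lines, one each for docs_read, docs_written, and missing_revisions_found.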