Hi all,

Here is the PR for the Prometheus endpoint: https://github.com/apache/couchdb/pull/3416. Would love to get some eyes on it.

Thanks,
Tony

On Tue, Sep 29, 2020 at 8:52 AM Mike Rhodes <couc...@dx13.co.uk> wrote:

> For me, I think Bob's approach below, combined with ensuring the overhead
> of coding the endpoint is low by setting up a combined metrics source that
> both Prometheus and the JSON-based metrics endpoint would use, feels
> cleanest -- one internal API and formatting based on the endpoint called.
> Based on Will's comments, this would need to do the work to format things
> sensibly for Prometheus and also be tightly coupled enough to CouchDB to
> process the underlying histograms into Prometheus-consumable ones.
>
> I think this maintains CouchDB's "JSON by default" stance neatly while
> avoiding what feel like undesirable ways to specify the content type to
> work around Prometheus client or spec weaknesses. I also think that this
> works well in terms of future proofing -- for me it's cleaner to remove an
> entire (optionally enabled) endpoint than an option on an existing
> endpoint.
>
> Hopefully it is also easier from a client library perspective; is it?
>
> --
> Mike.
>
> On Wed, 23 Sep 2020, at 21:43, Robert Samuel Newson wrote:
> > Hi,
> >
> > I don't see why this can't be a new endpoint (emitting the normal
> > Prometheus format) that CouchDB administrators can choose to enable
> > (and leave it disabled by default, returning a 404).
> >
> > I agree with the general view that content type negotiation doesn't
> > really work well in practice, and I don't much like the suggested
> > ?accept= hack.
> >
> > I am old and world-weary and have seen these sorts of things come and
> > go many times. Prometheus seems a fine option for now, and perhaps for
> > a while, but it feels like a plugin, not core, to me.
> >
> > B.
> >
> > > On 23 Sep 2020, at 17:25, Richard Ellis <ricel...@uk.ibm.com> wrote:
> > >
> > >> so we should absolutely make this info available in JSON
> > >
> > > This sounds like a good idea to me.
> > >
> > >> we could fall back to a ?accept=prometheus option
> > >
> > > I'm opposed to adding endpoints that supply different content-type
> > > responses via non-standard means. The CouchDB API has some examples of
> > > this through history and it can make using those endpoints with
> > > standard tooling somewhat painful.
> > >
> > > A bit of quick searching seems to suggest that the format has its own
> > > project https://openmetrics.io/ - and this declares its text
> > > representation linking back to
> > > https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
> > > which declares a Content-Type of "text/plain; version=0.0.4" - so
> > > defaulting to that, but following Joan's suggestion and switching to
> > > JSON for a supplied Accept: application/json in the standard way,
> > > seems like a good choice to me.
> > >
> > > Rich
> > >
> > > From: Jan Lehnardt <j...@apache.org>
> > > To: dev@couchdb.apache.org
> > > Cc: "Gesellchen, Tobias" <tobias.gesellc...@europace.de>
> > > Date: 23/09/2020 16:42
> > > Subject: [EXTERNAL] Re: [DISCUSS] Prometheus endpoint in CouchDB 4.x
> > >
> > > Hi all,
> > >
> > > a few things to consider:
> > >
> > > 1. The idea of unifying our “get runtime info about CouchDB” endpoints
> > > into one is solid, as it is always weird to make sure you know which
> > > info you get where. We see this specifically in support engagements,
> > > where it is always awkward to ask for the results of multiple
> > > endpoints.
> > >
> > > 2. This directly leads to the question about what the endpoint should
> > > be called. I feel if it is a new endpoint, we should give it a new
> > > name. _info maybe, but feel free to bike shed away.
> > >
> > > 3. Next the question about per-node and per-cluster
> > > info/metrics/activity on the endpoint. It might be convenient to be
> > > able to ask any one node about what is going on in the entire cluster,
> > > rather than any one node, but some stats only make sense in the
> > > context of a single node. Maybe the result includes everything
> > > separated by node somehow.
> > >
> > > 4. Then the format: if this wasn’t about Prometheus and its custom
> > > format, we wouldn’t discuss any of this and just use JSON. Since we
> > > *do* want to target Prometheus with this, we have to talk about the
> > > format. Any of the above is useful for non-Prometheus consumers, so we
> > > should absolutely make this info available in JSON. And we can *also*
> > > send it in the Prometheus format. The “correct” HTTP-way of doing this
> > > would be to use the Accept header on the new endpoint, as Joan points
> > > out, but that’s often not an option, so we could fall back to a
> > > ?accept=prometheus option. This would also leave us open to add more
> > > formats in the future, as new standards arise.
> > >
> > > 5. That leads us to whether we want to do this. Every five or so
> > > years, new standards for these types of systems arise, and sometimes
> > > it is worth incorporating them (like we finally do with the SystemD
> > > compatible log formatter) and sometimes it is not and folks write
> > > tools to convert from our HTTP/JSON standard to whatever they need
> > > (https://github.com/gesellix/couchdb-prometheus-exporter).
> > >
> > > 6. We could also just bundle this exporter (although it is written in
> > > Go, which we currently don’t have as a dependency).
> > >
> > > * * *
> > >
> > > Personally, I think the Prometheus format is widely enough used to
> > > warrant inclusion, as long as we do it tastefully. I think a new
> > > endpoint with an additional ?accept= or similar URL-level override for
> > > the format would be a pragmatic, if not entirely *neat* approach. If
> > > we can build this all in Erlang, the better; if we wanna shortcut dev
> > > time and bundle the Go project, I might be more hesitant. On the
> > > per-node-or-per-cluster question, I don’t know enough about the
> > > Prometheus format and whether it allows us to send the equivalent of
> > > {nodes: { “node1”: {…}, “node2”: {…}, “node3”: {…} }}, or whether it
> > > demands per-node output, in which case _active_tasks might get a bit
> > > awkward.
> > >
> > > Best
> > > Jan
> > > --
> > > Professional Support for Apache CouchDB:
> > > https://neighbourhood.ie/couchdb-support/
> > >
> > > 24/7 Observation for your CouchDB Instances:
> > > https://opservatory.app
> > >
> > >> On 22. Sep 2020, at 14:55, jiangph <jiangpeng...@hotmail.com> wrote:
> > >>
> > >> Hey all,
> > >>
> > >> We would like to add a Prometheus metrics endpoint for CouchDB and
> > >> wanted to see if the community would be interested in us contributing
> > >> this to CouchDB 4.x.
> > >>
> > >> Prometheus is a CNCF open-source project and the Prometheus metrics
> > >> endpoint format is supported by many monitoring tools. Its data model
> > >> is based around having a metric name which then contains a label name
> > >> and a label value:
> > >>
> > >> <metric name>{<label name>=<label value>, ...}
> > >>
> > >> And it supports the Counter, Gauge, Histogram, and Summary metric
> > >> types.
> > >>
> > >> The idea for the new Prometheus endpoint, /_metrics, would be that the
> > >> endpoint is a consolidation of the _stats [1], _system [2], and
> > >> _active_tasks [3] endpoints.
> > >>
> > >> For _stats and _system, the conversion from JSON to Prometheus-based
> > >> format seems to be straightforward.
> > >>
> > >> JSON format:
> > >> {
> > >>   "value": {
> > >>     "min": 0,
> > >>     "max": 0,
> > >>     "arithmetic_mean": 0,
> > >>     "geometric_mean": 0,
> > >>     "harmonic_mean": 0,
> > >>     "median": 0,
> > >>     "variance": 0,
> > >>     "standard_deviation": 0,
> > >>     ...
> > >>     "percentile": [
> > >>       [50, 0],
> > >>       [75, 0],
> > >>       [90, 0],
> > >>       [95, 0],
> > >>       [99, 0],
> > >>       [999, 0]
> > >>     ],
> > >>     "histogram": [
> > >>       [0, 0]
> > >>     ]
> > >>   }
> > >> }
> > >>
> > >> Prometheus-based format:
> > >>
> > >> couchdb_stats{value="min"} 0
> > >> couchdb_stats{value="max"} 0
> > >> couchdb_stats{value="percentile50"} 0
> > >> couchdb_stats{value="percentile75"} 0
> > >> couchdb_stats{value="percentile95"} 0
> > >>
> > >> For _active_tasks, the change will be a bit more complicated, and
> > >> some fields will be added to labels and tags.
> > >>
> > >> JSON format:
> > >>
> > >> {
> > >>   "checkpointed_source_seq": 68585,
> > >>   "continuous": false,
> > >>   "doc_id": null,
> > >>   "doc_write_failures": 0,
> > >>   "docs_read": 4524,
> > >>   "docs_written": 4524,
> > >>   "missing_revisions_found": 4524,
> > >>   "pid": "<0.1538.5>",
> > >>   "progress": 44,
> > >>   "replication_id": "9bc1727d74d49d9e157e260bb8bbd1d5",
> > >>   "revisions_checked": 4524,
> > >>   "source": "mailbox",
> > >>   "source_seq": 154419,
> > >>   "started_on": 1376116644,
> > >>   "target": "http://mailsrv:5984/mailbox",
> > >>   "type": "replication",
> > >>   "updated_on": 1376116651
> > >> }
> > >>
> > >> Prometheus-based format would look something like:
> > >>
> > >> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_read"} 4524
> > >> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_written"} 4524
> > >> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="missing_revisions_found"} 4524
> > >>
> > >> Best regards,
> > >> Garren Smith
> > >> Peng Hui Jiang
> > >>
> > >> [1] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-stats
> > >> [2] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-system
> > >> [3] https://docs.couchdb.org/en/latest/api/server/common.html#active-tasks
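
For anyone following the thread, the _stats mapping proposed in the original message can be illustrated with a rough Python sketch. This is only an illustration of the flat `couchdb_stats{value="..."}` mapping from the proposal, not the code in the PR: the function name `stats_to_prometheus` and the sample input are made up for this example, the `histogram` field is ignored, and no native Prometheus histogram or summary types are produced (which, per Mike's note, a real implementation would need to handle).

```python
from numbers import Number


def stats_to_prometheus(metric, stats):
    """Flatten one CouchDB-style stats object into Prometheus text lines.

    Hypothetical helper for illustration; not a CouchDB or Prometheus API.
    """
    value = stats["value"]
    lines = []
    # Scalar summary fields (min, max, median, ...) each become one sample,
    # with the field name carried in the "value" label as in the proposal.
    for key, val in value.items():
        if isinstance(val, Number) and not isinstance(val, bool):
            lines.append(f'{metric}{{value="{key}"}} {val}')
    # Percentiles arrive as [percentile, value] pairs.
    for pct, val in value.get("percentile", []):
        lines.append(f'{metric}{{value="percentile{pct}"}} {val}')
    return lines


if __name__ == "__main__":
    # Assumed sample input, shaped like the JSON in the proposal.
    sample = {
        "value": {
            "min": 0,
            "max": 0,
            "percentile": [[50, 0], [75, 0], [95, 0]],
        }
    }
    print("\n".join(stats_to_prometheus("couchdb_stats", sample)))
```

Keeping the conversion in a single function like this fits the "one internal metrics source, formatting chosen by endpoint" idea discussed in the thread, since the JSON endpoint could serve the same input unchanged.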