Re: [DISCUSS] Prometheus endpoint in CouchDB 4.x

Tobias Gesellchen Wed, 23 Sep 2020 14:11:14 -0700

Hi,

chiming in as maintainer of the already mentioned
https://github.com/gesellix/couchdb-prometheus-exporter.


My preference would be to consolidate the existing endpoints first (maybe at 
/_metrics, because /_info sounds too informal), which would make high-frequency 
scrapes more efficient. The current approach of the CouchDB-Prometheus-Exporter 
doesn’t feel right, because every stats/system/active_tasks endpoint on every 
node needs to be queried on each scrape. The consolidated endpoint should 
certainly be able to provide JSON by default, which would already help a lot 
to improve the existing Prometheus exporter.

Content negotiation via the Accept header would be a nice way to respond with 
the Prometheus-specific format. I’d rather avoid the workaround with the 
request parameter, though. Without too much knowledge about CouchDB internals: 
I’d suggest yet another endpoint, /_prometheus, which would provide text/plain 
(Prometheus-formatted) content by default. That endpoint could internally 
“delegate” to the /_metrics endpoint.
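
To make the negotiation idea concrete, here is a minimal Python sketch (not
CouchDB code, which would be Erlang). The function names, the SUPPORTED tuple
and the fall-back-to-default behaviour are assumptions for illustration; only
the /_metrics and /_prometheus paths and their default content types come from
the suggestion above. It honours q-values but ignores partial wildcards such
as text/*.

```python
SUPPORTED = ("text/plain", "application/json")

# Default content type per endpoint, as suggested above (paths are proposals).
ENDPOINT_DEFAULTS = {
    "/_metrics": "application/json",
    "/_prometheus": "text/plain",
}


def parse_accept(header):
    """Return the media types of an Accept header, most preferred first."""
    ranges = []
    for index, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            ranges.append((-q, index, media_type))
    return [media_type for _, _, media_type in sorted(ranges)]


def choose_format(accept_header, default):
    """Pick a supported media type, falling back to the endpoint default."""
    for media_type in parse_accept(accept_header or ""):
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return default
```

For example, choose_format(None, ENDPOINT_DEFAULTS["/_prometheus"]) yields
"text/plain", while a client sending "Accept: application/json" to the same
endpoint would get JSON. A stricter server might answer 406 instead of falling
back to the default; that is a design choice this sketch does not take.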

While I might be biased, and knowing that other frameworks/tools already 
provide Prometheus stats out of the box, I personally tend to keep things 
separated. From an operational perspective it would be great to _not_ have to 
co-locate CouchDB with a sidecar exporter, but on the other hand it would also 
be great if I could perform upgrades or configuration separately.

Best
Tobias



> On 23. Sep 2020, at 22:43, Robert Samuel Newson <rnew...@apache.org> wrote:
> 
> Hi,
> 
> I don't see why this can't be a new endpoint (emitting the normal Prometheus 
> format) that couchdb administrators can choose to enable (and leave it 
> disabled by default, returning a 404).
> 
> I agree with the general view that content type negotiation doesn't really 
> work well in practice, and I don't much like the suggested ?accept= hack.
> 
> I am old and world-weary and have seen these sorts of things come and go many 
> times. Prometheus seems a fine option for now, and perhaps for a while, but 
> it feels like a plugin, not core, to me.
> 
> B.
> 
>> On 23 Sep 2020, at 17:25, Richard Ellis <ricel...@uk.ibm.com> wrote:
>> 
>>> so we should absolutely make this info available in JSON
>> 
>> This sounds like a good idea to me
>> 
>>> we could fall back to a ?accept=prometheus option
>> 
>> I'm opposed to adding endpoints that supply different content-type 
>> responses via non-standard means. The CouchDB API has some examples of 
>> this through history and it can make using those endpoints with standard 
>> tooling somewhat painful.
>> 
>> A bit of quick searching seems to suggest that the format has its own 
>> project https://openmetrics.io/ - and this declares its text 
>> representation, linking back to 
>> https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
>> which declares a Content-Type of "text/plain; version=0.0.4" - so 
>> defaulting to that, but following Joan's suggestion and switching to JSON 
>> for a supplied Accept: application/json in the standard way seems like a 
>> good choice to me.
>> 
>> Rich
>> 
>> 
>> 
>> From:   Jan Lehnardt <j...@apache.org>
>> To:     dev@couchdb.apache.org
>> Cc:     "Gesellchen, Tobias" <tobias.gesellc...@europace.de>
>> Date:   23/09/2020 16:42
>> Subject:        [EXTERNAL] Re: [DISCUSS] Prometheus endpoint in CouchDB 
>> 4.x
>> 
>> 
>> 
>> Hi all,
>> 
>> a few things to consider:
>> 
>> 1. The idea of unifying our “get runtime info about CouchDB” endpoints 
>> into one is solid, as it is always weird to make sure you know which info 
>> you get where. We see this specifically in support engagements, where it 
>> is always awkward to ask for the results of multiple endpoints.
>> 
>> 2. This directly leads to the question about what the endpoint should be 
>> called. I feel if it is a new endpoint, we should give it a new name. 
>> _info maybe, but feel free to bike shed away.
>> 
>> 3. Next the question about per-node and per-cluster info/metrics/activity 
>> on the endpoint. It might be convenient to be able to ask any one node 
>> about what is going on in the entire cluster, rather than any one node, 
>> but some stats only make sense in the context of a single node. Maybe the 
>> result includes everything separated by node somehow.
>> 
>> 4. Then the format: if this wasn’t about Prometheus and its custom format, 
>> we wouldn’t discuss any of this and just use JSON. Since we *do* want to 
>> target Prometheus with this, we have to talk about the format. Any of the 
>> above is useful for non-Prometheus consumers, so we should absolutely make 
>> this info available in JSON. And we can *also* send it in the Prometheus 
>> format. The “correct” HTTP-way of doing this would be to use the Accept 
>> header on the new endpoint, as Joan points out, but that’s often not an 
>> option, so we could fall back to a ?accept=prometheus option. This would 
>> also leave us open to add more formats in the future, as new standards 
>> arise.
>> 
>> 5. That leads us to whether we want to do this. Every five or so years, 
>> new standards for these types of systems arise, and sometimes it is worth 
>> incorporating them (like we finally do with the SystemD compatible log 
>> formatter) and sometimes it is not and folks write tools to convert from 
>> our HTTP/JSON standard to whatever they need 
>> (https://github.com/gesellix/couchdb-prometheus-exporter).
>> 
>> 6. We could also just bundle this exporter (although it is written in Go, 
>> which we currently don’t have as a dependency).
>> 
>> * * *
>> 
>> Personally, I think the Prometheus format is widely enough used to warrant 
>> inclusion, as long as we do it tastefully. I think a new endpoint with an 
>> additional ?accept= or similar URL-level override for the format would be 
>> a pragmatic, if not entirely *neat* approach. If we can build this all in 
>> Erlang, the better, if we wanna shortcut dev time and bundle the Go 
>> project, I might be more hesitant. On the per-node-or-per-cluster 
>> question, I don’t know enough about the Prometheus format and whether it 
>> allows us to send the equivalent of {nodes: { “node1”: {…}, “node2”: {…}, 
>> “node3”: {…} }}, or whether it demands per-node output, in which case 
>> _active_tasks might get a bit awkward.
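
On the per-node question: labels in the Prometheus text format are free-form
name/value pairs, so one possible shape is a single response in which every
sample carries a node label. The following Python sketch only illustrates that
shape; the function name, the "node" label and the metric name used in the
example are hypothetical, and it assumes node names contain no quotes or
backslashes.

```python
def per_node_lines(nodes, metric_prefix="couchdb"):
    """Flatten {node: {metric: value}} into one sample line per node and metric."""
    lines = []
    for node, stats in sorted(nodes.items()):
        for name, value in sorted(stats.items()):
            # The node name becomes a label instead of a nesting level.
            lines.append(f'{metric_prefix}_{name}{{node="{node}"}} {value}')
    return lines


if __name__ == "__main__":
    example = {
        "node1": {"open_databases": 5},
        "node2": {"open_databases": 3},
    }
    print("\n".join(per_node_lines(example)))
```

Whether a single node should answer for the whole cluster is a separate
question that this sketch does not address.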
>> 
>> Best
>> Jan
>> 
>> — 
>> Professional Support for Apache CouchDB:
>> https://neighbourhood.ie/couchdb-support/ 
>> 
>> 
>> 24/7 Observation for your CouchDB Instances:
>> https://opservatory.app 
>> 
>> 
>>> On 22. Sep 2020, at 14:55, jiangph <jiangpeng...@hotmail.com> wrote:
>>> 
>>> Hey all,
>>> 
>>> We would like to add a Prometheus metrics endpoint for CouchDB and 
>>> wanted to see if the community would be interested in us contributing 
>>> this to CouchDB 4.x. 
>>> 
>>> Prometheus is a CNCF open-source project and the Prometheus metrics 
>>> endpoint format is supported by many monitoring tools. Its data model is 
>>> based around a metric name that carries a set of label name/value pairs:
>>> 
>>> <metric name>{<label name>=<label value>, ...}
>>> 
>>> And it supports the Counter, Gauge, Histogram, and Summary metric types. 
>>> 
>>> The idea for the new Prometheus endpoint, /_metrics, would be that the 
>>> endpoint is a consolidation of the _stats [1], _system [2], and 
>>> _active_tasks [3] endpoints. 
>>> 
>>> For _stats and _system, the conversion from JSON to Prometheus-based 
>>> format seems to be straightforward. 
>>> 
>>> JSON format:
>>> {
>>>   "value": {
>>>     "min": 0,
>>>     "max": 0,
>>>     "arithmetic_mean": 0,
>>>     "geometric_mean": 0,
>>>     "harmonic_mean": 0,
>>>     "median": 0,
>>>     "variance": 0,
>>>     "standard_deviation": 0,
>>>     ...
>>>     "percentile": [
>>>       [50, 0],
>>>       [75, 0],
>>>       [90, 0],
>>>       [95, 0],
>>>       [99, 0],
>>>       [999, 0]
>>>     ],
>>>     "histogram": [
>>>       [0, 0]
>>>     ]
>>>   }
>>> }
>>> 
>>> Prometheus-based format:
>>> 
>>> couchdb_stats{value="min"} 0
>>> couchdb_stats{value="max"} 0
>>> couchdb_stats{value="percentile50"} 0
>>> couchdb_stats{value="percentile75"} 0
>>> couchdb_stats{value="percentile95"} 0
>>> 
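
A minimal Python sketch of the stats mapping proposed above. It is an
illustration only: the function name is hypothetical, the metric name
couchdb_stats and the "percentileNN" label values follow the example lines in
the proposal, and skipping the histogram and any non-numeric fields is an
assumption.

```python
def stats_to_prometheus(value, metric="couchdb_stats"):
    """Render the "value" object of one stat as Prometheus text lines."""
    lines = []
    for key, val in value.items():
        if key == "percentile":
            # Each entry is a [percentile, sample] pair.
            for pct, sample in val:
                lines.append(f'{metric}{{value="percentile{pct}"}} {sample}')
        elif isinstance(val, (int, float)) and not isinstance(val, bool):
            lines.append(f'{metric}{{value="{key}"}} {val}')
        # Anything else (e.g. the histogram list) is skipped in this sketch.
    return lines


if __name__ == "__main__":
    sample = {
        "min": 0,
        "max": 0,
        "percentile": [[50, 0], [75, 0], [95, 0]],
        "histogram": [[0, 0]],
    }
    print("\n".join(stats_to_prometheus(sample)))
```

Encoding the statistic as a label value mirrors the proposal; mapping the data
onto the native Summary or Histogram types would be an alternative design.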
>>> For _active_tasks, the change will be a bit more complicated, and some 
>>> fields will be added to labels and tags.
>>> 
>>> JSON format:
>>> 
>>> {
>>>  "checkpointed_source_seq": 68585,
>>>  "continuous": false,
>>>  "doc_id": null,
>>>  "doc_write_failures": 0,
>>>  "docs_read": 4524,
>>>  "docs_written": 4524,
>>>  "missing_revisions_found": 4524,
>>>  "pid": "<0.1538.5>",
>>>  "progress": 44,
>>>  "replication_id": "9bc1727d74d49d9e157e260bb8bbd1d5",
>>>  "revisions_checked": 4524,
>>>  "source": "mailbox",
>>>  "source_seq": 154419,
>>>  "started_on": 1376116644,
>>>  "target": "http://mailsrv:5984/mailbox",
>>>  "type": "replication",
>>>  "updated_on": 1376116651
>>> } 
>>> 
>>> Prometheus-based format would look something like:
>>> 
>>> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_read"} 4524
>>> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="docs_written"} 4524
>>> couchdb_active_task{type="replication", source="mailbox", target="http://mailsrv:5984/mailbox", docs_count="missing_revisions_found"} 4524
>>> 
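
The active-task mapping above could be sketched in Python roughly as follows.
The function names and the two field tuples are hypothetical; which fields
become labels and which become samples is taken from the three example lines in
the proposal. The escaping of backslash, double quote and newline in label
values follows the Prometheus text exposition format.

```python
LABEL_FIELDS = ("type", "source", "target")
COUNT_FIELDS = ("docs_read", "docs_written", "missing_revisions_found")


def escape_label(value):
    """Escape a label value for the Prometheus text format."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def active_task_to_prometheus(task, metric="couchdb_active_task"):
    """Emit one sample per counter field, with identifying fields as labels."""
    labels = [(name, task[name]) for name in LABEL_FIELDS if name in task]
    lines = []
    for field in COUNT_FIELDS:
        if field not in task:
            continue
        pairs = labels + [("docs_count", field)]
        rendered = ",".join(f'{k}="{escape_label(v)}"' for k, v in pairs)
        lines.append(f"{metric}{{{rendered}}} {task[field]}")
    return lines


if __name__ == "__main__":
    task = {
        "type": "replication",
        "source": "mailbox",
        "target": "http://mailsrv:5984/mailbox",
        "docs_read": 4524,
        "docs_written": 4524,
        "missing_revisions_found": 4524,
    }
    print("\n".join(active_task_to_prometheus(task)))
```

Fields such as pid or replication_id are left out here; adding them as labels
would create a new time series per task, which may or may not be desirable.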
>>> 
>>> Best regards,
>>> Garren Smith
>>> Peng Hui Jiang
>>> 
>>> [1] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-stats
>>> [2] https://docs.couchdb.org/en/latest/api/server/common.html#node-node-name-system
>>> [3] https://docs.couchdb.org/en/latest/api/server/common.html#active-tasks
>> 
>
