On 25.11.21 10:35, Fabian Reinartz wrote:
> The point on TSDB becoming more structured is interesting – how firm are
> these plans at this point? Any rough timelines?
I hope there will be a PoC for histograms in two or three months. It's hard to estimate how long it will take after that to get to a mature implementation that can be part of a proper Prometheus release. But that's only histograms, i.e. changing the hardcoded "every sample is a timestamped float" to a hardcoded "every sample is either a timestamped float or a timestamped histogram". My hope is that this change will teach us how to go one step further in the future and generalize the handling of structured sample data. So yeah, it's at least three steps away, and timelines are hard to predict.

> My first hunch would've been to explore integrating directly at the
> scraping layer to directly stream OpenMetrics (or a proto-equivalent) from
> there, backed by a separate, per-write-target WAL.
> This wouldn't constrain it by the currently supported storage data model
> and generally decouple the two aspects, which also seems more aligned with
> recent developments like the agent mode.
> Any thoughts on that general direction?

Yes, this would be more in line with an "agent" or "collector" model. However, it would kick in earlier in the ingestion pipeline than the current Prometheus agent (or the Grafana agent, FWIW) and would therefore need to reimplement certain parts (while the Prometheus agent, broadly simplified, just takes things away, but doesn't really change or add anything fundamental): it would obviously need a completely new WAL and the ingestion path into it. It even affects the parser, because the Prometheus 2.x parser shortcuts directly into the flat internal TSDB data model.

Ironically, the idea is similar to the very early attempt at remote write (pre 1.x), which was closer to the scraping layer. Also, prior to Prometheus 2.x, parsing was separate from flattening the data model, with the intention of enabling an easy migration to a TSDB supporting a structured data model.
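To illustrate the data-model change mentioned above (every sample being either a timestamped float or a timestamped histogram), here is a minimal sketch in Go. All type and field names are hypothetical, not the actual TSDB types; the real histogram representation would carry bucket boundaries and counts, which are elided here:

```go
package main

import "fmt"

// FloatHistogram is a hypothetical stand-in for a native histogram
// value (bucket layout and per-bucket counts elided for brevity).
type FloatHistogram struct {
	Count, Sum float64
}

// Sample generalizes the hardcoded "timestamped float": it holds a
// timestamp plus exactly one value kind – a float, or a histogram
// when H is non-nil.
type Sample struct {
	T int64           // timestamp in milliseconds
	V float64         // used when H is nil
	H *FloatHistogram // non-nil for histogram samples
}

// String renders whichever value kind the sample carries.
func (s Sample) String() string {
	if s.H != nil {
		return fmt.Sprintf("%d -> histogram(count=%g, sum=%g)", s.T, s.H.Count, s.H.Sum)
	}
	return fmt.Sprintf("%d -> %g", s.T, s.V)
}

func main() {
	// A series can now mix both sample kinds.
	samples := []Sample{
		{T: 1000, V: 42.5},
		{T: 2000, H: &FloatHistogram{Count: 10, Sum: 99.5}},
	}
	for _, s := range samples {
		fmt.Println(s)
	}
}
```

The "generalize handling of structured sample data" step would then replace the two hardcoded kinds with an open set of value types, which is exactly what today's WAL, chunk encodings, and parser shortcut are not prepared for.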
Back then, one reason not to go further down that path was the requirement to also remote-write the results of recording rules. Recording rules act on data in the TSDB and write data back to the TSDB, so they are closely tied to the TSDB's data model. In the spirit of "one day we will just enable the TSDB to handle structured data", I would have preferred to go the extra mile and convert the output of recording rules back into the data model of the exposition format (similar to how we did it, imperfectly, for federation), but the general consensus was to move remote write away from the scraping layer and closer to the TSDB layer (which might have been a key to the success of remote write).

That same reasoning is still relevant today, and it might touch on the concerns Julien has expressed: if users use Prometheus (or a Prometheus-like agent) just to collect metrics into a vendor's metrics solution, things work out just fine. But once recording (or alerting) rules come into the game, things get a bit awkward. Even if we somehow funneled the results of recording rules back into the future scrape-layer-centric remote write, it would still feel a bit like a misfit, and users might conclude it's better to stop doing rule evaluation in Prometheus and move that kind of processing into the scope of the metrics vendor (which could be a Prometheus-compatible one, which would at least keep the rules portable, but in many cases it would be a very different system).

From a pessimistic perspective, one might say this whole approach reduces Prometheus to service discovery and scraping: everything from the parser on would be new or different. As a Prometheus developer, I would prefer that users utilize a much larger part of what Prometheus offers today. I also see (and have always seen) the need for structured data (and metadata, in case that isn't implied).
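To make the coupling concrete: a recording rule reads series from the TSDB via a PromQL expression and writes the result back into the TSDB as a new series, so both its input and its output live entirely in the TSDB data model. A minimal rule file (metric and rule names are illustrative, but the syntax is standard Prometheus configuration):

```yaml
groups:
  - name: example
    rules:
      # Input (http_requests_total) and output (job:http_requests:rate5m)
      # are both flat float series in the TSDB – the rule never touches
      # the exposition format or the scrape layer.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```

This is why a scrape-layer-centric remote write would need the rule output converted back into the exposition data model (or shipped by a separate path) before it could be forwarded.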
That's why I want to evolve the internal Prometheus data model, including the one used in the TSDB, and to evolve the remote write/read protocols with it. That's an idealistic perspective, of course, and, similar to the remote-write protocol as we know it, a more pragmatic approach might be necessary to yield working results in time. But perhaps this time, designs could take the vision above into account so that later all the pieces of the puzzle can fall into place, rather than the vision moving even farther out of reach.

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] [email protected]

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/20211201122425.GS3668%40jahnn.

