On 25.11.21 10:35, Fabian Reinartz wrote:
> The point on TSDB becoming more structured is interesting – how firm are
> these plans at this point? Any rough timelines?
I hope there will be a PoC for histograms in two or three months. It's hard to estimate how long it will take after that to get to a mature implementation that can be part of a proper Prometheus release. But that's only histograms, i.e. changing the hardcoded "every sample is a timestamped float" to a hardcoded "every sample is either a timestamped float or a timestamped histogram". My hope is that this change will teach us how to go one step further in the future and generalize the handling of structured sample data. So yeah, it's at least three steps away, and timelines are hard to predict.

> My first hunch would've been to explore integrating directly at the
> scraping layer to directly stream OpenMetrics (or a proto-equivalent) from
> there, backed by a separate, per-write-target WAL.
> This wouldn't constrain it by the currently supported storage data model
> and generally decouple the two aspects, which also seems more aligned with
> recent developments like the agent mode.
> Any thoughts on that general direction?

Yes, this would be more in line with an "agent" or "collector" model. However, it would kick in earlier in the ingestion pipeline than the current Prometheus agent (or the Grafana agent, FWIW) and would therefore need to reimplement certain parts (while the Prometheus agent, broadly simplified, just takes things away, but doesn't really change or add anything fundamental): it would obviously need a completely new WAL and the ingestion path into it. It even affects the parser, because the Prometheus 2.x parser shortcuts directly into the flat internal TSDB data model.

Ironically, the idea is similar to the very early attempt at remote write (pre 1.x), which was closer to the scraping layer. Also, prior to Prometheus 2.x, parsing was separate from flattening the data model, with the intention of enabling an easy migration to a TSDB supporting a structured data model.
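To illustrate the data-model change mentioned above (every sample being either a timestamped float or a timestamped histogram), here is a minimal sketch in Go. All type and field names are hypothetical, not the actual TSDB types; the real histogram representation would carry bucket boundaries and counts, which are elided here:

```go
package main

import "fmt"

// FloatHistogram is a hypothetical stand-in for a native histogram
// value (bucket layout and per-bucket counts elided for brevity).
type FloatHistogram struct {
	Count, Sum float64
}

// Sample generalizes the hardcoded "timestamped float": it holds a
// timestamp plus exactly one value kind – a float, or a histogram
// when H is non-nil.
type Sample struct {
	T int64           // timestamp in milliseconds
	V float64         // used when H is nil
	H *FloatHistogram // non-nil for histogram samples
}

// String renders whichever value kind the sample carries.
func (s Sample) String() string {
	if s.H != nil {
		return fmt.Sprintf("%d -> histogram(count=%g, sum=%g)", s.T, s.H.Count, s.H.Sum)
	}
	return fmt.Sprintf("%d -> %g", s.T, s.V)
}

func main() {
	// A series can now mix both sample kinds.
	samples := []Sample{
		{T: 1000, V: 42.5},
		{T: 2000, H: &FloatHistogram{Count: 10, Sum: 99.5}},
	}
	for _, s := range samples {
		fmt.Println(s)
	}
}
```

The "generalize handling of structured sample data" step would then replace the two hardcoded kinds with an open set of value types, which is exactly what today's WAL, chunk encodings, and parser shortcut are not prepared for.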
Back then, one reason not to go further down that path was the requirement to also remote-write the results of recording rules. Recording rules act on data in the TSDB and write data back to the TSDB, so they are closely tied to the TSDB's data model. In the spirit of "one day we will just enable the TSDB to handle structured data", I would have preferred to go the extra mile and convert the output of recording rules back into the data model of the exposition format (similar to how we did it, imperfectly, for federation), but the general consensus was to move remote write away from the scraping layer and closer to the TSDB layer (which might have been a key to the success of remote write).

That same reasoning is still relevant today, and it might touch on the concerns Julien has expressed: if users use Prometheus (or a Prometheus-like agent) just to collect metrics into a vendor's metrics solution, things work out just fine. But once recording (or alerting) rules come into the game, things get a bit awkward. Even if we somehow funneled the results of recording rules back into the future scrape-layer-centric remote write, it would still feel a bit like a misfit, and users might conclude it's better to stop doing rule evaluation in Prometheus and move that kind of processing into the scope of the metrics vendor (which could be a Prometheus-compatible one, which would at least keep the rules portable, but in many cases it would be a very different system).

From a pessimistic perspective, one might say this whole approach reduces Prometheus to service discovery and scraping: everything from the parser on would be new or different. As a Prometheus developer, I would prefer that users utilize a much larger part of what Prometheus offers today. I also see (and have always seen) the need for structured data (and metadata, in case that isn't implied).
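To make the coupling concrete: a recording rule reads series from the TSDB via a PromQL expression and writes the result back into the TSDB as a new series, so both its input and its output live entirely in the TSDB data model. A minimal rule file (metric and rule names are illustrative, but the syntax is standard Prometheus configuration):

```yaml
groups:
  - name: example
    rules:
      # Input (http_requests_total) and output (job:http_requests:rate5m)
      # are both flat float series in the TSDB – the rule never touches
      # the exposition format or the scrape layer.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```

This is why a scrape-layer-centric remote write would need the rule output converted back into the exposition data model (or shipped by a separate path) before it could be forwarded.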
That's why I want to evolve the internal Prometheus data model, including the one used in the TSDB, and to evolve the remote write/read protocols with it. That's an idealistic perspective, of course, and, similar to the remote-write protocol as we know it, a more pragmatic approach might be necessary to yield working results in time. But perhaps this time, designs could take the vision above into account so that later all the pieces of the puzzle can fall into place, rather than the vision moving even farther out of reach.

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] [email protected]

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/20211201122425.GS3668%40jahnn.

