Hi, I have written up a short proposal for structured remote write at https://github.com/prometheus/prometheus/issues/10539. There are surely plenty of use cases and concerns around not moving to a structured protocol, but after weighing the pros and cons, I still think having such a proposal up for discussion will help Prometheus 3.x settle its TSDB data model, scaling, and so on.
That proposal is still in draft status; please take a look and leave any comments you have.

On Wednesday, December 1, 2021 at 7:24:28 AM UTC-5 [email protected] wrote:

> On 25.11.21 10:35, Fabian Reinartz wrote:
> > The point on TSDB becoming more structured is interesting – how firm are these plans at this point? Any rough timelines?
>
> I hope there will be a PoC for histograms in two or three months. It's hard to estimate how long it will take after that to get to a mature implementation that can be part of a proper Prometheus release.
>
> But that's only histograms, i.e. changing the hardcoded "every sample is a timestamped float" to a hardcoded "every sample is either a timestamped float or a timestamped histogram". My hope is that this change will teach us how we can go one step further in the future and generalize handling of structured sample data.
>
> So yeah, it's at least three steps away, and timelines are hard to predict.
>
> > My first hunch would've been to explore integrating directly at the scraping layer to directly stream OpenMetrics (or a proto-equivalent) from there, backed by a separate, per-write-target WAL. This wouldn't constrain it by the currently supported storage data model and generally decouple the two aspects, which also seems more aligned with recent developments like the agent mode. Any thoughts on that general direction?
>
> Yes, this would be more in line with an "agent" or "collector" model. However, it would kick in earlier in the ingestion pipeline than the current Prometheus agent (or Grafana agent, FWIW) and therefore would need to reimplement certain parts (while the Prometheus agent, broadly simplified, just takes things away, but doesn't really change or add anything fundamental): obviously, it would need a completely new WAL and the ingestion into it.
> It even affects the parser, because the Prometheus 2.x parser shortcuts directly into the flat internal TSDB data model.
>
> Ironically, the idea is similar to the very early attempt at remote write (pre 1.x), which was closer to the scraping layer. Also, prior to Prometheus 2.x, parsing was separate from flattening the data model, with the intention of enabling an easy migration to a TSDB supporting a structured data model.
>
> Back then, one reason not to go further down that path was the requirement to also remote-write the results of recording rules. Recording rules act on data in the TSDB and write data to the TSDB, so they are closely linked to the data model of the TSDB. In the spirit of "one day we will just enable the TSDB to handle structured data", I would have preferred to go the extra mile and convert the output of recording rules back into the data model of the exposition format (similar to how we did it (imperfectly) for federation), but the general consensus was to move remote write away from the scraping layer and closer to the TSDB layer (which might have been a key to the success of remote write).
>
> That same reasoning is still relevant today, and this might touch on the concerns Julien has expressed: if users use Prometheus (or a Prometheus-like agent) just to collect metrics into the metrics solution of a vendor, things work out just fine. But if recording (or alerting) rules come into the game, things get a bit awkward. Even if we funneled the results of recording rules back into the future scrape-layer-centric remote write somehow, it would still feel a bit like a misfit, and users might think it's better to not do rule evaluation in Prometheus anymore but move this kind of processing into the scope of the metrics vendor (which could be one that is Prometheus-compatible, which would at least keep the rules portable, but in many cases it would be a very different system).
> From a pessimistic perspective, one might say this whole approach reduces Prometheus to service discovery and scraping. Everything from the parser on will be new or different.
>
> As a Prometheus developer, I would prefer that users utilize a much larger part of what Prometheus offers today. I also see (and always have seen) the need for structured data (and metadata, in case that isn't implied). That's why I want to evolve the internal Prometheus data model, including the one used in the TSDB, and to evolve the remote write/read protocols with it.
>
> That's an idealistic perspective, of course, and similar to the remote-write protocol as we know it, a more pragmatic approach might be necessary to yield working results in time. But perhaps this time, designs could take into account the vision above so that later, all the pieces of the puzzle can fall into place rather than moving the vision even farther out of reach.
>
> --
> Björn Rabenstein
> [PGP-ID] 0x851C3DA17D748D03
> [email] [email protected]

To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/5a681a2c-3bce-4264-a349-c9ad6ac12ee8n%40googlegroups.com.

