OK - I have a proposal that could be broken up into two pieces: the first
delivering TYPE per datapoint, the second consistently and reliably
delivering HELP and UNIT once per unique metric name:
https://docs.google.com/document/d/1LY8Im8UyIBn8e3LJ2jB-MoajXkfAqW2eKzY735aYxqo/edit#heading=h.bik9uwphqy3g

Would love to get some feedback on it. Thanks for the consideration. Is
there anyone in particular I should reach out to directly for feedback?

Best,
Rob


On Tue, Jul 21, 2020 at 5:55 PM Rob Skillington <[email protected]> wrote:

> Also want to point out that with just TYPE you can do things such as know
> a metric is a histogram and then suggest using "sum(rate(...)) by (le)"
> with a one-click button in a UI, which again is significantly harder
> without that information.
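As a rough illustration of the point above, the suggestion logic enabled by TYPE could be as simple as the following sketch (the helper name and type strings are hypothetical, not from any real Prometheus codebase):

```python
# Hypothetical sketch: map a metric's TYPE to a one-click PromQL suggestion.
# The function name and type strings are illustrative, not a real API.

def suggest_query(metric_name: str, metric_type: str) -> str:
    """Return a PromQL query a UI could offer based on the metric TYPE."""
    if metric_type == "histogram":
        # Aggregate per-bucket rates by le, ready for histogram_quantile().
        return f"sum(rate({metric_name}_bucket[5m])) by (le)"
    if metric_type == "counter":
        # Counters are almost always queried as rates.
        return f"rate({metric_name}[5m])"
    # Gauges and unknown types: fall back to the raw series.
    return metric_name

print(suggest_query("http_request_duration_seconds", "histogram"))
# -> sum(rate(http_request_duration_seconds_bucket[5m])) by (le)
```

Without TYPE, the same UI would have to guess from name suffixes or sample values.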
>
> The reason it becomes important, though, is that some systems (e.g.
> StackDriver) require this schema/metric information the first time you
> record a sample. So you really want the very basics of it (at least TYPE)
> the first time you receive that sample:
>
> Defines a metric type and its schema. Once a metric descriptor is created,
>> deleting or altering it stops data collection and makes the metric type's
>> existing data unusable.
>> The following are specific rules for service defined Monitoring metric
>> descriptors:
>> type, metricKind, valueType and description fields are all required. The
>> unit field must be specified if the valueType is any of DOUBLE, INT64,
>> DISTRIBUTION.
>> Maximum of default 500 metric descriptors per service is allowed.
>> Maximum of default 10 labels per metric descriptor is allowed.
>
>
> https://cloud.google.com/monitoring/api/ref_v3/rest/v3/projects.metricDescriptors
>
> Just an example, but other systems, and definitely systems that want to do
> processing of metrics on the way in, would prefer that at the very least
> TYPE, and ideally UNIT too, be specified.
>
>
> On Tue, Jul 21, 2020 at 5:49 PM Rob Skillington <[email protected]>
> wrote:
>
>> Hey Chris,
>>
>> Apologies for the delay in responding to you.
>>
>> Yes, I think that even just TYPE would be a great first step. I am
>> working on a very small one-pager that outlines how we might get from
>> here to the future you describe.
>>
>> In terms of downstream processing, just having the TYPE on every single
>> sample would be a huge step forward, as it enables stateless processing
>> of the metric (e.g. downsampling a single individual sample and working
>> out whether counter resets need to be detected while doing so).
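To make that concrete, here is a minimal sketch (my own illustration, not a real implementation) of how knowing TYPE = counter lets a stateless downsampler handle resets using nothing but the values in hand:

```python
def downsampled_increase(values):
    """Total increase over a window of counter samples, handling resets.

    Because the TYPE is known to be a counter, a stateless processor can
    treat any drop in value as a counter reset, with no cross-request state.
    """
    total = 0.0
    prev = None
    for v in values:
        if prev is not None:
            # A drop means the counter reset; count the new value from zero.
            total += v if v < prev else v - prev
        prev = v
    return total

# Two increments of 5, a reset down to 2, then an increment of 5.
print(downsampled_increase([0, 5, 10, 2, 7]))  # -> 17.0
```

Without TYPE, the same component would have to guess from heuristics whether a drop is a reset or just a gauge moving down.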
>>
>> You can also imagine this enabling the suggestion of certain functions
>> that can be applied, e.g. auto-suggesting that rate(...) be applied,
>> without needing to analyze the actual values of a time series or rely on
>> best-effort heuristics.
>>
>> Completely agreed that solving this for UNIT and HELP is more difficult,
>> and that that information would likely be much nicer to send/store per
>> metric name rather than per time-series sample.
>>
>> I'll send out the Google doc for some comments shortly.
>>
>> The transactional approach is interesting. It could be difficult given
>> that this information can flap (e.g. it starts with some value for
>> HELP/UNIT, but a different target of the same application has a different
>> value), which means ordering is important, and dealing with transactional
>> ordering could be a hard problem. I agree that making this deterministic,
>> if possible, would be great. Maybe it could be something like a token
>> sent alongside the first remote write payload; if the continuation token
>> the receiver sees indicates that it missed some part of the stream, it
>> can do a full sync and from then on receive updates/additions
>> transactionally over the remote write stream. Just a random thought
>> though; it requires more exploration, and different solutions should be
>> listed to weigh up pros/cons/complexity/etc.
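A toy sketch of that continuation-token idea (all names hypothetical; plain sequence numbers stand in for whatever token scheme is chosen, and the full sync is modeled as state handed to the receiver rather than a real round trip):

```python
class MetadataReceiver:
    """Toy model: metadata deltas carry a sequence token; a gap in the
    sequence forces the receiver to fall back to a full sync."""

    def __init__(self):
        self.last_seq = None   # last continuation token we applied
        self.metadata = {}     # metric name -> metadata believed current
        self.full_syncs = 0    # how often we fell back to a full sync

    def receive(self, seq, delta, full_state):
        if self.last_seq is not None and seq != self.last_seq + 1:
            # We missed part of the stream: replace our view wholesale.
            self.metadata = dict(full_state)
            self.full_syncs += 1
        else:
            self.metadata.update(delta)
        self.last_seq = seq

r = MetadataReceiver()
r.receive(1, {"http_requests_total": "counter"}, {})
# Token 2 was lost in transit, so token 3 triggers a full sync.
r.receive(3, {"queue_depth": "gauge"},
          {"http_requests_total": "counter", "queue_depth": "gauge"})
print(r.full_syncs)  # -> 1
```

This makes the ordering concern above visible: correctness hinges entirely on the receiver being able to detect the gap, not on deltas arriving in order.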
>>
>> Best,
>> Rob
>>
>>
>>
>> On Thu, Jul 16, 2020 at 4:39 PM Chris Marchbanks <[email protected]>
>> wrote:
>>
>>> Hi Rob,
>>>
>>> I would also like metadata to become stateless, and view 6815
>>> <https://github.com/prometheus/prometheus/pull/6815> only as a first
>>> step, and the start of an output format. Currently, there is a
>>> work-in-progress design doc, and another topic for an upcoming dev summit, for
>>> allowing use cases where metadata needs to be in the same request as the
>>> samples.
>>>
>>> Generally, I (and some others I have talked to) don't want to send all
>>> the metadata with every sample, as that is very repetitive, especially for
>>> histograms and metrics with many series. Instead, I would like remote write
>>> requests to become transaction-based, at which point all the metadata from
>>> that scrape/transaction can be added to the metadata field introduced
>>> to the proto in 6815
>>> <https://github.com/prometheus/prometheus/pull/6815> and then each
>>> sample can be linked to a metadata entry without as much duplication. That
>>> is very broad strokes, and I am sure it will be refined or changed
>>> completely with more usage.
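In equally broad strokes, that linked-metadata shape might look something like this (field names are purely illustrative, not the actual remote write proto from 6815):

```python
# Illustrative only: a request-level metadata table that samples reference
# by index, so TYPE/UNIT/HELP are carried once per metric, not per sample.
request = {
    "metadata": [
        {"name": "http_requests_total", "type": "counter", "unit": "",
         "help": "Total HTTP requests."},
    ],
    "timeseries": [
        {"labels": {"__name__": "http_requests_total", "code": "200"},
         "metadata_ref": 0,  # index into request["metadata"]
         "samples": [(1595000000, 42.0)]},
        {"labels": {"__name__": "http_requests_total", "code": "500"},
         "metadata_ref": 0,  # shares the same entry: no duplication
         "samples": [(1595000000, 3.0)]},
    ],
}

# Every series resolves its metadata through the shared table.
for ts in request["timeseries"]:
    meta = request["metadata"][ts["metadata_ref"]]
    assert meta["type"] == "counter"
```

The point of the indirection is that a histogram with many `le` series, or a metric with high label cardinality, still carries its metadata exactly once per request.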
>>>
>>> That said, TYPE and UNIT are much smaller than metric name and help
>>> text, and I would support adding those to a linked metadata entry before
>>> remote write becomes transactional. Would that satisfy your use cases?
>>>
>>> Chris
>>>
>>> On Thu, Jul 16, 2020 at 1:43 PM Rob Skillington <[email protected]>
>>> wrote:
>>>
>>>> Typo: "community request" should be: "community contribution that
>>>> duplicates some of PR 6815"
>>>>
>>>> On Thu, Jul 16, 2020 at 3:27 PM Rob Skillington <[email protected]>
>>>> wrote:
>>>>
>>>>> Firstly: Thanks a lot for sharing the dev summit notes, they are
>>>>> greatly appreciated. Also thank you for a great PromCon!
>>>>>
>>>>> Regarding the Prometheus remote write metadata propagation consensus,
>>>>> are there any plans/projects/collaborations through which we could
>>>>> plan work on a protocol that might help others in the ecosystem offer
>>>>> the same benefits to Prometheus ecosystem projects that operate on a
>>>>> per-write-request basis (i.e. stateless processing of a write
>>>>> request)?
>>>>>
>>>>> I understand https://github.com/prometheus/prometheus/pull/6815 unblocks
>>>>> feature development on top of Prometheus for users with specific
>>>>> architectures; however, it is a non-starter for a lot of other
>>>>> projects, especially for third-party exporters to systems that are not
>>>>> owned by end users (e.g. for a remote write endpoint targeting
>>>>> StackDriver, the community is unable to change the implementation of
>>>>> StackDriver itself to cache metrics metadata or otherwise statefully
>>>>> make it available at ingestion time).
>>>>>
>>>>> Obviously I have a vested interest: as a remote write target, M3 has
>>>>> several stateless components before TSDB ingestion, and flowing the
>>>>> entire metadata set to a distributed collection of DB nodes, each
>>>>> owning a different part of the metrics space, has implications for M3
>>>>> itself too (e.g. it is non-trivial to map metric name -> DB node
>>>>> without some messy stateful cache sitting somewhere in the
>>>>> architecture, which adds operational burden for end users).
>>>>>
>>>>> I suppose what I'm asking is: are maintainers open to a community
>>>>> request that duplicates some of
>>>>> https://github.com/prometheus/prometheus/pull/6815 but sends just
>>>>> metric TYPE and UNIT per datapoint (which would need to be captured by
>>>>> the WAL if the feature is enabled) to a backend, so a sample can be
>>>>> processed correctly and statelessly, without needing a sync of a
>>>>> global set of metadata to the backend?
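For contrast with a request-level metadata table, the per-datapoint variant being asked about might look roughly like this (field names hypothetical, not the actual proto):

```python
# Illustrative only: each series carries its own TYPE and UNIT inline, so
# any stateless component can act on a write request in isolation.
timeseries = {
    "labels": {"__name__": "process_cpu_seconds_total"},
    "type": "counter",   # enough to pick rate()/reset handling downstream
    "unit": "seconds",
    "samples": [(1595000000, 12.5)],
}

# A stateless consumer needs nothing beyond the payload itself:
needs_reset_detection = timeseries["type"] == "counter"
print(needs_reset_detection)  # -> True
```

The trade-off versus the linked table is repetition on the wire in exchange for zero shared state between requests.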
>>>>>
>>>>> And if not, what are the plans here, and how can we collaborate to
>>>>> make this data useful to other consumers in the Prometheus ecosystem?
>>>>>
>>>>> Best intentions,
>>>>> Rob
>>>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/CABakzZY%3DShJyw5am7n56OGBgJ3dTNrEcCiiSFmiBc1QUYZP2Kw%40mail.gmail.com.
