On Sun, 21 Jun 2020 at 18:44, Bashar Al Rawi <[email protected]>
wrote:

> Thanks Brian for the thorough response. There are a few points that I
> would like to make.
> >> "This proposal also loses the benefits of the current format, namely
> that you can no longer drop overly-granular buckets nor can you calculate
> things like Apdex both of which are important properties."
> You can still calculate the Apdex with the new format. In fact, the
> formula would be exactly the same as on Wikipedia
> <https://en.wikipedia.org/wiki/Apdex> since we're not using cumulative
> counts. The main difference is that you will need to sum up all the buckets
> in both the tolerating and satisfied ranges (which I agree with you is a
> bit annoying). I just wanted to call out that you can calculate it.
>
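For concreteness, the Apdex calculation from non-cumulative ("range") counts can be sketched like this. This is only an illustration of the point above: the bucket ranges, counts, and the target threshold T are invented, and real bucket boundaries would rarely line up this cleanly with the Apdex target.

```python
# Sketch: Apdex from non-cumulative ("range") bucket counts.
# Counts are keyed by (lower, upper) range bounds; T is a hypothetical
# Apdex target. Per the standard definition: satisfied = latency <= T,
# tolerating = T < latency <= 4T.
def apdex(range_counts, T):
    satisfied = sum(c for (lo, hi), c in range_counts.items() if hi <= T)
    tolerating = sum(c for (lo, hi), c in range_counts.items()
                     if lo >= T and hi <= 4 * T)
    total = sum(range_counts.values())
    return (satisfied + tolerating / 2) / total

# Made-up ranges (seconds) and counts for illustration only.
counts = {(0, 0.1): 50, (0.1, 0.3): 30, (0.3, 1.2): 15, (1.2, 5.0): 5}
print(apdex(counts, T=0.3))  # -> 0.875
```

As the thread notes, the annoyance is that both the satisfied and tolerating terms require summing over several ranges, whereas cumulative counts give each term from a single series.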


> I'm not sure I understood the part about dropping overly-granular buckets.
>

The advantage of cumulative buckets is that you can drop arbitrary buckets
at scrape time without breaking the whole histogram. This is an important
tactical tool when dealing with overly granular buckets.
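To illustrate why dropping works with cumulative buckets: the remaining counts are still valid cumulative counts for a coarser histogram, so quantile estimation keeps working. Below is a simplified linear-interpolation quantile sketch in the spirit of histogram_quantile, not the real implementation, with made-up bucket boundaries and counts:

```python
# Sketch: with cumulative ("le") buckets, dropping an overly granular
# bucket leaves a valid, merely coarser histogram - the counts stay
# monotone and the remaining boundaries still bracket every observation.
# Boundaries and counts here are invented for illustration.
buckets = {0.1: 20, 0.25: 55, 0.5: 80, 1.0: 95, float("inf"): 100}

def quantile(q, cum):
    """Simplified linear-interpolation quantile estimate over cumulative
    bucket counts (a sketch of histogram_quantile's approach)."""
    rank = q * cum[float("inf")]
    prev_bound, prev_count = 0.0, 0
    for bound in sorted(cum):
        if cum[bound] >= rank:
            if bound == float("inf"):
                return prev_bound
            return prev_bound + (bound - prev_bound) * (
                (rank - prev_count) / (cum[bound] - prev_count))
        prev_bound, prev_count = bound, cum[bound]

full = quantile(0.9, buckets)
# Drop the 0.25 bucket "at scrape time": the rest is still consistent.
coarser = {b: c for b, c in buckets.items() if b != 0.25}
print(full, quantile(0.9, coarser))  # same estimate both times
```

With non-cumulative range counts there is no analogous move: removing one range loses those observations outright rather than merging them into a neighbour.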


>
> - I agree with you that this won't solve storing 100 buckets per metric.
> However, the main thing this brings is that you only store the buckets you
> actually use, as opposed to every single bucket you define. This is a big
> advantage since it reduces the number of time-series significantly.
>
> Suppose Θ(n) denotes the number of time-series defined, where n is the
> number of histogram metrics. In the current format, Θ(n) = N*n exactly,
> where N is the number of buckets defined per metric. In the new format, I
> would expect Θ(n) << N*n, because it's a function of the buckets used as
> opposed to the buckets defined. This is an important distinction.
>

This presumes that there are buckets which are defined but never used.
That's rarely the case, and it's easy to address in the application by
making a more appropriate bucket choice. Keep in mind that because buckets
are counters, it only takes a single event in the lifetime of a process for
a bucket to be present. So your proposal doesn't help with the actual
problem of how we support more used buckets, as the number of buckets used
will be basically the same before and after. This is at best a
micro-optimisation, and it would also cause issues with series not always
being exposed. We don't need a 10% improvement in histogram handling that
applies only near process start; we need a 10x improvement that works at
all times.
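The "buckets are counters" point can be made concrete with a toy model of the proposed sparse exposition (this is purely illustrative; the bounds and values are invented):

```python
# Sketch: because buckets are counters, a single event during a process's
# lifetime makes a "sparse" bucket appear, and it then never goes away.
# After a brief warm-up the set of exposed buckets converges on the set of
# buckets ever used - hence the savings apply only near process start.
sparse = {}

def observe(value, bounds):
    for lo, hi in bounds:
        if lo <= value < hi:
            sparse[(lo, hi)] = sparse.get((lo, hi), 0) + 1
            return

bounds = [(0, 0.1), (0.1, 1), (1, 10)]
observe(0.05, bounds)  # one stray fast request...
observe(5.0, bounds)   # ...and one stray slow request
# Both buckets are now present and will appear on every future scrape.
print(sorted(sparse))  # -> [(0, 0.1), (1, 10)]
```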


>
> - I'm not saying we should drop the existing format. We can support both.
> There is nothing really breaking about this besides adding a new reserved
> label. Depending on the use-case, someone could choose to create buckets
> with "le" or with "range". histogram_quantile is updated to handle both
> cases as I did in the commit.
>

That's a breaking change, as old Prometheus servers and other consumers of
the format cannot deal with this new bespoke format. I'm also
strongly against there being two ways to represent a histogram, that's
going to cause needless confusion.

Getting into format changes is a distraction from the hard TSDB-related
problems of histograms. The problem should be addressed in the TSDB first,
rather than by making disruptive changes that don't really help with the
problem at hand.

Brian


>
> I would also love to learn more about any ideas that came up to optimize
> this in TSDB.
> On Sunday, June 21, 2020 at 2:52:21 AM UTC-7 [email protected]
> wrote:
>
>> On Sun, 21 Jun 2020 at 01:44, Bashar Al Rawi <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I would like to propose a new format for storing bucket counts that can
>>> provide substantial improvements for sparse/bi-modal metrics along with
>>> changes to histogram_quantile that allow the new format. Both the change
>>> and format aren't complex and can be done with backward compatibility. The
>>> main breaking change would be adding a new reserved label.
>>>
>>> I put together my thoughts in this Google doc
>>> <https://docs.google.com/document/d/1aYWDLs15j-rlH2nLyoNimj7SP2RGGw_fe2ST9OqZeDI/edit?usp=sharing>
>>>  that
>>> is open for comments and a rough implementation in this commit
>>> <https://github.com/basharal/prometheus/commit/fba202e04f32724a2dcd03d78da35a0a6eda81d3>
>>> .
>>>
>>
>> Thanks for your doc. The first challenge here is that this involves a
>> format change. The Prometheus text format is basically fixed at this
>> point. OpenMetrics will be replacing it, but it sticks with the same
>> format for histograms, as OpenMetrics aims to leverage the existing
>> installed base. The duplication in the current format isn't really a
>> problem in practice, as compression handles it. This proposal also loses
>> the benefits of the current format, namely that you can no longer drop
>> overly-granular buckets, nor can you calculate things like Apdex, both of
>> which are important properties. So this would break important use cases
>> and cause busy work across the ecosystem.
>>
>> The really big issue here is that, as I understand it, even with
>> something like this you still need ~300 buckets to not have to worry
>> about bucket choices, given the typical range of event sizes. This is
>> well above the ~10 recommended currently when considering other labels,
>> so something would need to be done to make the TSDB handling of
>> histograms ~30x more efficient before this would make sense.
>>
>>
>> The way to approach this instead is to leave the format as-is and look
>> at how to optimise it inside the TSDB (and later PromQL), so that more
>> buckets are handled transparently and efficiently for histogram_quantile
>> but otherwise look the same to PromQL as today.
>>
>> In short, changing the text format - even though it would be a major
>> breaking change that breaks key use cases - would be the easy bit, and it
>> doesn't actually help. The hard bits are all in the TSDB, and you should
>> try to solve those first, as without TSDB improvements there's no point.
>>
>> Brian
>>
>>
>>>
>>> Please, take a look and let me know what your thoughts are.
>>>
>>> Bashar
>>>
>>> --
>>>
>>> You received this message because you are subscribed to the Google Groups
>>> "Prometheus Developers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-developers/b3cd52af-3e60-4385-8a0d-b914559ab0b0n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/prometheus-developers/b3cd52af-3e60-4385-8a0d-b914559ab0b0n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
>>
>> --
>> Brian Brazil
>> www.robustperception.io
>>
>


-- 
Brian Brazil
www.robustperception.io
