Hi Bashar,

Thanks for your thoughts and ideas.

I more or less agree with all of them, namely:

- We need sparse histograms.

- Cumulative buckets, despite some tactical advantages, are
  problematic for sparse histograms; on the other hand, a bit of math
  can always emulate dropping buckets or enable Apdex calculation from
  non-cumulative buckets.

- We require a new type of histogram in the exposition format that is
  incompatible with the existing format.

- A repetitive representation of buckets in the exposition format is
  problematic, and becomes more problematic with more buckets. And no,
  compression won't magically solve that.

I really want histograms to be cheap enough so that they can be
partitioned at will (by status code, path, ...) while still
maintaining a high resolution.

Your approach goes several steps towards this goal.

BUT (and here comes the big "but") it will not go far enough. What we
need, even with sparse histograms, is a histogram implementation that
is efficient enough to support hundreds of buckets in a single
histogram at a cost comparable to, or even lower than, what we pay now
for our existing ~10-bucket histograms. I expect that to require quite
invasive changes not only to the exposition format but also to the way
we store histograms in the TSDB and ultimately to how we represent and
process them in PromQL.

Now you could ask: why not iterate and slowly approach the goal? That
would be totally fine for experimental software, and I can only
encourage you to play with your approach in an experimental fork. But
we cannot really have those incremental changes in the mainline
Prometheus releases, as people will use them in production and then
require backwards-compatible support. We cannot really have dozens of
mutually incompatible ways of dealing with histograms in the released
Prometheus components.

That's why I've been experimenting for a while. I'm currently writing
up a design doc suggesting a plan for the changes we need throughout
the stack. It will not be a precise and perfect solution, but it will
sketch out the direction along which we can then work together towards
a solution. It will take a while before things have stabilized enough
to have them in the regular Prometheus releases. And that's a shame
because in the meantime, people are left with the existing solution
for their production uses – or they can go down the path of adopting
one of the experimental, half-baked solutions (of which there are more
than just yours) to solve their most pressing problems, at the price
of incompatibility with the future "proper" solution.

I'm currently very focused on getting that design doc done because it
will set the stage for further discussions and provide the foundation
for an informed decision on which way to go.

Stay tuned, I'll publish it here on this list, hopefully very soon.
-- 
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] [email protected]

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/20200622210954.GT3365%40jahnn.
