On Wed, 24 Nov 2021 at 14:52, Darshan Chaudhary <[email protected]> wrote:
> During query evaluation, Prometheus tracks the current samples held in
> memory at evaluator.currentSamples
> <https://github.com/prometheus/prometheus/blob/f0003bc0ba77fca5ed4c1fe30337beea85dd95d1/promql/engine.go#L871>.
> This might be a good proxy for the "work" that Prometheus had to do to get
> the query result?

That's memory usage, not work done. There was
https://github.com/prometheus/prometheus/pull/6890 to track samples
touched, which should be a good proxy (I use 10M/s as my rule of thumb);
it's waiting on confirmation that the performance hit is negligible.

Brian

> On Wednesday, 24 November 2021 at 18:25:46 UTC+5:30 [email protected] wrote:
>
>> Hello all,
>>
>> *TL;DR:* measuring `http_request_duration_seconds` on the query path is
>> a bad proxy for query latency, as it does not account for data
>> distribution or the number of samples/series touched by a query (both of
>> which have significant implications for the performance of a query).
>>
>> ---
>>
>> I'm exploring more granular performance metrics for prom queries
>> <https://github.com/thanos-io/thanos/issues/4895> downstream in Thanos
>> (inspired by this discussion from Ian Billet
>> <https://github.com/thanos-io/thanos/discussions/4674>) and wanted to
>> reach out to the Prometheus developer community for ideas on how people
>> are measuring and tracking query performance systematically.
>>
>> The aim is to create a new metric that captures these additional
>> dimensions, to better understand/quantify query-performance SLIs with
>> respect to the number of samples/series touched *before* a query is
>> executed.
>>
>> The current solution I have arrived at is a crude n-dimensional
>> histogram, where query duration is observed/bucketed with labels
>> representing some scale (simplified to t-shirt sizes) of samples touched
>> and series queried. This would allow me to query for query-duration
>> quantiles over given ranges of sample/series sizes (e.g. 90% of queries
>> touching up to 1,000,000 samples and up to 10 series complete in less
>> than 2s).
>>
>> I would love to hear about other approaches members of the community
>> have taken for capturing this level of performance granularity in a
>> metric (as well as to stir the pot wrt the Thanos proposal).
>>
>> Thanks,
>>
>> Moad.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-developers/a392a3c9-21b0-4174-9219-53cda79de0f1n%40googlegroups.com

--
Brian Brazil
www.robustperception.io
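[Editor's note: for illustration only, not code from the thread. A minimal Go sketch of the "crude n-dimensional histogram" idea Moad describes: query durations bucketed under coarse t-shirt-size labels for samples touched and series queried. The function names and scale thresholds are assumptions; a real implementation would likely use a client_golang HistogramVec with these labels rather than the plain map used here.]

```go
package main

import "fmt"

// scaleLabel maps a raw count (samples touched or series queried) to a
// coarse t-shirt-size label. The thresholds are illustrative assumptions,
// not values from the thread.
func scaleLabel(n int64) string {
	switch {
	case n <= 10_000:
		return "S"
	case n <= 1_000_000:
		return "M"
	case n <= 100_000_000:
		return "L"
	default:
		return "XL"
	}
}

// queryDurations stands in for an n-dimensional histogram: it records query
// durations keyed by the (samples-scale, series-scale) label pair, so
// duration quantiles can later be computed per bucket.
type queryDurations map[[2]string][]float64

func (q queryDurations) observe(samples, series int64, seconds float64) {
	key := [2]string{scaleLabel(samples), scaleLabel(series)}
	q[key] = append(q[key], seconds)
}

func main() {
	q := queryDurations{}
	// Two queries that each touched ~500k samples across 8 series land in
	// the same ("M", "S") bucket.
	q.observe(500_000, 8, 1.2)
	q.observe(500_000, 8, 0.4)
	fmt.Println(len(q[[2]string{"M", "S"}])) // prints 2
}
```

With a label set this small (four sizes per dimension), the cardinality added to the metric stays bounded at 16 combinations, which is what makes the t-shirt-size simplification attractive compared to labelling with raw counts.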

