[prometheus-developers] Prometheus Performance Issue / Performance tuning

Midhun K Wed, 28 Oct 2020 06:06:54 -0700

Hello Guys,

*What is the problem?*
My Grafana dashboards are slow. Prometheus is the data source, and I need
to debug and understand where the bottleneck is.

*What have I tried to improve performance?*
1. Tried Trickster as a caching/accelerator layer between Prometheus and
Grafana.
2. Increased some query parameter limits:

--query.max-concurrency=20
    Maximum number of queries executed concurrently.
--query.max-samples=50000000
    Maximum number of samples a single query can load into memory.

These helped reduce connection timeout issues but did not help with the
slow performance.
3. Checked system resource usage: it is good enough to handle the queries.

*What do I need to know?*
1. I want to understand more about the timing stats below, which can be
fetched from the Prometheus query log (evalTotalTime, execQueueTime,
execTotalTime, innerEvalTime, queryPreparationTime, resultSortTime):

"stats": {
    "timings": {
        "evalTotalTime": 0.000447452,
        "execQueueTime": 7.599e-06,
        "execTotalTime": 0.000461232,
        "innerEvalTime": 0.000427033,
        "queryPreparationTime": 1.4177e-05,
        "resultSortTime": 6.48e-07
    }
}
2. We use Prometheus widely but have not been able to find a useful resource
on performance tuning. Could you please share some tunable options or ideas
for improving Prometheus query performance, and guide me on how to narrow
down the exact area that is contributing to the slowness?
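
To make the timing stats from point 1 easier to act on, here is a minimal
Python sketch that ranks query log entries by execTotalTime and shows how
much of each total was spent waiting in the queue (execQueueTime). It assumes
the query log is a file with one JSON object per line, each carrying
"params.query" and "stats.timings", which is how I understand the file
written via the query_log_file setting to look. The function name
slowest_queries and the sample query "up" are made up for illustration; the
timing values in the sample are the ones from this post.

```python
import json


def slowest_queries(log_lines, top_n=5):
    """Rank query log entries by execTotalTime (seconds), slowest first.

    Assumes each non-empty line is one JSON object containing
    params.query and stats.timings (an assumption about the log layout).
    """
    rows = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        timings = entry.get("stats", {}).get("timings", {})
        total = timings.get("execTotalTime", 0.0)
        queue = timings.get("execQueueTime", 0.0)
        rows.append({
            "query": entry.get("params", {}).get("query", "<unknown>"),
            "execTotalTime": total,
            "execQueueTime": queue,
            # Fraction of the total spent waiting for a concurrency slot.
            "queueShare": queue / total if total > 0 else 0.0,
        })
    rows.sort(key=lambda r: r["execTotalTime"], reverse=True)
    return rows[:top_n]


# Hypothetical single-entry log; the query text is a placeholder.
sample_log = [
    json.dumps({
        "params": {"query": "up"},
        "stats": {"timings": {
            "evalTotalTime": 0.000447452,
            "execQueueTime": 7.599e-06,
            "execTotalTime": 0.000461232,
            "innerEvalTime": 0.000427033,
            "queryPreparationTime": 1.4177e-05,
            "resultSortTime": 6.48e-07,
        }},
    })
]

for row in slowest_queries(sample_log):
    print(row["query"], row["execTotalTime"], round(row["queueShare"], 4))
```

In practice the lines would come from open() on the real log file rather than
sample_log. A high queueShare would point at the concurrency limit, while a
large execTotalTime with a small queue share would point at the evaluation of
the query itself.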

*Stack details*
OS: CentOS 7
Version: Prometheus 2.20
Deployment: Docker compose stack (Prometheus, Grafana, Trickster)

*+ Adding some additional points.*

If prometheus_engine_queries is greater than
prometheus_engine_queries_concurrent_max, it means that some queries are
queued. The queue time is part of the two-minute default timeout.

We analysed the maximum query rate from our dashboards: it is between 30 and
40, while our limit was at the default of 20. Queueing the excess requests
caused some timeouts and slowness, so we have now increased the limit to 60.
(This will cause somewhat higher resource utilisation, but that is acceptable
for our system specification.)
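
The comparison of prometheus_engine_queries against
prometheus_engine_queries_concurrent_max can be scripted. Below is a rough
Python sketch, assuming Prometheus is reachable at http://localhost:9090 (an
assumption, adjust for your setup) and using the standard /api/v1/query
instant-query endpoint. The helper names instant_value, queued_queries,
fetch_metric and check_queueing are invented for this example.

```python
import json
import urllib.parse
import urllib.request


def instant_value(payload):
    """Extract the first sample value from an /api/v1/query JSON response."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %r" % payload.get("error"))
    result = payload["data"]["result"]
    if not result:
        raise ValueError("query returned no series")
    # Instant vectors carry [timestamp, "value-as-string"] pairs.
    return float(result[0]["value"][1])


def queued_queries(active, max_concurrent):
    """Number of queries waiting for a slot (0 when under the limit)."""
    return max(0, int(active - max_concurrent))


def fetch_metric(base_url, metric):
    """Run an instant query for a single metric name and return its value."""
    url = base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode(
        {"query": metric}
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        return instant_value(json.load(resp))


def check_queueing(base_url="http://localhost:9090"):
    """Return how many queries are currently queued on the given server."""
    active = fetch_metric(base_url, "prometheus_engine_queries")
    limit = fetch_metric(base_url, "prometheus_engine_queries_concurrent_max")
    return queued_queries(active, limit)
```

Calling check_queueing() gives only a point-in-time reading, so short spikes
can be missed. Graphing the two metrics over time in Grafana would probably
show the queueing more reliably.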

--
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-developers/9ca46a9d-b788-4e0f-a371-5ab493c76441n%40googlegroups.com.
