Re: [prometheus-users] Re: Single Prometheus for Large Cluster

Ben Kochie Tue, 07 Sep 2021 07:39:46 -0700

I don't know if this is still the case, but there are some label
configurations in the helm chart that lead to excessive labels on
Kubernetes. This can lead to index/memory bloat.
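
As a rough illustration of why extra labels matter, here is a minimal
Python sketch that counts the distinct label name/value pairs the index
has to track. It is not taken from the helm chart: the label names
(extra_label_N), the job name and the pod counts are all made up for the
example.

```python
from typing import Dict, List, Set, Tuple

LabelSet = Dict[str, str]


def unique_label_pairs(series: List[LabelSet]) -> Set[Tuple[str, str]]:
    """Return the distinct (label name, label value) pairs across all series."""
    pairs: Set[Tuple[str, str]] = set()
    for labels in series:
        pairs.update(labels.items())
    return pairs


def make_series(num_pods: int, extra_per_pod_labels: int = 0) -> List[LabelSet]:
    """Build one hypothetical 'up' series per pod.

    extra_per_pod_labels adds that many invented labels whose values are
    unique per pod, standing in for labels copied onto every metric.
    """
    series = []
    for i in range(num_pods):
        labels = {"__name__": "up", "job": "demo", "pod": f"demo-{i}"}
        for j in range(extra_per_pod_labels):
            labels[f"extra_label_{j}"] = f"value-{i}-{j}"
        series.append(labels)
    return series


lean = unique_label_pairs(make_series(100))
bloated = unique_label_pairs(make_series(100, extra_per_pod_labels=5))
print(len(lean), len(bloated))  # 102 602
```

The number of series is the same in both cases (100); only the number of
label pairs changes, which is the kind of growth the TSDB status page
shows under its label statistics.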


Most of the memory bloat I've seen in our production clusters lately has
more to do with auto-scaling pod churn. If you're using heavy
auto-scaling and lots of single-core pods, you'll end up bloating the
metrics a lot.
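
To put a number on the churn effect, here is a back-of-the-envelope
Python sketch. It assumes that every replacement pod gets a new pod name
and therefore a new set of series, and that series stay in memory for a
window of roughly two hours (the figure mentioned elsewhere in this
thread). The function name and the pod and series counts are invented
for the example.

```python
def series_in_window(
    steady_pods: int,
    pods_replaced_per_hour: int,
    series_per_pod: int,
    window_hours: float = 2.0,  # assumed in-memory window, not a measured value
) -> int:
    """Estimate how many distinct series exist within the window.

    Each replaced pod contributes a fresh set of series, because its
    'pod' label value has never been seen before.
    """
    distinct_pods = steady_pods + pods_replaced_per_hour * window_hours
    return int(distinct_pods * series_per_pod)


# Same steady-state size, with and without churn (hypothetical numbers).
no_churn = series_in_window(1000, 0, 500)
heavy_churn = series_in_window(1000, 2000, 500)
print(no_churn, heavy_churn)  # 500000 2500000
```

In this made-up case the cluster never runs more than 1000 pods at once,
yet the server has to hold five times as many series as the steady-state
count suggests.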

On Tue, Sep 7, 2021 at 3:51 PM Brian Candler <[email protected]> wrote:

> Such a short retention is unlikely to help at all; the in-memory head
> block covers about 2 hours, I think.
>
> Across some systems I have here, the average number of metrics per node is
> 2366: this is the (expensive) query which gives it:
> avg(count by (instance) ({job="node"}))
>
> So with 1300 nodes that would be about 3 million metrics.  Quite a lot,
> but not extraordinarily so.  I've seen recommendations to start splitting
> Prometheus servers when you reach 2m.  There is a RAM calculation tool here:
>
> https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion
> With 3m series and 1m unique label pairs, it still only comes out to 8GB.
> If you're needing much more than that, then you need to read and understand
> the stats from the TSDB status page.  You can post them here if you want
> help interpreting them.  And you need to understand what queries (if any)
> are taking place against your database, since those use RAM too.
>
> Looking at "Top 10 series count by metric names" in the Prometheus Status
> page, in my case it's node_cpu_seconds_total{}.  If you don't require the
> usage of each core individually, then you might be inclined to drop it.
>
> You could also see if victoriametrics + vmagent works better for your use
> case.
>
> On Tuesday, 7 September 2021 at 13:57:48 UTC+1 [email protected] wrote:
>
>> Thank you Brian for the reply. Yes I mean host (nodes).
>> What we have done for the meantime is set the retentionTime of
>> Prometheus to 5 minutes (which I am not comfortable with), as advised by
>> seniors, just so we can continue.
>> Thanks for the information above, I'll check it out and try it on our
>> cluster environment.
>>
>>
>> On Tue, Sep 7, 2021 at 4:50 PM Brian Candler <[email protected]> wrote:
>>
>>> It's not clear what you mean by "No. of Nodes" - whether you mean hosts
>>> (e.g. which you're scraping using node_exporter), or pods, or something
>>> else.  But what matters is the total number of metrics, and the amount of
>>> metric churn,  i.e. the rate at which new timeseries are being created
>>> dynamically; and also how much querying is going on.
>>>
>>> If you go to the Prometheus web interface, Status > TSDB Status, you'll
>>> get some statistics which may help you.  Consider:
>>>
>>> - collecting fewer metrics (by changing what you scrape, and/or using
>>> metric_relabel_configs to drop some timeseries which are not of interest)
>>>
>>> - see if it's possible to reduce timeseries churn.  For example, if you
>>> have one application which is generating large numbers of short-lived pods
>>> then you may wish to reduce or suppress the metrics collected for those
>>> pods.
>>>
>>> - have a look at the PromQL queries being executed, and whether any of
>>> these are using excessive amounts of RAM.  The query log
>>> <https://prometheus.io/docs/guides/query-log/> may help.  You can also
>>> apply limits to how much memory is used by individual queries using
>>>       --query.max-concurrency=20  # default
>>>       --query.max-samples=50000000  # default
>>> (although that may cause the offending queries to fail)
>>>
>>> There are also blog posts out there which you can turn up with a search,
>>> e.g.
>>> https://source.coveo.com/2021/03/03/prometheus-memory/
>>>
>>> On Tuesday, 7 September 2021 at 07:34:51 UTC+1 [email protected] wrote:
>>>
>>>> Hi everyone, I am new here.
>>>>
>>>> I would like to seek some advice on the design approach we should take.
>>>> Given the problem below, and in terms of cost, how can we set up
>>>> Prometheus for a large cluster?
>>>>
>>>> *Variables:*
>>>> *Installation: *kube-prometheus-stack helm chart.
>>>> *Autoscale*: yes
>>>> *No. of Nodes*: 1000 up to 1300
>>>> *Mesh*: Istio
>>>> *Memory Usage:* 50GB (Still gets OOM)
>>>> *Installed: *1 Prometheus, 1 Kiali, 1 Grafana and 1 Jaeger
>>>>
>>>> *Issue:*
>>>> 1. We cannot move Prometheus to a larger node, as 60GB of memory is
>>>> already expensive (cost not approved by management).
>>>> 2. Removing unnecessary metrics is not yet advised because we do not
>>>> know which metrics of istio, jaeger and kiali are needed.
>>>>
>>>> *Tried solution:*
>>>> We have federated the single instance of Prometheus with Thanos
>>>> Receivers; however, the issue is still there because Kiali queries its
>>>> data directly from Prometheus, which eventually gets OOM-killed.
>>>>
>>>> *Question:*
>>>> We are thinking of firing up multiple prometheus for each namespace and
>>>> adding thanos-sidecar with the same scrape config since thanos will
>>>> deduplicate all duplicated metrics. This approach would solve the issue in
>>>> Grafana queries but not in Kiali.
>>>>
>>>> How can we set up multiple Prometheus instances (low cost) but a
>>>> single-instance Prometheus for Kiali (whole cluster)?
>>>>
>>>> Appreciate any help. Thank you.
>>>>
>>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/24a15533-094e-4a4c-9644-5d4375b6aaa2n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/prometheus-users/24a15533-094e-4a4c-9644-5d4375b6aaa2n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>

