Re: [prometheus-users] Prometheus getting slow on about 400 node_exporter instances

Nur Kholis Majid Sat, 29 Feb 2020 16:40:45 -0800

Hi Julien,

On Sunday, March 1, 2020 at 6:44:34 AM UTC+7, Julien Pivotto wrote:
>
> On 29 Feb 15:38, Nur Kholis Majid wrote: 
> > Hi, 
> > 
> > I've tested Prometheus monitoring node_exporter on 400 instances. With 
> > the default configuration, in just two months the TSDB size reached about 
> > 450GB and memory usage about 135GB. Queries have become slow and unusable. 
> > 
> > [image: photo_2020-03-01_06-33-51.jpg] 
> > 
> > [image: photo_2020-03-01_06-34-00.jpg] 
>
>
> Can we know what you mean by default configuration? Is it default or 
> documented one? What are your startup parameters? 
>
I mean I just added a minimal configuration in prometheus.yml:
$ cat prometheus.yml
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).


# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
    - targets: ['10.10.10.1:9100', '10.10.10.2:9100']  # etc. until 400 nodes

On the node_exporter side, no additional configuration was made.
 

> How many series do you have? 
> max_over_time(prometheus_tsdb_head_series[1d]) 
>
771651
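
As a rough sanity check on these numbers, here is a small Python sketch. It is back-of-the-envelope arithmetic only, not a measurement. It assumes exactly 400 targets, that every head series receives a sample every 15 seconds, that "two months" means 60 days, and that 1 GB is 10**9 bytes; the function names are made up for this illustration. Under those assumptions it works out to roughly 1,900 series per target, about 51,000 samples per second, and around 1.7 bytes per sample on disk.

```python
# Back-of-the-envelope figures from the numbers reported in this thread.
# Every constant below is either quoted from the thread or an assumption
# labelled as such; none of this is measured from a running server.

HEAD_SERIES = 771_651       # max_over_time(prometheus_tsdb_head_series[1d])
INSTANCES = 400             # node_exporter targets ("about 400")
SCRAPE_INTERVAL_S = 15      # global scrape_interval in prometheus.yml
TSDB_BYTES = 450 * 10**9    # "+- 450GB", assuming 1 GB = 10**9 bytes
RETENTION_DAYS = 60         # "two months", assumed to be 60 days


def series_per_instance(head_series: int, instances: int) -> float:
    """Average number of active series contributed by one target."""
    return head_series / instances


def samples_per_second(head_series: int, interval_s: int) -> float:
    """Ingestion rate if every series gets one sample per scrape interval."""
    return head_series / interval_s


def bytes_per_sample(tsdb_bytes: int, head_series: int,
                     interval_s: int, days: int) -> float:
    """Implied on-disk cost of one sample over the whole period."""
    total_samples = samples_per_second(head_series, interval_s) * days * 86_400
    return tsdb_bytes / total_samples


if __name__ == "__main__":
    print(f"series per instance: {series_per_instance(HEAD_SERIES, INSTANCES):.0f}")
    print(f"samples per second:  {samples_per_second(HEAD_SERIES, SCRAPE_INTERVAL_S):.0f}")
    print("bytes per sample:    "
          f"{bytes_per_sample(TSDB_BYTES, HEAD_SERIES, SCRAPE_INTERVAL_S, RETENTION_DAYS):.2f}")
```

The series count also includes the small 'prometheus' self-scrape job, so the per-target figure is a slight overestimate.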
 

> Do you have lots of different disks/devices per machines ? lots of 
> network interfaces? 
>
Yes. Each node has 2 NICs in bonding mode and 12 disks.
 

>
> I recommend you read 
>
> https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion
>  
> to better understand this. 
>
> > 
> > 
> > Questions: 
> > 1. What is the maximum number of node_exporter instances Prometheus can 
> > handle with acceptable query duration? 
> > 2. Is there any special Prometheus configuration for a large number of 
> > instances? 
> > 
> > Thank you 
> > 
>
> -- 
>  (o-    Julien Pivotto 
>  //\    Open-Source Consultant 
>  V_/_   Inuits - https://www.inuits.eu 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/986e63a7-798d-4945-adf6-580f9e48ad4b%40googlegroups.com.
