[prometheus-users] prometheus-cloudwatch-exporter stops working while scraping larger number of metrics

mordowiciel Sun, 10 May 2020 11:02:17 -0700

Hi everyone!

I'm having a problem setting up prometheus-cloudwatch-exporter on a 
Kubernetes cluster. I'm installing it from the following Helm chart:
<https://github.com/helm/charts/tree/master/stable/prometheus-cloudwatch-exporter>


The config for the exporter looks as follows:
region: eu-west-1
metrics:
  - aws_namespace: AWS/RDS
    aws_metric_name: CPUUtilization
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Average]
  - aws_namespace: AWS/RDS
    aws_metric_name: DatabaseConnections
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Average]
  - aws_namespace: AWS/RDS
    aws_metric_name: FreeableMemory
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Average]
  - aws_namespace: AWS/RDS
    aws_metric_name: ReadIOPS
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Average]
  - aws_namespace: AWS/RDS
    aws_metric_name: WriteIOPS
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Average]
 

After the deployment, when I try to access the */metrics* endpoint of the 
exporter, the request takes a very long time: sometimes I get a timeout, and 
sometimes I get the response only after 30-40s. I'm also unable to query the 
metrics from the Prometheus console (the query returns a *no data* 
response).
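To put numbers on the slow responses, a small timing helper like the one below could be used. This is only a sketch: the helper names are made up for illustration, the URL in the usage comment is a hypothetical port-forwarded address, and the 10s threshold assumes Prometheus' default scrape_timeout has not been overridden in my setup.

```python
import time
import urllib.request

# Assumption: Prometheus' default scrape_timeout (10s) applies to this job.
ASSUMED_SCRAPE_TIMEOUT_SECONDS = 10.0


def timed(fetch):
    """Call fetch() and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = fetch()
    return result, time.monotonic() - start


def would_time_out(elapsed_seconds,
                   scrape_timeout_seconds=ASSUMED_SCRAPE_TIMEOUT_SECONDS):
    """True if a response this slow would exceed the assumed scrape timeout."""
    return elapsed_seconds > scrape_timeout_seconds


def fetch_metrics(url, timeout_seconds=120.0):
    """Fetch the exporter's /metrics page with a generous client-side timeout."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read()


# Usage (hypothetical address, e.g. after a kubectl port-forward):
#   body, elapsed = timed(lambda: fetch_metrics("http://localhost:9106/metrics"))
#   print(f"{elapsed:.1f}s, would time out: {would_time_out(elapsed)}")
```

If the assumed 10s timeout does apply, a response that takes 30-40s would be dropped by Prometheus, which would match the *no data* result.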

However, when I reduce the number of gathered metrics, for example to the 
following form:
region: eu-west-1
metrics:
  - aws_namespace: AWS/RDS
    aws_metric_name: CPUUtilization
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Average]


With this config, the /metrics endpoint always responds in ~5s and I can see 
the scraped cpuutilization metric in the Prometheus console.

I've looked at the exporter and Prometheus logs and didn't find anything 
interesting there: no stack traces, errors, etc.

For every metric I've listed above, the CloudWatch API returns ~450 
DBInstanceIdentifiers. It looks like the exporter becomes overloaded when I 
query the full set of RDS metrics. Has anyone encountered a similar problem? 
Is it somehow possible to "scale" the exporter so that it can handle 
scraping larger amounts of CloudWatch data?
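As a rough back-of-envelope check of the numbers above, the sketch below scales the ~5s single-metric response time by the number of configured metrics. It assumes the exporter's work grows linearly with the number of metric/dimension combinations it has to fetch, which I have not verified; the function names are made up for illustration.

```python
# Back-of-envelope estimate. Assumes scrape time grows linearly with the
# number of metric/dimension combinations (an unverified assumption).

DB_INSTANCES = 450                  # ~450 DBInstanceIdentifiers per metric
OBSERVED_SECONDS_ONE_METRIC = 5.0   # ~5s response with a single metric


def combinations_per_scrape(metric_count, dimension_values,
                            statistics_per_metric=1):
    """Metric/dimension/statistic combinations fetched in one scrape."""
    return metric_count * dimension_values * statistics_per_metric


def estimated_scrape_seconds(metric_count, baseline_metric_count=1,
                             baseline_seconds=OBSERVED_SECONDS_ONE_METRIC):
    """Scale the observed baseline linearly by the number of metrics."""
    return baseline_seconds * (metric_count / baseline_metric_count)


print(combinations_per_scrape(5, DB_INSTANCES))  # 2250
print(estimated_scrape_seconds(5))               # 25.0
```

Under that assumption the five-metric config would need roughly 25s per scrape, which is in the same range as the 30-40s I observe.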
