On Friday, 24 April 2020 06:57:08 UTC+1, Srinivasa praveen wrote:
>
> Thanks for the response, Stuart. The reason for keeping the scrape 
> interval so long is that, on receiving a scrape request from Prometheus, 
> my exporter runs around 10 queries against the database and exposes the 
> results as 10 metrics, which takes around 15 minutes to complete, and the 
> Prometheus scrape was timing out. So, to increase the scrape_timeout, I 
> had to increase the scrape_interval as well. 
>

I think a better option is: run your slow queries from cron every 30 
minutes, and write the results into a metrics file which is picked up by 
the node_exporter textfile collector.

This means you can scrape it as often as you like, including from multiple 
Prometheus servers for HA.
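As a minimal sketch of the cron side (the script path, cron.d filename and 
directory are examples, not something from this thread):

```
# /etc/cron.d/sql-metrics -- run the slow queries every 30 minutes
*/30 * * * * prometheus /usr/local/bin/sql_metrics.sh
```

node_exporter then needs to be started with 
--collector.textfile.directory=/var/lib/node_exporter (or wherever the 
script writes its .prom file).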

Also, the textfile collector exposes a metric (node_textfile_mtime_seconds) 
with the modification timestamp of the file, so you can alert if the file 
stops being updated for any reason: useful for spotting cronjobs that are 
persistently failing.

- name: Hourly
  interval: 1h
  rules:
  - alert: StaleTextFile
    expr: time() - node_textfile_mtime_seconds > 7200
    for: 2h
    labels:
      severity: warning
    annotations:
      summary: "textfile-collector file has not been updated for more than 4 hours"

I also suggest: move the metrics file into place only after your slow 
queries have completed successfully, so the collector never reads a 
partially-written file.

(
...
) >/var/lib/node_exporter/sqlmetrics.prom.new && \
  mv /var/lib/node_exporter/sqlmetrics.prom.new /var/lib/node_exporter/sqlmetrics.prom
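Fleshed out, the whole wrapper might look something like the sketch below. 
The script name (sql_metrics.sh), the metric name and value, and the 
TEXTFILE_DIR override are all placeholders for illustration, not something 
from this thread:

```shell
#!/bin/sh
# Hypothetical sql_metrics.sh: write metrics to a temp file, then rename.
set -e
DIR=${TEXTFILE_DIR:-.}   # in production: /var/lib/node_exporter
TMP="$DIR/sqlmetrics.prom.new"

{
  # Output must follow the Prometheus text exposition format.
  echo '# HELP sql_orders_total Example gauge filled by a slow query.'
  echo '# TYPE sql_orders_total gauge'
  echo 'sql_orders_total 42'   # in reality: the result of the DB query
} > "$TMP"

# mv within one filesystem is atomic, so node_exporter never sees a
# half-written file; if any query above fails, set -e skips the rename
# and the old file stays in place.
mv "$TMP" "$DIR/sqlmetrics.prom"
```

If the script dies halfway through, the stale .prom file remains, and the 
node_textfile_mtime_seconds alert above catches it.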

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/9086dd88-d95b-45a9-87fb-1a0b8daa8358%40googlegroups.com.