Thanks, Brian Candler, for the valuable suggestion. We will try this as well.
On Friday, April 24, 2020 at 2:14:24 PM UTC+5:30, Brian Candler wrote:
>
> On Friday, 24 April 2020 06:57:08 UTC+1, Srinivasa praveen wrote:
>>
>> Thanks for the response, Stuart. The reason for keeping the scrape
>> interval so long is that, on receiving a scrape request from Prometheus,
>> my exporter runs around 10 queries against the database and exposes the
>> results as 10 metrics. Completing all the queries takes around 15
>> minutes, and the Prometheus scrape was timing out, so to increase the
>> scrape_timeout I had to increase the scrape_interval as well.
>
> I think a better option is: run your slow queries from cron every 30
> minutes, and write the results into a metrics file which is picked up by
> the node_exporter textfile collector.
>
> This means you can scrape it as often as you like, including from
> multiple Prometheus servers for HA.
>
> Also, the textfile collector exposes a metric with the timestamp of the
> file, so you can alert if the file isn't being updated for any reason:
> useful for spotting cronjobs that are persistently failing.
>
>   - name: Hourly
>     interval: 1h
>     rules:
>       - alert: StaleTextFile
>         expr: time() - node_textfile_mtime_seconds > 7200
>         for: 2h
>         labels:
>           severity: warning
>         annotations:
>           summary: "textfile-collector file has not been updated for more than 4 hours"
>
> I also suggest: move the metrics file into place only when your slow
> queries have completed successfully.
>
>   (
>   ...
>   ) >/var/lib/node_exporter/sqlmetrics.prom.new && \
>     mv /var/lib/node_exporter/sqlmetrics.prom.new /var/lib/node_exporter/sqlmetrics.prom
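
For the archive, here is roughly how we plan to wire this up, following
Brian's suggestion. This is only a sketch: the script name, the query, the
metric name, and the use of psql are placeholders and assumptions, not from
the thread; substitute your own queries and database client.

    #!/bin/sh
    # sql_metrics.sh -- run the slow SQL queries from cron and publish the
    # results for the node_exporter textfile collector.
    set -eu

    OUT=/var/lib/node_exporter/sqlmetrics.prom
    TMP=$OUT.new

    # Write everything to a temporary file first; it is only moved into
    # place (an atomic rename on the same filesystem) after every query
    # has succeeded, so a failed run leaves the previous good file and
    # its mtime untouched, and the StaleTextFile alert above will fire.
    (
      # Repeat a block like this for each of the ~10 queries
      # (placeholder query and metric name).
      orders=$(psql -At -c 'SELECT count(*) FROM orders')
      echo '# HELP myapp_orders_total Rows in the orders table.'
      echo '# TYPE myapp_orders_total gauge'
      echo "myapp_orders_total $orders"
    ) >"$TMP"
    mv "$TMP" "$OUT"

Crontab entry, every 30 minutes as suggested:

    */30 * * * * /usr/local/bin/sql_metrics.sh

and node_exporter has to be started with the textfile collector pointed at
the same directory:

    node_exporter --collector.textfile.directory=/var/lib/node_exporter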

