Hello, I have a Spring Batch (Java) application with about 70 different batch jobs; job duration varies from 10 seconds to 5 minutes. Each batch job runs in a separate JVM. Spring Batch provides metrics via Micrometer, and I have plugged in Prometheus as my metrics vendor. My application pushes its metrics to the Pushgateway.
Question 1 - What should my deletion policy be for removing stale metrics from the Pushgateway? Should I delete them after my batches are complete for the day? The batches run for 2-3 hours.

Question 2 - I want to define an email alert that fires if any batch job fails. Spring Batch provides a metric for this, but it is not working for me, so I defined a Counter, which shows up in Prometheus like this: app_job_status_total{status="FAILED"} 1. The problem is that it always reports the same value. Using increase() or rate() does not help either, because once the metric is set, its value does not change over the evaluated interval.

Please advise.
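To illustrate what I am seeing in Question 2, here is a minimal sketch in plain Java. It is a simplified model of how increase() treats a series, not Prometheus's actual implementation: it ignores extrapolation and timestamps, and the class and method names are made up for this example. Since each job runs in its own JVM, the counter is incremented once and pushed, and the Pushgateway then exposes that same value on every scrape, so the samples in the window are all identical.

```java
public class IncreaseSketch {

    // Simplified model of increase(): sum the deltas between consecutive
    // samples. A drop in value is treated as a counter reset, in which case
    // the new value itself is counted as the increase since the reset.
    // Assumption: no extrapolation to the window boundaries.
    static double increase(double[] samples) {
        double total = 0.0;
        for (int i = 1; i < samples.length; i++) {
            double delta = samples[i] - samples[i - 1];
            total += (delta >= 0) ? delta : samples[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // What Prometheus scrapes from the Pushgateway after one failed run:
        // the same pushed value, repeated on every scrape.
        double[] scraped = {1, 1, 1, 1};
        System.out.println(increase(scraped)); // prints 0.0
    }
}
```

With a constant series the result is 0, which matches the behaviour I described: an alert based on increase() or rate() of this counter never fires.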