Hello everyone, I'm relatively new to Prometheus, so your patience is much appreciated.
I'm facing an issue and could use some guidance. I'm working with a metric like CPU usage, where the instance identifier is attached as a label. To verify that instances are running as expected, I've defined an alert on this metric: it fires when the aggregated value (in my case, the increase) over a time window falls below an expected threshold. Because the instance identifier is a label, a single alert rule covers all instances (roughly the rule sketched at the end of this message). So far this works well.

However, I'm struggling with instances that have been intentionally shut down. Their metric value stops changing, so the increase over the window stays below the threshold and the alert fires permanently for them. How can I handle this? Or did I make a fundamentally flawed modeling decision here?

Any insights would be greatly appreciated.
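For concreteness, here is a minimal sketch of the kind of rule I mean; the metric name, window, and threshold below are placeholders rather than my real values:

groups:
  - name: instance-activity
    rules:
      - alert: InstanceActivityTooLow
        # increase() over a series that has stopped changing is ~0, which is
        # why shut-down instances keep matching this expression and the alert
        # keeps firing for them.
        expr: increase(app_cpu_usage_seconds_total[30m]) < 100
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.instance }} shows less CPU activity than expected"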