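The fundamental problem is how Prometheus can know which containers should be there. With your regex, there are infinitely many containers that are "absent": 0dev-4, 1dev-4, … 9999dev-4, … fjdhrhfksnhdev-4, etc. This is also why the name is missing from your alert: absent() can only copy labels from equality matchers into its result, so a regex matcher on name leaves it without a name label, and it returns nothing at all as long as at least one matching series exists.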
To solve this, you need a list of concretely expected containers somewhere. That could be separate alerts if the number is small, or some metric that is there even when the container is stopped. In that case you can use the unless operator:

    all_expected_containers unless on(name) container_start_time_seconds

If there is not already such a metric, you could:

- generate it using recording rules (this again requires listing the containers, but is less verbose),
- write a small exporter that gets the data from your source of truth, or
- use container_start_time_seconds offset 15m to look for containers that were running before and now are not.

The downside of the last option is that it is noisy when a container is expected to go away, and these alerts "resolve" after 15m whether the container is back up or not.

/MR

On Mon, Mar 8, 2021, 15:39 Tamar <[email protected]> wrote:

> Hi,
>
> I am trying to create an alert for stopped containers.
>
> If I am using the exact container name I have no problem:
>
> - alert: ContainerKilled
>   expr: absent(container_start_time_seconds{name="be-dev-4"})
>   for: 15m
>   labels:
>     severity: 'warning'
>   annotations:
>     summary: 'Container killed'
>     description: 'A container{{ $labels.name }} has disappeared'
>
> However, if I am trying to use regexp for the container name (as I have a
> few containers with this suffix), then it fails whatever I try.
>
> If I use this, then no alert is sent:
>
> - alert: ContainerKilled
>   expr: absent(container_start_time_seconds{name=~".*dev-4"})
>   for: 15m
>   labels:
>     severity: 'warning'
>   annotations:
>     summary: 'Container killed'
>     description: 'A container{{ $labels.name }} has disappeared'
>
> If I use this, then alert is sent, but without the stopped container name:
>
> - alert: ContainerKilled2
>   expr: absent(container_start_time_seconds{name=~".*dev-4"})
>   for: 15m
>   labels:
>     severity: 'warning'
>   annotations:
>     summary: 'Container killed'
>     description: 'A container has disappeared {{ $labels.instance }} of job {{ $labels.job }}'
>
> Any idea how to alert then with a regexp, and the container name?
>
> Thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/24fe1ee7-3275-4747-93b9-9f0f51821533n%40googlegroups.com

--
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/CAMV%3D_gaYzw8Rx2St0mW3qsEP7He%3DmZK%3DhE9H1iTraZyB3Kcj-w%40mail.gmail.com.
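For illustration, the recording-rule and unless approach from the reply could look roughly like the rules file below. This is only a sketch under assumptions: the group name, the second container name (fe-dev-4) and the alert name ContainerGoneRecently are made up for the example, and the list of recording rules has to be adapted to the containers you actually expect. The last rule sketches the offset variant, with the caveats mentioned in the reply.

```yaml
groups:
  - name: expected-containers        # hypothetical group name
    rules:
      # One recording rule per expected container. vector(1) carries no
      # labels, so the `labels:` block supplies the container name.
      - record: all_expected_containers
        expr: vector(1)
        labels:
          name: be-dev-4
      - record: all_expected_containers
        expr: vector(1)
        labels:
          name: fe-dev-4             # hypothetical second container

      # Fires once per expected container that has no matching
      # container_start_time_seconds series; the name label comes from
      # the left-hand side, so it is available in the annotations.
      - alert: ContainerKilled
        expr: all_expected_containers unless on(name) container_start_time_seconds
        for: 15m
        labels:
          severity: 'warning'
        annotations:
          summary: 'Container killed'
          description: 'Container {{ $labels.name }} has disappeared'

      # Alternative without an explicit list: containers that existed
      # 15 minutes ago but have no series now. No `for:` here (an
      # assumption of this sketch), because the expression itself stops
      # matching about 15 minutes after the container went away.
      - alert: ContainerGoneRecently   # hypothetical alert name
        expr: >-
          container_start_time_seconds{name=~".*dev-4"} offset 15m
          unless on(name) container_start_time_seconds
        labels:
          severity: 'warning'
        annotations:
          summary: 'Container killed'
          description: 'Container {{ $labels.name }} has disappeared'
```

The first alert needs the explicit list but keeps firing until the container is back; the second needs no list but is noisy for containers that are removed on purpose.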

