In the alerting rules themselves, e.g.

groups:
  - name: UpDown
    rules:
      - alert: UpDown
        expr: up == 0
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: 'Scrape failed: host is down or scrape endpoint down/unreachable'
You can check the currently-firing alerts in the Prometheus web UI (by
default at x.x.x.x:9090). It will show you what labels each alert carries.
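
The same labels are also exposed on Prometheus's built-in ALERTS time series, so another quick way to inspect them is to run a query like this in the expression browser (a minimal sketch; narrow the selector to a specific alertname if you want):

ALERTS{alertstate="firing"}
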
On Wednesday, 26 January 2022 at 21:24:08 UTC Hasene Ceren Yıkılmaz wrote:
> Hi!
>
> I sent the curl request and got:
>
> alertmanager_notifications_failed_total{integration="opsgenie"} 0
> alertmanager_notifications_total{integration="opsgenie"} 0
>
> So in this case the first thing I should check is whether the alerts I
> generate from the alerting rules have the 'severity: critical' or
> 'severity: warning' labels on them, right?
> But how can I check this?
> On Monday, 17 January 2022 at 17:44:11 UTC+3, Brian Candler wrote:
>
>> 1. Are you sure that the alerts you generate from alerting rules have the
>> 'severity: critical' or 'severity: warning' labels on them? If not, they
>> won't match any of the routes, so they'll fall back to the default you set:
>>
>> route:
>>   receiver: slack_general
>>
>> 2. Why do you have these empty routes?
>>
>>   - match:
>>       severity: critical
>>     continue: true
>>   - match:
>>       severity: warning
>>     continue: true
>>
>> They don't do anything - delete them.
>>
>> 3. In order to see if alertmanager is attempting to send to opsgenie (and
>> failing):
>> * Look at the logs of the alertmanager process (e.g. "journalctl -eu
>> alertmanager" if running it under systemd)
>> * Look at the notification metrics which alertmanager itself generates:
>>
>> curl -Ss localhost:9093/metrics | grep 'alertmanager_notifications.*opsgenie'
>>
>> If you see:
>>
>> alertmanager_notifications_failed_total{integration="opsgenie"} 0
>> alertmanager_notifications_total{integration="opsgenie"} 0
>>
>> then no delivery to opsgenie has been attempted. If there are attempts
>> and failures, you'll see these metrics going up.
>>
>> BTW, it's useful to scrape alertmanager from your prometheus, so you can
>> query these metrics and get a history of them (and indeed, alert on them
>> if necessary; a sketch of such a rule follows the scrape config below):
>>
>>   - job_name: alertmanager
>>     scrape_interval: 1m
>>     static_configs:
>>       - targets: ['localhost:9093']
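>>
>> For example, a minimal alerting-rule sketch on top of that scrape job (the
>> alert name, threshold and windows here are illustrative assumptions, not
>> anything you already have configured):
>>
>>   # hypothetical rule name and thresholds, adjust to taste
>>   - alert: AlertmanagerNotificationsFailing
>>     expr: increase(alertmanager_notifications_failed_total[15m]) > 0
>>     for: 5m
>>     labels:
>>       severity: warning
>>     annotations:
>>       summary: 'Alertmanager failed to deliver notifications via {{ $labels.integration }}'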
>>
>> 4. If you want to deliver a particular alert to multiple destinations, a
>> much cleaner way of doing it is to use a subtree of routes to list multiple
>> destinations:
>>
>>   - match:
>>       severity: critical
>>     routes: [ {receiver: slack_general, continue: true}, {receiver: netmera_opsgenie} ]
>>
>> Then you don't have to duplicate your matching logic (in this case
>> "severity: critical"), and you don't get into confusion over when to use
>> "continue: true".
>>
>> OTOH, if you want *all* alerts to go to slack regardless, then just put a
>> catch-all route at the top:
>>
>> route:
>>   receiver: slack_general   # this is never used, because the first rule below always matches
>>   routes:
>>     - receiver: slack_general
>>       continue: true
>>     - match:
>>         severity: critical
>>       receiver: 'netmera_opsgenie'
>>     - match:
>>         severity: warning
>>       receiver: 'netmera_opsgenie'
>>
>>
>> 5. "match" and "match_re" are deprecated, better to start using the new
>> matchers syntax:
>>
>>   - matchers:
>>       - 'severity =~ "warning|critical"'
>>     receiver: 'netmera_opsgenie'
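>>
>> Putting suggestions 4 and 5 together, the whole routing section could end
>> up looking roughly like this (only a sketch combining the pieces above,
>> not a drop-in config):
>>
>> route:
>>   receiver: slack_general
>>   group_by: ['instance']
>>   routes:
>>     - receiver: slack_general
>>       continue: true
>>     - matchers:
>>         - 'severity =~ "warning|critical"'
>>       receiver: 'netmera_opsgenie'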
>>
>> On Monday, 17 January 2022 at 12:55:49 UTC Hasene Ceren Yıkılmaz wrote:
>>
>>> When I restart Alertmanager it runs fine, but I can't see any of these
>>> alerts in OpsGenie.
>>> I followed this doc:
>>> https://support.atlassian.com/opsgenie/docs/integrate-opsgenie-with-prometheus/
>>> and this doc:
>>> https://prometheus.io/docs/alerting/latest/configuration/#opsgenie_config
>>> Is there anything I should check about the Alertmanager & OpsGenie
>>> integration?
>>>
>>> This is my alertmanager.yml file:
>>>
>>> global:
>>>   resolve_timeout: 5m
>>>
>>> route:
>>>   receiver: slack_general
>>>   group_by: ['instance']
>>>   group_wait: 1s
>>>   group_interval: 1s
>>>   routes:
>>>     - match:
>>>         severity: critical
>>>       continue: true
>>>       receiver: slack_general
>>>     - match:
>>>         severity: warning
>>>       continue: true
>>>       receiver: slack_general
>>>     - match:
>>>         severity: critical
>>>       continue: true
>>>     - match:
>>>         severity: warning
>>>       continue: true
>>>     # added receivers for opsgenie
>>>     - match:
>>>         severity: critical
>>>       receiver: 'netmera_opsgenie'
>>>     - match:
>>>         severity: warning
>>>       receiver: 'netmera_opsgenie'
>>>
>>> receivers:
>>>   - name: slack_general
>>>     slack_configs:
>>>       - api_url: 'slack api url'
>>>         channel: '#netmera-prometheus'
>>>         send_resolved: true
>>>         title: "{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}"
>>>         text: "{{ range .Alerts }}{{ .Annotations.description }}\n{{ end }}"
>>>   # added opsgenie configs
>>>   - name: 'netmera_opsgenie'
>>>     opsgenie_configs:
>>>       - api_key: opsgenie api key
>>>         api_url: https://api.eu.opsgenie.com/
>>>         message: '{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}'
>>>         description: '{{ range .Alerts }}{{ .Annotations.description }}\n{{ end }}'
>>>         priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical"}}P2{{else}}P3{{end}}{{end}}'
>>>
>>> I contacted OpsGenie support and they checked the logs, but they couldn't
>>> see anything coming from Alertmanager.
>>> Could you please help me with that?
>>>
>>> Thank you!
>>>
>>