[prometheus-users] Re: Alertmanager configuration: routes

Brian Candler Fri, 03 Sep 2021 01:13:30 -0700

Note that an "alertname" label is added automatically, so you could match 
on alertname="TargetDown" if you want.  That doesn't scale very well, since 
the routing tree then has to name individual alerts, but with a small 
number of rules it will get you started.
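
As a rough sketch of such a match (the receiver names "default" and 
"target-down" are made up for illustration, and each receiver would still 
need its own email_configs or similar), a route on the alert name might 
look like:

route:
  receiver: default
  routes:
  # hypothetical child route: only the TargetDown alert goes here
  - receiver: target-down
    matchers:
    - alertname="TargetDown"

receivers:
- name: default
- name: target-down

Anything that does not match the child route falls through to the 
top-level receiver.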


If you go to your prometheus web interface, at prometheus:9090, and click 
on the "Alerts" tab at the top, then you can see firing alerts, including 
all the labels on them.
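
For the TargetDown rule quoted further down, I would expect the label set 
shown there to look roughly like this (a sketch; the job and instance 
values are placeholders for whatever your targets have):

alertname: TargetDown
severity: warning
job: <job of the affected target>
instance: <instance of the affected target>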

[image: img1.png]

On Friday, 3 September 2021 at 09:09:56 UTC+1 Brian Candler wrote:

> The only labels you can match on from that rule are "severity: warning", 
> and the "job" and "instance" labels.
>
> > What must the alertmanager config be for this rule?
>
> You don't need *any* matching rules in alertmanager.  At simplest, you can 
> just have
>
> route:
>   receiver: default
>
> receivers:
> - name: default
>   email_configs:
>   - to: [email protected]
>     send_resolved: true
>   - to: [email protected]
>     send_resolved: true
>
> Any more than that, and it depends on your business requirements.  Do you 
> want all alerts with severity "warning" to be treated differently?  Use a 
> routing rule (in the "routes" section under "route").  Do you want a 
> certain subset of targets to be handled by a particular team? Then either 
> add a label in the alerting rules themselves, or ensure that those targets 
> already have a particular label in their scrape config, and match that 
> label in the "routes" section.
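>
> As a sketch of what that could look like (the "team" label, its value 
> "dba" and the receiver names are assumptions for illustration, not 
> something your current rules provide):
>
> route:
>   receiver: default
>   routes:
>   # alerts carrying the hypothetical team="dba" label go to that team
>   - receiver: dba-team
>     matchers:
>     - team="dba"
>   # everything else with severity "warning" gets its own receiver
>   - receiver: warnings
>     matchers:
>     - severity="warning"
>
> Routes are tried in order and the first match wins unless you set 
> "continue: true", so put the more specific route first.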
>
> On Friday, 3 September 2021 at 08:20:49 UTC+1 [email protected] wrote:
>
>> It's clear that the config
>> - service=~"mysql|cassandra"
>> does not match the rule.
>> This was just an example.
>>
>> But this question is still open:
>> What must the alertmanager config be for this rule?
>> groups:
>> - name: general.rules
>>   rules:
>>   - alert: TargetDown
>>     annotations:
>>       message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.instance
>>         }} instances are down.'
>>     expr: 100 * (count(up == 0) BY (job, instance) / count(up) BY (job,
>>       instance)) > 10
>>     for: 10m
>>     labels:
>>       severity: warning
>>
>> On Thursday, 2 September 2021 at 19:18:37 UTC+2 Brian Candler wrote:
>>
>>> Remove the match on service=~"mysql|cassandra" in your routing rule.
>>>
>>> I'm not saying with 100% certainty that your alert *doesn't* have a 
>>> service=xxx label; it's possible that it was added via other means, such as 
>>> external_labels or alert_relabel_configs.  If you go into the prometheus or 
>>> alertmanager web interface, you can see active alerts and their labels, so 
>>> you'll know what you have.
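>>>
>>> For reference, a sketch of where such a label could come from in 
>>> prometheus.yml (the "service: mysql" value is purely illustrative; 
>>> you would normally use only one of the two mechanisms):
>>>
>>> global:
>>>   external_labels:
>>>     service: mysql        # attached to every alert sent out
>>>
>>> alerting:
>>>   alert_relabel_configs:
>>>   # sets service="mysql" on every outgoing alert
>>>   - target_label: service
>>>     replacement: mysql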
>>>
>>> There was a nice web-based interface for testing alertmanager routing 
>>> trees here:
>>> https://prometheus.io/webtools/alerting/routing-tree-editor/
>>> but it doesn't seem to work properly any more.
>>>
>>> On Thursday, 2 September 2021 at 15:48:57 UTC+1 [email protected] wrote:
>>>
>>>> What should be the configuration in alertmanager.yml to match to the 
>>>> rule?
>>>>
>>>> On Thursday, 2 September 2021 at 15:22:55 UTC+2 Brian Candler wrote:
>>>>
>>>>> Correct, that expression will only give "job" and "instance" labels.
>>>>>
>>>>> I don't think your alertmanager rule will ever match on this alert.
>>>>>
>>>>> On Thursday, 2 September 2021 at 14:05:22 UTC+1 [email protected] 
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have defined several rule files, e.g. this general.rules.yml:
>>>>>> groups:
>>>>>> - name: general.rules
>>>>>>   rules:
>>>>>>   - alert: TargetDown
>>>>>>     annotations:
>>>>>>       message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.instance
>>>>>>         }} instances are down.'
>>>>>>     expr: 100 * (count(up == 0) BY (job, instance) / count(up) BY (job,
>>>>>>       instance)) > 10
>>>>>>     for: 10m
>>>>>>     labels:
>>>>>>       severity: warning
>>>>>>
>>>>>> However, I don't see the correlation to service.
>>>>>>
>>>>>> On Thursday, 2 September 2021 at 13:58:11 UTC+2 Brian Candler wrote:
>>>>>>
>>>>>>> It looks like "service" is a label that you have set in the 
>>>>>>> prometheus alerting rule.
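>>>>>>>
>>>>>>> For example, a rule along these lines would produce alerts with 
>>>>>>> service="mysql" (the rule name, expression and job name are 
>>>>>>> made up for illustration):
>>>>>>>
>>>>>>> groups:
>>>>>>> - name: example.rules
>>>>>>>   rules:
>>>>>>>   - alert: MysqlDown
>>>>>>>     expr: up{job="mysql"} == 0
>>>>>>>     for: 5m
>>>>>>>     labels:
>>>>>>>       severity: critical
>>>>>>>       service: mysql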
>>>>>>>
>>>>>>> On Thursday, 2 September 2021 at 11:52:20 UTC+1 [email protected] 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> can you please advise what is represented by a service in 
>>>>>>>> alertmanager configuration, e.g.
>>>>>>>> routes:
>>>>>>>> # All alerts with service=mysql or service=cassandra
>>>>>>>> # are dispatched to the database pager.
>>>>>>>> - receiver: 'database-pager'
>>>>>>>>   group_wait: 10s
>>>>>>>>   matchers:
>>>>>>>>   - service=~"mysql|cassandra"
>>>>>>>>
>>>>>>>> Where do I find the service in the rules or in Prometheus -> Alerts?
>>>>>>>>
>>>>>>>> THX
>>>>>>>>
>>>>>>>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/f0f3f01c-e33c-4e12-a83f-93c1ecdfa97fn%40googlegroups.com.
