[prometheus-users] Re: Alertmanager configuration: routes

Thomas Schneider Fri, 03 Sep 2021 01:26:40 -0700

Does this mean that "alert" in the Prometheus rules config is the same as 
"service" in the Alertmanager config?


Brian Candler wrote on Friday, 3 September 2021 at 10:13:24 UTC+2:

> Note that an "alertname" label is added automatically, so you could match 
> on alertname="TargetDown" if you want.  Doesn't scale very well, but with a 
> small number of rules that approach will get you started.
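>
> A minimal sketch of such a route, assuming the "matchers" syntax of 
> Alertmanager 0.22 or later; the receiver name "target-down" is 
> hypothetical:
>
> ```yaml
> route:
>   receiver: default
>   routes:
>   # Alerts fired by the "alert: TargetDown" rule carry alertname="TargetDown"
>   - receiver: target-down
>     matchers:
>     - alertname="TargetDown"
>
> receivers:
> - name: default
> - name: target-down   # hypothetical; add email_configs etc. here
> ```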
>
> If you go to your prometheus web interface, at prometheus:9090, and click 
> on the "Alerts" tab at the top, then you can see firing alerts, including 
> all the labels on them.
>
> [image: img1.png]
>
> On Friday, 3 September 2021 at 09:09:56 UTC+1 Brian Candler wrote:
>
>> The only labels you can match on from that rule are "severity: warning", 
>> and the "job" and "instance" labels.
>>
>> > What must the alertmanager config be for this rule?
>>
>> You don't need *any* matching rules in alertmanager.  At simplest, you 
>> can just have
>>
>> route:
>>   receiver: default
>>
>> receivers:
>> - name: default
>>   email_configs:
>>   - to: [email protected]
>>     send_resolved: true
>>   - to: [email protected]
>>     send_resolved: true
>>
>> Any more than that, and it depends on your business requirements.  Do you 
>> want all alerts with severity "warning" to be treated differently?  Use a 
>> routing rule (in the "routes" section under "route").  Do you want a 
>> certain subset of targets to be handled by a particular team? Then either 
>> add a label in the alerting rules themselves, or ensure that those targets 
>> already have a particular label in their scrape config, and match that 
>> label in the "routes" section.
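>>
>> As a rough sketch of both cases, assuming a hypothetical "team" label and 
>> hypothetical receiver names "warnings" and "database-team" (each would 
>> also need an entry under "receivers"):
>>
>> ```yaml
>> route:
>>   receiver: default          # fallback when no child route matches
>>   routes:
>>   # All alerts with severity "warning", such as TargetDown above
>>   - receiver: warnings
>>     matchers:
>>     - severity="warning"
>>   # Alerts whose rule or scrape config sets team=database (hypothetical label)
>>   - receiver: database-team
>>     matchers:
>>     - team="database"
>> ```
>>
>> Sibling routes are tried in order and the first match wins unless that 
>> route sets "continue: true", so in this sketch a warning alert that also 
>> carries team=database goes to "warnings" only.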
>>
>> On Friday, 3 September 2021 at 08:20:49 UTC+1 [email protected] wrote:
>>
>>> It's clear that the config
>>> - service=~"mysql|cassandra"
>>> does not match the rule.
>>> This was just an example.
>>>
>>> But this question is still open:
>>> What must the alertmanager config be for this rule?
>>> groups:
>>> - name: general.rules
>>>   rules:
>>>   - alert: TargetDown
>>>     annotations:
>>>       message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.instance
>>>         }} instances are down.'
>>>     expr: 100 * (count(up == 0) BY (job, instance) / count(up) BY (job,
>>>       instance)) > 10
>>>     for: 10m
>>>     labels:
>>>       severity: warning
>>>
>>> Brian Candler wrote on Thursday, 2 September 2021 at 19:18:37 UTC+2:
>>>
>>>> Remove the match on service=~"mysql|cassandra" in your routing rule.
>>>>
>>>> I'm not saying with 100% certainty that your alert *doesn't* have a 
>>>> service=xxx label; it's possible that it was added via other means, such 
>>>> as external_labels or alert_relabel_configs.  If you go into the 
>>>> prometheus or alertmanager web interface, you can see active alerts and 
>>>> their labels, so you'll know what you have.
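>>>>
>>>> For illustration only, this is roughly how a "service" label could be 
>>>> attached in prometheus.yml by either of those means. The label values 
>>>> and the job-name pattern are assumptions, not taken from your setup:
>>>>
>>>> ```yaml
>>>> global:
>>>>   external_labels:
>>>>     service: mysql          # added to every alert this server sends
>>>>
>>>> alerting:
>>>>   alert_relabel_configs:
>>>>   # Set service=mysql on alerts whose job label starts with "mysql"
>>>>   - source_labels: [job]
>>>>     regex: 'mysql.*'
>>>>     target_label: service
>>>>     replacement: mysql
>>>> ```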
>>>>
>>>> There was a nice web-based interface for testing alerting rules here:
>>>> https://prometheus.io/webtools/alerting/routing-tree-editor/
>>>> but it doesn't seem to work properly any more.
>>>>
>>>> On Thursday, 2 September 2021 at 15:48:57 UTC+1 [email protected] 
>>>> wrote:
>>>>
>>>>> What should be the configuration in alertmanager.yml to match to the 
>>>>> rule?
>>>>>
>>>>> Brian Candler wrote on Thursday, 2 September 2021 at 15:22:55 UTC+2:
>>>>>
>>>>>> Correct, that expression will only give "job" and "instance" labels.
>>>>>>
>>>>>> I don't think your alertmanager rule will ever match on this alert.
>>>>>>
>>>>>> On Thursday, 2 September 2021 at 14:05:22 UTC+1 [email protected] 
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have defined several rule files, e.g. this general.rules.yml:
>>>>>>> groups:
>>>>>>> - name: general.rules
>>>>>>>   rules:
>>>>>>>   - alert: TargetDown
>>>>>>>     annotations:
>>>>>>>       message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.instance
>>>>>>>         }} instances are down.'
>>>>>>>     expr: 100 * (count(up == 0) BY (job, instance) / count(up) BY (job,
>>>>>>>       instance)) > 10
>>>>>>>     for: 10m
>>>>>>>     labels:
>>>>>>>       severity: warning
>>>>>>>
>>>>>>> However, I don't see the correlation to service.
>>>>>>>
>>>>>>> Brian Candler wrote on Thursday, 2 September 2021 at 13:58:11 UTC+2:
>>>>>>>
>>>>>>>> It looks like "service" is a label that you have set in the 
>>>>>>>> prometheus alerting rule.
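>>>>>>>>
>>>>>>>> A sketch of what such a rule might look like; the alert name, 
>>>>>>>> expression and label values are hypothetical and only show where a 
>>>>>>>> "service" label would be set:
>>>>>>>>
>>>>>>>> ```yaml
>>>>>>>> groups:
>>>>>>>> - name: example.rules
>>>>>>>>   rules:
>>>>>>>>   - alert: MysqlDown            # hypothetical alert
>>>>>>>>     expr: mysql_up == 0         # assumes mysqld_exporter metrics
>>>>>>>>     for: 5m
>>>>>>>>     labels:
>>>>>>>>       severity: critical
>>>>>>>>       service: mysql            # matched by service=~"mysql|cassandra"
>>>>>>>> ```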
>>>>>>>>
>>>>>>>> On Thursday, 2 September 2021 at 11:52:20 UTC+1 [email protected] 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> can you please advise what is represented by a service in 
>>>>>>>>> alertmanager configuration, e.g.
>>>>>>>>> routes:
>>>>>>>>> # All alerts with service=mysql or service=cassandra
>>>>>>>>> # are dispatched to the database pager.
>>>>>>>>> - receiver: 'database-pager'
>>>>>>>>>   group_wait: 10s
>>>>>>>>>   matchers:
>>>>>>>>>   - service=~"mysql|cassandra"
>>>>>>>>>
>>>>>>>>> Where do I find the service in the rules or in Prometheus -> 
>>>>>>>>> Alerts?
>>>>>>>>>
>>>>>>>>> THX
>>>>>>>>>
>>>>>>>>

