Thanks Brian, I am in the midst of setting up a slack receiver (to weed out 
the alerts going to the wrong channel). One thing I have noticed is that the 
alerts being routed incorrectly may actually come down to this rule:

- alert: High_Cpu_Load
  expr: 100 - (avg by(instance,cluster) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 95
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Host high CPU load (instance {{ $labels.instance }})
    description: "CPU load is > 95%\n INSTANCE = {{ $labels.instance }}\n VALUE = %{{ $value | humanize }}\n LABELS = {{ $labels }}"

I believe the issue may be that I'm not including 'env' in the expression's 
by() clause, so the label gets dropped from the resulting alerts. Just a 
hunch, but I appreciate you pointing me in the right direction!
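
If that turns out to be the cause, I think adding 'env' to the aggregation 
would carry the label through to the alert, something along these lines 
(untested):

  expr: 100 - (avg by(instance,cluster,env) 
  (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 95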

On Monday, August 22, 2022 at 3:06:47 PM UTC-4 Brian Candler wrote:

> "Looks correct but still doesn't work how I expect"
>
> What you've shown is a target configuration, not an alert arriving at 
> alertmanager.
>
> Therefore, I'm suggesting you take a divide-and-conquer approach.  First, 
> work out which of your receiver routing rules is being triggered (is it the 
> 'production' receiver, or is it the 'slack' receiver?) by making them 
> different.  This will point to which routing rule is or isn't being 
> triggered.  And then you can work out why.
>
> There are all sorts of reasons it might not work, other than the config 
> you've shown.  For example, if you have any target rewriting or metric 
> rewriting rules which set the env; if the exporter itself sets "env" and 
> you have honor_labels set; and so on.
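>
> For example, a scrape job along these lines (the job name and file path are 
> just placeholders) would let an "env" label exposed by the exporter override 
> the one coming from your file_sd targets:
>
>   - job_name: 'node'
>     honor_labels: true
>     file_sd_configs:
>       - files: ['targets/*.json']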
>
> Hence the first part is to find out from real alert events: is the alert 
> being generated without the "dev" label? In that case alert routing is just 
> fine, and you need to work out why that label is wrong (and you're looking 
> at the prometheus side). Or is the alert actually arriving at alertmanager 
> with the "dev" label, in which case you're looking at the alertmanager side 
> to find out why it's not being routed as expected.
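>
> If you want to see what the alert looks like on the Prometheus side, the 
> built-in ALERTS series shows the label set of each pending/firing alert, 
> e.g.
>
>   ALERTS{alertstate="firing"}
>
> and you can check there whether "env" is present; the Alerts page of the 
> Alertmanager UI shows the labels that actually arrived on the other side.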
>
> On Monday, 22 August 2022 at 18:45:25 UTC+1 rs wrote:
>
>> I checked the json file and the tagging was correct. Here's an example:
>>
>>
>>    {
>>        "labels": {
>>            "cluster": "X Stage Servers",
>>            "env": "dev"
>>        },
>>        "targets": [
>>            "x:9100",
>>            "y:9100",
>>            "z:9100"
>>        ]
>>    },
>> This is being sent to the production/default channel.
>>
>> On Friday, August 12, 2022 at 11:29:34 AM UTC-4 Brian Candler wrote:
>>
>>> Firstly, I'd drop the "continue: true" lines. They are not required, and 
>>> are just going to cause confusion.
>>>
>>> The 'slack' and 'production' receivers are both sending to 
>>> #prod-channel.  So you'll hit this if the env is not exactly "dev".  I 
>>> suggest you look in detail at the alerts themselves: maybe they're tagging 
>>> with "Dev" or "dev " (with a hidden space).
>>>
>>> If you change the default 'slack' receiver to go to a different channel, 
>>> or use a different title/text template, it will be easier to see if this is 
>>> the problem or not.
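>>>
>>> Something along these lines for the fallback receiver would make it 
>>> obvious when the default route is the one being hit (the channel name is 
>>> just an example):
>>>
>>>   - name: 'slack'
>>>     slack_configs:
>>>     - api_url: 'api url'
>>>       channel: '#alertmanager-fallback'
>>>       send_resolved: true
>>>       title: 'FALLBACK: {{ template "slack.title" . }}'
>>>       text: '{{ template "slack.text" . }}'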
>>>
>>>
>>> On Friday, 12 August 2022 at 09:36:22 UTC+1 rs wrote:
>>>
>>>> Hi everyone! I am configuring alertmanager to send alerts to a prod 
>>>> slack channel and a dev slack channel. I have checked with the routing 
>>>> tree editor and everything should be working correctly. 
>>>> However, I am seeing some (not all) alerts that are tagged with 'env: 
>>>> dev' being sent to the prod slack channel. Is there some sort of old 
>>>> configuration caching happening? Is there a way to flush this out?
>>>>
>>>> --- Alertmanager.yml ---
>>>> global:
>>>>   http_config:
>>>>     proxy_url: 'xyz'
>>>> templates:
>>>>   - templates/*.tmpl
>>>> route:
>>>>   group_by: [cluster,alertname]
>>>>   group_wait: 10s
>>>>   group_interval: 30m
>>>>   repeat_interval: 24h
>>>>   receiver: 'slack'
>>>>   routes:
>>>>   - receiver: 'production'
>>>>     match:
>>>>       env: 'prod'
>>>>     continue: true
>>>>   - receiver: 'staging'
>>>>     match:
>>>>       env: 'dev'
>>>>     continue: true
>>>> receivers:
>>>> # Fallback option - default routes to the production channel
>>>> - name: 'slack'
>>>>   slack_configs:
>>>>   - api_url: 'api url'
>>>>     channel: '#prod-channel'
>>>>     send_resolved: true
>>>>     color: '{{ template "slack.color" . }}'
>>>>     title: '{{ template "slack.title" . }}'
>>>>     text: '{{ template "slack.text" . }}'
>>>>     actions:
>>>>       - type: button
>>>>         text: 'Query :mag:'
>>>>         url: '{{ (index .Alerts 0).GeneratorURL }}'
>>>>       - type: button
>>>>         text: 'Silence :no_bell:'
>>>>         url: '{{ template "__alert_silence_link" . }}'
>>>>       - type: button
>>>>         text: 'Dashboard :grafana:'
>>>>         url: '{{ (index .Alerts 0).Annotations.dashboard }}'
>>>> - name: 'staging'
>>>>   slack_configs:
>>>>   - api_url: 'api url'
>>>>     channel: '#staging-channel'
>>>>     send_resolved: true
>>>>     color: '{{ template "slack.color" . }}'
>>>>     title: '{{ template "slack.title" . }}'
>>>>     text: '{{ template "slack.text" . }}'
>>>>     actions:
>>>>       - type: button
>>>>         text: 'Query :mag:'
>>>>         url: '{{ (index .Alerts 0).GeneratorURL }}'
>>>>       - type: button
>>>>         text: 'Silence :no_bell:'
>>>>         url: '{{ template "__alert_silence_link" . }}'
>>>>       - type: button
>>>>         text: 'Dashboard :grafana:'
>>>>         url: '{{ (index .Alerts 0).Annotations.dashboard }}'
>>>> - name: 'production'
>>>>   slack_configs:
>>>>   - api_url: 'api url'
>>>>     channel: '#prod-channel'
>>>>     send_resolved: true
>>>>     color: '{{ template "slack.color" . }}'
>>>>     title: '{{ template "slack.title" . }}'
>>>>     text: '{{ template "slack.text" . }}'
>>>>     actions:
>>>>       - type: button
>>>>         text: 'Query :mag:'
>>>>         url: '{{ (index .Alerts 0).GeneratorURL }}'
>>>>       - type: button
>>>>         text: 'Silence :no_bell:'
>>>>         url: '{{ template "__alert_silence_link" . }}'
>>>>       - type: button
>>>>         text: 'Dashboard :grafana:'
>>>>         url: '{{ (index .Alerts 0).Annotations.dashboard }}'
>>>>
>>>
