On Tuesday, 19 February 2019 at 09:58:40 UTC [email protected] wrote:

>   I'm new to prometheus and alertmanager. I'm trying to find a way how
> to setup alertmanager to suppress (inhibit) alerts in network tree 
> structure
> at any level. Something like
>
>    - root switch
>       - host 1
>       - host 2
>       - level 1 switch 1
>       - host 3
>          - host 4
>          - level 2 switch
>             - host 5
>          - level 1 switch 2
>          - host 6
>       
> I want to receive only notification about root switch if it fails (no 
> other host/switch).
> I want to receive only notification about level 1 switch 1 (and no host 
> 3-5 or level 2 switch).
> and so on.
>
> What is the best way? I was thinking about using some prefix form in label
> net (e.g.
> net: root
> net: root_host1,
> net: root_lev2sw
> net: root_lev2sw_host5,
> but I find no way how to use source label in target match. I do not want 
> to write
> static inhibit rule for every switch node.
>

I think you're on the right lines.

Since an inhibit rule can only relate source and target alerts through its 
"equal" list (same label name, same value on both sides), I would go with 
multiple labels to represent levels 1/2/3 etc of the hierarchy. The slightly 
tricky part is to tell parent from child (remembering that one node can be 
both).

This is what I came up with:

up{instance="coresw1",level1="coresw1"}
up{instance="host1",level1="coresw1",level2="host1"}
up{instance="host2",level1="coresw1",level2="host2"}
up{instance="l1sw1",level1="coresw1",level2="l1sw1"}
up{instance="host3",level1="coresw1",level2="l1sw1",level3="host3"}
up{instance="host4",level1="coresw1",level2="l1sw1",level3="host4"}
up{instance="l2sw1",level1="coresw1",level2="l1sw1",level3="l2sw1"}
up{instance="host5",level1="coresw1",level2="l1sw1",level3="l2sw1",level4="host5"}
up{instance="l1sw2",level1="coresw1",level2="l1sw2"}
up{instance="host6",level1="coresw1",level2="l1sw2",level3="host6"}

The rule is simply that the deepest (highest-numbered) "level" label is equal 
to the "instance" label, and the "depth" in the tree is equal to the number 
of "level" labels.

Then inhibit rules something like this:

inhibit_rules:
  - source_matchers:
      - level1=~'.+'
      - level2=''
    target_matchers:
      - level2=~'.+'
    equal: ['level1']
  - source_matchers:
      - level1=~'.+'
      - level2=~'.+'
      - level3=''
    target_matchers:
      - level3=~'.+'
    equal: ['level1','level2']
  - source_matchers:
      - level1=~'.+'
      - level2=~'.+'
      - level3=~'.+'
      - level4=''
    target_matchers:
      - level4=~'.+'
    equal: ['level1','level2','level3']
... etc
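
The "etc" follows a fixed pattern, so the rules could be generated rather 
than typed out. A minimal sketch in Python, assuming the level1..levelN 
naming above; the function names inhibit_rules and to_yaml are made up for 
this example, and the YAML is printed by hand to avoid needing a YAML library:

```python
def inhibit_rules(max_depth):
    """Build one rule per parent depth d, for d = 1 .. max_depth - 1."""
    rules = []
    for d in range(1, max_depth):
        levels = [f"level{i}" for i in range(1, d + 1)]
        rules.append({
            # Source: an alert exactly at depth d
            # (level1..level<d> set, level<d+1> absent or empty).
            "source_matchers": [f"{name}=~'.+'" for name in levels]
                               + [f"level{d + 1}=''"],
            # Target: any alert deeper than d.
            "target_matchers": [f"level{d + 1}=~'.+'"],
            # Both must sit under the same ancestors down to depth d.
            "equal": levels,
        })
    return rules


def to_yaml(rules):
    """Render the rules in the same layout as the hand-written config."""
    lines = ["inhibit_rules:"]
    for rule in rules:
        lines.append("  - source_matchers:")
        lines += [f"      - {m}" for m in rule["source_matchers"]]
        lines.append("    target_matchers:")
        lines += [f"      - {m}" for m in rule["target_matchers"]]
        equal = ",".join(f"'{name}'" for name in rule["equal"])
        lines.append(f"    equal: [{equal}]")
    return "\n".join(lines)


if __name__ == "__main__":
    # max_depth=4 reproduces the three rules written out above.
    print(to_yaml(inhibit_rules(4)))
```

Raising max_depth adds one rule per extra level of the tree.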

This relies on Alertmanager treating a missing label the same as an empty 
one, so level2='' matches an alert that has no level2 label at all. It means 
that:
* An alert with level1="foo" (but no level2, i.e. it's at depth 1 in the 
tree) will suppress any alert for something with depth>1 and level1="foo"
* An alert with level1="foo",level2="bar" (but no level3, i.e. it's at 
depth 2 in the tree) will suppress any alert for something with depth>2, 
level1="foo" and level2="bar"
* etc
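
To sanity-check that logic without a running Alertmanager, here is a rough 
Python model of how the three rules would be applied. It is a simplified 
sketch, not Alertmanager's actual code: it only handles the "=" and "=~" 
operators, treats regexes as fully anchored, and treats a missing label as 
empty. The names RULES, matches, is_inhibited and notified are invented for 
the example.

```python
import re

# The three rules from the message, as (label, operator, value) matchers.
RULES = [
    {"source": [("level1", "=~", ".+"), ("level2", "=", "")],
     "target": [("level2", "=~", ".+")],
     "equal": ["level1"]},
    {"source": [("level1", "=~", ".+"), ("level2", "=~", ".+"),
                ("level3", "=", "")],
     "target": [("level3", "=~", ".+")],
     "equal": ["level1", "level2"]},
    {"source": [("level1", "=~", ".+"), ("level2", "=~", ".+"),
                ("level3", "=~", ".+"), ("level4", "=", "")],
     "target": [("level4", "=~", ".+")],
     "equal": ["level1", "level2", "level3"]},
]


def matches(labels, matchers):
    """True if every matcher holds; a missing label counts as ''."""
    for name, op, value in matchers:
        actual = labels.get(name, "")
        if op == "=" and actual != value:
            return False
        if op == "=~" and re.fullmatch(value, actual) is None:
            return False
    return True


def is_inhibited(target, firing, rules=RULES):
    """True if some other firing alert suppresses `target` under a rule."""
    for rule in rules:
        if not matches(target, rule["target"]):
            continue
        for source in firing:
            if source is target or not matches(source, rule["source"]):
                continue
            if all(source.get(name, "") == target.get(name, "")
                   for name in rule["equal"]):
                return True
    return False


def notified(firing):
    """Instances whose alerts would still produce a notification."""
    return [a["instance"] for a in firing if not is_inhibited(a, firing)]


core = {"instance": "coresw1", "level1": "coresw1"}
l1sw1 = {"instance": "l1sw1", "level1": "coresw1", "level2": "l1sw1"}
host1 = {"instance": "host1", "level1": "coresw1", "level2": "host1"}
host5 = {"instance": "host5", "level1": "coresw1", "level2": "l1sw1",
         "level3": "l2sw1", "level4": "host5"}

print(notified([core, l1sw1, host5]))   # ['coresw1']
print(notified([l1sw1, host1, host5]))  # ['l1sw1', 'host1']
```

In the second case host1 still notifies because it is a sibling of l1sw1, 
not a descendant, while host5 is suppressed by its ancestor l1sw1.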

Untested, but you get the idea.  Let me know if something like this works 
for you.

Generating those labels by hand is tedious, but you could write a script 
which reads in a set of targets with "instance" and "parent" attributes, 
and rewrites them to depth/level1/level2 etc.
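
A minimal sketch of such a script in Python, assuming the input is a simple 
mapping of instance to parent (None for the root) and that every parent is 
itself a key in that mapping. The function name level_labels and the input 
format are made up for the example; how the result gets into your target 
definitions is left open.

```python
def level_labels(parents):
    """Map each instance to its instance/level1..levelN labels.

    `parents` maps instance -> parent instance, with None for the root.
    """
    result = {}
    for instance in parents:
        path = []
        seen = set()
        node = instance
        # Walk up to the root, collecting the chain of ancestors.
        while node is not None:
            if node in seen:
                raise ValueError(f"cycle in parent chain at {node!r}")
            seen.add(node)
            path.append(node)
            node = parents[node]
        path.reverse()  # root first, the instance itself last
        labels = {"instance": instance}
        for depth, name in enumerate(path, start=1):
            labels[f"level{depth}"] = name
        result[instance] = labels
    return result


# The example tree from this message.
parents = {
    "coresw1": None,
    "host1": "coresw1",
    "host2": "coresw1",
    "l1sw1": "coresw1",
    "host3": "l1sw1",
    "host4": "l1sw1",
    "l2sw1": "l1sw1",
    "host5": "l2sw1",
    "l1sw2": "coresw1",
    "host6": "l1sw2",
}

for labels in level_labels(parents).values():
    print(labels)
```

The last level label always comes out equal to the instance, and the number 
of level labels is the depth, which is what the inhibit rules rely on.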
