On Tuesday, 19 February 2019 at 09:58:40 UTC [email protected] wrote:
> I'm new to Prometheus and Alertmanager. I'm trying to find a way to
> set up Alertmanager to suppress (inhibit) alerts in a network tree
> structure at any level. Something like:
>
> - root switch
>   - host 1
>   - host 2
>   - level 1 switch 1
>     - host 3
>     - host 4
>     - level 2 switch
>       - host 5
>   - level 1 switch 2
>     - host 6
>
> If the root switch fails, I want to receive a notification about the root
> switch only (no other host or switch).
> If level 1 switch 1 fails, I want to receive a notification about that
> switch only (and none for hosts 3-5 or the level 2 switch),
> and so on.
>
> What is the best way? I was thinking about using some prefix form in a
> "net" label, e.g.
>   net: root
>   net: root_host1
>   net: root_lev2sw
>   net: root_lev2sw_host5
> but I found no way to use a source label in a target match. I do not want
> to write a static inhibit rule for every switch node.
>
I think you're on the right lines.
Since the inhibit rules can do nothing more sophisticated than "equal"
matching, I would go with multiple labels to represent levels 1/2/3 etc of
the hierarchy. The slightly tricky part is to determine the difference
between parent and child (remembering that one node can be both).
This is what I came up with:
up{instance="coresw1",level1="coresw1"}
up{instance="host1",level1="coresw1",level2="host1"}
up{instance="host2",level1="coresw1",level2="host2"}
up{instance="l1sw1",level1="coresw1",level2="l1sw1"}
up{instance="host3",level1="coresw1",level2="l1sw1",level3="host3"}
up{instance="host4",level1="coresw1",level2="l1sw1",level3="host4"}
up{instance="l2sw1",level1="coresw1",level2="l1sw1",level3="l2sw1"}
up{instance="host5",level1="coresw1",level2="l1sw1",level3="l2sw1",level4="host5"}
up{instance="l1sw2",level1="coresw1",level2="l1sw2"}
up{instance="host6",level1="coresw1",level2="l1sw2",level3="host6"}
The rule is simply that the deepest (highest-numbered) "level" label is
equal to the "instance" label, and the "depth" in the tree is equal to the
number of "level" labels. Every node also carries its ancestors' names in
the shallower "level" labels.
Then inhibit rules something like this:
inhibit_rules:
  - source_matchers:
      - level1=~".+"
      - level2=""
    target_matchers:
      - level2=~".+"
    equal: ['level1']
  - source_matchers:
      - level1=~".+"
      - level2=~".+"
      - level3=""
    target_matchers:
      - level3=~".+"
    equal: ['level1','level2']
  - source_matchers:
      - level1=~".+"
      - level2=~".+"
      - level3=~".+"
      - level4=""
    target_matchers:
      - level4=~".+"
    equal: ['level1','level2','level3']
... etc
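Since every rule follows the same pattern, the list could be generated rather than typed. Here is a rough Python sketch of that idea. The function names build_inhibit_rules and to_yaml are made up for illustration, and the output is only as correct as the hand-written rules above, so treat it as equally untested against a real Alertmanager. Matcher values are written with double quotes.

```python
def build_inhibit_rules(max_depth):
    """Return one inhibit rule per parent depth 1..max_depth-1."""
    rules = []
    for depth in range(1, max_depth):
        present = [f"level{i}" for i in range(1, depth + 1)]
        child = f"level{depth + 1}"
        rules.append({
            # Source: a node at exactly this depth (next level label is empty).
            "source_matchers": [f'{name}=~".+"' for name in present] + [f'{child}=""'],
            # Target: anything deeper in the tree.
            "target_matchers": [f'{child}=~".+"'],
            # Both must share the same ancestors down to this depth.
            "equal": present,
        })
    return rules


def to_yaml(rules):
    """Render the rules as a YAML fragment without needing a YAML library."""
    lines = ["inhibit_rules:"]
    for rule in rules:
        lines.append("  - source_matchers:")
        lines += [f"      - {m}" for m in rule["source_matchers"]]
        lines.append("    target_matchers:")
        lines += [f"      - {m}" for m in rule["target_matchers"]]
        equal = ", ".join(f"'{name}'" for name in rule["equal"])
        lines.append(f"    equal: [{equal}]")
    return "\n".join(lines)


if __name__ == "__main__":
    # A tree with at most 4 levels needs 3 rules.
    print(to_yaml(build_inhibit_rules(4)))
```

With max_depth=4 this should produce the three rules written out above. A deeper tree only needs a larger number.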
This means that:
* An alert with level1="foo" (but no level2, i.e. it's at depth 1 in the
tree) will suppress any alert for something with depth>1 and level1="foo"
* An alert with level1="foo",level2="bar" (but no level3, i.e. it's at
depth 2 in the tree) will suppress any alert for something with depth>2,
level1="foo" and level2="bar"
* etc
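To sanity-check that reasoning without a running Alertmanager, the matching logic can be modelled in a few lines of Python. This is a simplified model of my understanding of inhibition (a missing label is treated as the empty string, and regexes are fully anchored), not Alertmanager's actual code. The helper names (node, rule_for, notified) are invented for this sketch.

```python
import re


def node(*path):
    """Build the label set for a node from its path, root first."""
    labels = {"instance": path[-1]}
    labels.update({f"level{i}": name for i, name in enumerate(path, start=1)})
    return labels


def matches(labels, matchers):
    """matchers is a list of (name, op, value); op is '=' or '=~'."""
    for name, op, value in matchers:
        actual = labels.get(name, "")  # missing label behaves like ""
        if op == "=" and actual != value:
            return False
        if op == "=~" and re.fullmatch(value, actual) is None:
            return False
    return True


def rule_for(depth):
    """Model of the inhibit rule whose source sits at the given depth."""
    levels = [f"level{i}" for i in range(1, depth + 1)]
    child = f"level{depth + 1}"
    return {
        "source": [(n, "=~", ".+") for n in levels] + [(child, "=", "")],
        "target": [(child, "=~", ".+")],
        "equal": levels,
    }


def notified(firing, rules):
    """Return the instances whose alerts are not inhibited by another firing alert."""
    kept = []
    for target in firing:
        inhibited = any(
            matches(source, r["source"])
            and matches(target, r["target"])
            and all(source.get(n, "") == target.get(n, "") for n in r["equal"])
            for r in rules
            for source in firing
        )
        if not inhibited:
            kept.append(target["instance"])
    return kept


RULES = [rule_for(d) for d in (1, 2, 3)]
TREE = [
    node("coresw1"),
    node("coresw1", "host1"),
    node("coresw1", "host2"),
    node("coresw1", "l1sw1"),
    node("coresw1", "l1sw1", "host3"),
    node("coresw1", "l1sw1", "host4"),
    node("coresw1", "l1sw1", "l2sw1"),
    node("coresw1", "l1sw1", "l2sw1", "host5"),
    node("coresw1", "l1sw2"),
    node("coresw1", "l1sw2", "host6"),
]

if __name__ == "__main__":
    # Root switch down, so everything below it is firing too.
    print(notified(TREE, RULES))
    # Only the l1sw1 subtree is firing.
    print(notified([n for n in TREE if n.get("level2") == "l1sw1"], RULES))
```

In this model, a failure of the whole tree leaves only coresw1 notifying, and a failure of the l1sw1 subtree leaves only l1sw1. Two unrelated hosts failing at the same time do not suppress each other, because their "level2" values differ.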
Untested, but you get the idea. Let me know if something like this works
for you.
Generating those labels by hand is tedious, but you could write a script
which reads in a set of targets with "instance" and "parent" attributes,
and rewrites them to depth/level1/level2 etc.