We need to generate an alert, via Prometheus snmp_exporter metrics, when 
fewer than 80% of the nodes on our active BIG-IP F5 load balancer are up.  
I think we have the percentage of up hosts, but I am not sure how to 
ensure that we only alert on the active F5 load balancer node.  In 
snmp_exporter, each F5 node appears as a distinct value of the instance 
label.

Here are the two metrics in question.
host up metric: ltmPoolMemberMonitorState = 4
f5 node active metric: sysCmFailoverStatusId = 4

The query below counts the distinct ltmPoolMemberNodeName values that 
include "prod" and are up, and divides that by the total number of 
ltmPoolMemberNodeName values that include "prod".  We then appended the OR 
operator to return 0 when all hosts are down (i.e. 
ltmPoolMemberMonitorState is not 4), since the numerator is otherwise 
empty and the division returns nothing:

count(count by (ltmPoolMemberNodeName) (
  ltmPoolMemberMonitorState{ltmPoolMemberNodeName=~".*prod.*"} == 4
))
/
count(count by (ltmPoolMemberNodeName) (
  ltmPoolMemberMonitorState{ltmPoolMemberNodeName=~".*prod.*"}
))
OR on() vector(0)

Now we need to ensure that the calculation is derived only from the 
metrics of the active F5 node instance (i.e. the instance for which 
sysCmFailoverStatusId is equal to 4).  I tried with (instance) and on 
(instance) to keep the metrics on the same F5 node instance label, but 
haven't had any luck.  Any recommendations would be greatly appreciated.
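
A minimal, untested sketch of what such a restriction could look like. It 
assumes that sysCmFailoverStatusId and ltmPoolMemberMonitorState carry the 
same instance label value for a given F5 node, and that the 0.8 threshold 
is applied to the ratio (which ranges from 0 to 1).  The "and on (instance)" 
set operator keeps only the pool member series whose instance also has 
sysCmFailoverStatusId == 4:

```promql
(
  # Up "prod" pool members, restricted to the active F5 instance.
  count(count by (ltmPoolMemberNodeName) (
    (ltmPoolMemberMonitorState{ltmPoolMemberNodeName=~".*prod.*"} == 4)
    and on (instance) (sysCmFailoverStatusId == 4)
  ))
  /
  # All "prod" pool members, restricted to the active F5 instance.
  count(count by (ltmPoolMemberNodeName) (
    ltmPoolMemberMonitorState{ltmPoolMemberNodeName=~".*prod.*"}
    and on (instance) (sysCmFailoverStatusId == 4)
  ))
  or on() vector(0)
) < 0.8
```

The outer parentheses matter: comparison operators bind more tightly than 
"or", so without them the "< 0.8" would apply only to vector(0).  Note that 
under these assumptions the expression also evaluates to 0, and so fires, 
when no instance reports sysCmFailoverStatusId == 4.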
