Now this actually works as advertised ;-)

Here are some patches for mon, Client.pm and mon.cgi that add support
for severity levels for alerts. The basic idea is to be able to map the
exit value of a monitor, or a range of exit values, to an arbitrary
severity level, which can then be used by clients such as mon.cgi. The
default severity value for a failure is 1, which is the highest level. In
mon.cgi, I've defined severity levels 1, 2 and 3 for failures of
decreasing severity, and 9 for protocol or monitor failures that don't
indicate a service failure.

This is useful at sites like mine that have relatively unskilled operators
who need to know at a glance how serious a problem is.

So now you can do something like this for monitors that return a range of
exit values (e.g. my hacked version of netsnmp-freespace.monitor):

# Do nothing, but this will show up as yellow in mon.cgi
alert exit=70-80 severity=3 do_nothing.alert

# This shows up as amber in mon.cgi
alert exit=80-90 severity=2 mail.alert  _SYSADMIN_EMAIL_

# This shows up as red in mon.cgi
alert exit=90-100 severity=1 qpage.alert  _ONCALL_PAGER_

# Monitor error; this shows up as purple in mon.cgi
alert exit=2 severity=9 mail.alert _MONADMIN_EMAIL_
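
As a rough illustration of the mapping those four lines express, here is a
small Python sketch. It is not the patch code (mon itself is Perl); the
names AlertRule, RULES and severity_for_exit are made up for the example.
It assumes the exit ranges are inclusive and that, where ranges overlap
(exit 80 or 90 above), the most severe match wins. Both of those are
assumptions, not documented mon behavior.

```python
# Hypothetical sketch of the exit-value -> severity mapping; not mon's code.
from typing import List, NamedTuple, Optional

DEFAULT_SEVERITY = 1  # per the post: a failure defaults to 1, the highest


class AlertRule(NamedTuple):
    low: int       # lowest exit value matched (assumed inclusive)
    high: int      # highest exit value matched (assumed inclusive)
    severity: int


# Mirrors the four alert lines above.
RULES: List[AlertRule] = [
    AlertRule(70, 80, 3),
    AlertRule(80, 90, 2),
    AlertRule(90, 100, 1),
    AlertRule(2, 2, 9),
]

# Colors as described for mon.cgi in the post.
SEVERITY_COLORS = {1: "red", 2: "amber", 3: "yellow", 9: "purple"}


def severity_for_exit(exit_value: int,
                      rules: List[AlertRule] = RULES) -> Optional[int]:
    """Return the severity for a monitor exit value, or None on success."""
    if exit_value == 0:
        return None  # exit 0 means the monitor succeeded
    matches = [r.severity for r in rules if r.low <= exit_value <= r.high]
    # Assumed tie-break: on overlapping ranges take the most severe match,
    # i.e. the smallest severity number.
    return min(matches) if matches else DEFAULT_SEVERITY
```

With these rules, an exit value of 75 maps to severity 3 (yellow), 85 to
severity 2 (amber), and a failure that matches no rule falls back to the
default severity of 1.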

You can also prioritize failures using severity levels; for example,
production failures can be severity 1 and development failures can be
severity 2 (or 3, or whatever), and now in mon.cgi these will show up as
different colors regardless of whether alerts have been sent.

There is one problem I haven't resolved, which is how to deal properly
with upalerts. Currently, an upalert is sent whenever a service goes from
any failure condition to success, so there is no way to control who
receives upalerts based on which failure occurred. This is really an issue
with the exit code handling rather than with the severity levels, which
are just a convenient mapping of the exit codes for clients, but it is a
nuisance.


(See attached file: Client.pm.diff)(See attached file: mon.cgi.diff)(See
attached file: mon.diff)
