RE: avoiding/stopping multiple alerts when upgrading a server pool

Cook, Nicholas Wed, 29 Oct 2003 13:01:47 -0800

> > One trouble in creating one group per host/service is the sheer number of
> > groups you end up with.
> 
> It is not difficult to generate a mon.cf file automatically
> with a list of hosts to be split, no?


Actually, I meant from a display point of view.  I use mon.cgi, and
with about 100 groups currently, if I split them out, mon.cgi would be
trying to display over 1000 groups.  This would not be very user friendly.

> 
> > If you specify 'alertafter 2 30m', service b should not
> > alert after one failure just because service a failed one time 15 minutes
> > ago.
> 
> But service b does not, since services are completely independent
> with their alerts. Did I misunderstand your remark?
> 
> > Because of these, I would have to agree with the original poster that
> > failures should be tracked at the service/host level, and not the group
> > level.
> 
> alertafter 2 30m
> 1round) A server f1 fails 1 time => no alert
> 2round) A server f2 fails 1 time => alert
> 
> In that case you'd prefer no alert, is that it?

I misstated it a little above.  Just to clarify:

From mon.cf
------------------------------------------------------------------------
hostgroup mail_hosts host1 host2 host3

watch mail_hosts
        service sendmail
                description "sendmail monitor"
                interval 5m
                monitor smtp.monitor
                        period wd {Sun-Sat}
                                alert mail.alert [EMAIL PROTECTED]
                                alertafter 2 60m
                                alertevery 30m
                                upalert mail.alert [EMAIL PROTECTED]
------------------------------------------------------------------------

With this setup, if host1 fails once and host2 fails once 15 minutes later,
an alert is currently sent for host2, even though host2 has only failed
once.  I know that I can split it up so that each host is in its own
hostgroup, or I could change smtp.monitor so that it takes a host argument
and have separate services for each host (e.g. monitor smtp.monitor -h
host1).  This would keep everything under hostgroup mail_hosts, even though
the hostgroup members specified would be ignored, but separate each host as
its own service.  Neither of these is a good solution, however.  A better
solution would be to have mon keep track of failures not by
hostgroup/service (i.e. mail_hosts/sendmail) but one level further down,
by hostgroup/service/host (i.e. mail_hosts/sendmail/host1).
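
To make the difference concrete, here is a rough Python sketch of the two
ways of counting failures for 'alertafter 2 60m'.  It is only a simplified
model for illustration, not mon's actual code: the function name alerts(),
its per_host parameter and the event list are all made up for this example.

```python
from collections import defaultdict

def alerts(failures, threshold=2, window=60, per_host=False):
    """Return the (minute, host) failures that would trigger an alert.

    failures: list of (minute, host) tuples in time order.
    threshold and window loosely model 'alertafter 2 60m'.
    per_host=False keeps one failure history for the whole service;
    per_host=True (hypothetical) keeps one history per host.
    """
    history = defaultdict(list)
    fired = []
    for minute, host in failures:
        # Assumed tracking key: the whole hostgroup/service, or one host.
        key = host if per_host else "mail_hosts/sendmail"
        # Keep only failures still inside the window, then add this one.
        recent = [t for t in history[key] if minute - t < window]
        recent.append(minute)
        history[key] = recent
        if len(recent) >= threshold:
            fired.append((minute, host))
    return fired

# host1 fails once, then host2 fails once 15 minutes later.
events = [(0, "host1"), (15, "host2")]
print(alerts(events))                 # tracked per hostgroup/service
print(alerts(events, per_host=True))  # tracked per hostgroup/service/host
```

With a single shared history the second failure reaches the threshold and
host2 is alerted on; with one history per host, neither host has failed
twice in the window, so nothing is sent.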

This would work much better in my particular environment.  I could see this
change potentially being an issue for somebody else, however.  For this
reason, adding the capability through a flag on the service line might be a
better approach (e.g. service -i sendmail, with i for independent).  This
would keep backwards compatibility for those who need or want the
traditional behavior, while adding the ability to track failures on an
independent, per-host basis for those who prefer that.

Nicholas Cook 
_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon
