We have a problem that frequently pops up in our monitoring environment. Our monitoring server has adequate bandwidth, but we share the link with an occasional bandwidth hog. Sometimes they take the bulk of our available bandwidth, leaving Nagios unable to reach the systems it monitors. The result is an onslaught of hundreds or even thousands of pages to everyone, telling us that systems and services are down... one page for each host or service that is monitored! Sometimes more, if the bandwidth is pinched for a prolonged period of time. Legitimate pages are lost in the process.
Is there a way to put a threshold on Nagios if "Time Out" messages are received from a certain number of hosts or services? If 4 distinct sites are timing out at the same time it should stop all notifications for all hosts and services and send a single page indicating a bandwidth issue. Is there any facility in Nagios that even slightly resembles what I've described?
TIA!
-Scott
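
The rule described above can be sketched in a few lines of Python to make the intended behavior concrete. This is only an illustration of the decision logic, not a Nagios feature or API: `CheckResult`, `is_timeout`, and `plan_pages` are hypothetical names, the `site` label is assumed to be something we assign to each host or service ourselves, and recognizing a timeout by looking for substrings such as "timed out" or "timeout" in the plugin output is an assumption about what the checks return.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """One check outcome. All fields are hypothetical, for illustration only."""
    site: str      # label for the remote site the host/service lives at
    name: str      # host or service name
    output: str    # plugin output text
    ok: bool       # True if the check passed


# Assumption: timeouts can be recognized from the plugin output text.
TIMEOUT_MARKERS = ("timed out", "time out", "timeout")


def is_timeout(result: CheckResult) -> bool:
    """Return True if a failed check looks like a timeout."""
    text = result.output.lower()
    return not result.ok and any(marker in text for marker in TIMEOUT_MARKERS)


def plan_pages(results, site_threshold=4):
    """Decide which pages to send for one round of check results.

    If timeouts are seen at `site_threshold` or more distinct sites,
    suppress the individual pages and return a single bandwidth page.
    Otherwise return one page per failed check, as usual.
    """
    failing = [r for r in results if not r.ok]
    timed_out_sites = {r.site for r in failing if is_timeout(r)}

    if len(timed_out_sites) >= site_threshold:
        return [
            "Possible bandwidth issue: checks timing out at "
            f"{len(timed_out_sites)} distinct sites"
        ]
    return [f"{r.name} at {r.site}: {r.output}" for r in failing]
```

With the default threshold of 4, timeouts at four different sites would collapse into one page, while timeouts at three sites (or ordinary failures at any number of sites) would still page individually.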