On May 27, 2008, at 10:02 AM, Germán Gutiérrez wrote:
> I think I'm not the only one with this issue, but I couldn't find any
> documented solution.
>
> We have a group of servers, sometimes, for a common reason, a service
> goes down almost simultaneously and we get around 30 alerts about the
> same thing.
>
> Any thoughts? Links? Clues? RTFM?

The simplest thing seems to be to monitor the thing that's breaking and
use service dependencies to make the services above dependent on the
newly monitored service.

If you can't monitor that thing, it's a bit more complicated. You want
to receive notifications for each service as usual unless some threshold
count of failures is reached. check_cluster could be useful here: make
all the services above dependent on a cluster service check. If you set
the check_cluster threshold to, say, 5, I'd expect that you'd receive at
most 5(ish) notifications (4 per-service notifications + 1 for
check_cluster itself).

--
Marc

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
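
For illustration only, here is a minimal, untested sketch of how the
check_cluster suggestion in the reply might look as Nagios 3 object
definitions. It is not taken from the thread. The host names (web01 to
web03, monitor01), the service descriptions ("HTTP", "HTTP Cluster"),
the command name, the generic-service template and the threshold values
are all assumptions; adjust them to your own setup and confirm the
option syntax with check_cluster --help for your plugin version.

```
# All names below are hypothetical placeholders.
# Command wrapper around the check_cluster plugin ($USER1$ is the
# plugin directory macro normally set in resource.cfg).
define command{
        command_name    check_service_cluster
        command_line    $USER1$/check_cluster --service -l $ARG1$ -w $ARG2$ -c $ARG3$ -d $ARG4$
        }

# Cluster service: aggregates the state of the individual services via
# on-demand macros. Only three members are shown; in practice list one
# $SERVICESTATEID:host:service$ entry per member (about 30 in the case
# described above), otherwise the thresholds here can never be reached.
# Assumption: thresholds use the plugin's range syntax, so "-c 4" goes
# CRITICAL once more than 4 members are non-OK.
define service{
        use                     generic-service
        host_name               monitor01
        service_description     HTTP Cluster
        check_command           check_service_cluster!"HTTP Cluster"!2!4!$SERVICESTATEID:web01:HTTP$,$SERVICESTATEID:web02:HTTP$,$SERVICESTATEID:web03:HTTP$
        }

# Dependency: suppress notifications for the per-host services while
# the cluster service is CRITICAL.
define servicedependency{
        host_name                       monitor01
        service_description             HTTP Cluster
        dependent_host_name             web01,web02,web03
        dependent_service_description   HTTP
        notification_failure_criteria   c
        }
```

With this layout the first few failures still notify individually. Once
the cluster check turns CRITICAL, the remaining per-service
notifications should be held back and only the cluster service alerts,
which is roughly the "5(ish)" behaviour described in the reply.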