I'm monitoring lots of different services on various hosts with Nagios. Many of these checks are very useful to me as the sysadmin, but when they alert, they don't represent downtime for our users; they represent problems I should fix proactively before they result in user-noticeable downtime. For example, if Nagios notices a fan failure on one of my ProCurve switches, I as the admin want to know about it, and will probably replace the failing fan during the next scheduled maintenance window. The switch is still running just fine, however, and there is no effect on service to users. Currently, when I get such a notification, I acknowledge the problem, and it stays in a critical state until I've fixed it.
What I'd like to create is a more end-user-targeted display of Nagios data. It would show OK or Alert status based only on whether a particular service is up or down from the user's perspective, and wouldn't show any of the proactive, nice-for-the-sysadmin-to-know details. So in the case of the ProCurve switch, as long as the fan failure hasn't made the entire switch crash (we can still ping it), it would remain in an OK state. The only way I can think of to accomplish this is to set up a second installation of Nagios. It would duplicate much of the configuration, though many of the services would be left out. I think that would produce the end-user display I'm imagining, but at the expense of maintaining two sets of configuration files, and the server would have to run duplicate checks for lots of the services and hosts. Can anyone think of a better way to accomplish this that wouldn't involve duplicating checks?
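To make the idea concrete, here is a rough sketch of the behaviour I'm after. It is not tied to any Nagios API: the host and service names, the `USER_FACING` set, and the `user_view` function are all made up for illustration, and treating every non-OK state as "Alert" is just an assumption on my part.

```python
# Hypothetical illustration only: none of these names come from Nagios itself.
OK, WARNING, CRITICAL = 0, 1, 2

# Checks that reflect what users actually experience (assumed names).
USER_FACING = {("switch1", "PING")}


def user_view(results):
    """Collapse full check results into OK/Alert per host, ignoring admin-only checks."""
    view = {}
    for (host, service), state in results.items():
        if (host, service) not in USER_FACING:
            continue  # e.g. fan status: useful to the admin, invisible to users
        status = "OK" if state == OK else "Alert"  # assumption: any non-OK state is an Alert
        # One failing user-facing check is enough to mark the host as Alert.
        if view.get(host) != "Alert":
            view[host] = status
    return view


# The fan has failed, but the switch still answers ping.
results = {
    ("switch1", "PING"): OK,
    ("switch1", "Fan status"): CRITICAL,
}
print(user_view(results))  # prints {'switch1': 'OK'}
```

The point is that the same set of check results feeds both views; only the filter differs, so nothing would need to be checked twice.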
