Hey all, I've been racking my brain on how to build a highly available Nagios infrastructure. Our monitoring is critical to our business, and running it on only one system makes that system a single point of failure.
The officially accepted method for doing HA with Nagios is two instances running the same configuration, both doing the checks, but with notifications turned off on the second one. One downside is that it generates double the check traffic, since every check runs twice. The other downside, which is the more important one: if BoxA goes down and we turn on notifications for BoxB, then when BoxA comes back up, we have to work out some method to bring BoxA's performance data back to current, since it now has a gap covering the outage. Moving away from Nagios isn't a viable option at this point either. Are any of you running a highly available Nagios environment?
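
For reference, here is a rough sketch of the "flip notifications on BoxB" step described above. It is only an illustration, not a tested failover script. It assumes BoxB decides for itself whether BoxA is up (how it decides is left out), and that the Nagios external command file lives at /usr/local/nagios/var/rw/nagios.cmd, which is a common default but should be checked against command_file in nagios.cfg. ENABLE_NOTIFICATIONS and DISABLE_NOTIFICATIONS are standard Nagios external commands; the function names are made up for this example.

```python
import time
from typing import Optional

# Assumed default location; confirm against command_file in your nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_external_command(command: str, now: Optional[int] = None) -> str:
    """Build one line in the external command format: [epoch] COMMAND."""
    ts = int(time.time()) if now is None else now
    return f"[{ts}] {command}\n"


def failover_command(primary_up: bool, standby_notifying: bool) -> Optional[str]:
    """Pick the command BoxB should send to itself, or None if nothing changes."""
    if not primary_up and not standby_notifying:
        return "ENABLE_NOTIFICATIONS"   # BoxA is down: BoxB takes over alerting
    if primary_up and standby_notifying:
        return "DISABLE_NOTIFICATIONS"  # BoxA is back: BoxB goes quiet again
    return None


def send_command(command: str, path: str = COMMAND_FILE) -> None:
    """Write the command to the local Nagios command file (a named pipe)."""
    with open(path, "w") as fifo:
        fifo.write(format_external_command(command))
```

Something like this could run from cron or an event handler on BoxB: call failover_command() with the current state and pass any non-None result to send_command(). It does nothing about the performance data gap on BoxA; that part still needs a separate answer.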
_______________________________________________ Discuss mailing list [email protected] https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss This list provided by the League of Professional System Administrators http://lopsa.org/
