On Sun, 4 Nov 2012, Mike Julian wrote:
Hey all,
I've been racking my brain on how to build a highly-available Nagios infrastructure. Our monitoring is critical to our business, and only having it on one system is certainly a point of failure.
The officially-accepted method for doing HA with Nagios is two instances running on the same configuration, both doing the checks, but with notifications turned off on the second one. One downside to this is that it generates double the traffic due to doing all the checks twice. The other downside, which is the more important one: if BoxA goes down and we turn on notifications for BoxB, then when BoxA comes back up, we have to work out some method to bring BoxA's performance data back to current, since there is a gap now.
Moving away from Nagios isn't a viable option at this point in time either. Any of you doing a highly-available Nagios environment?
You can have the active Nagios server forward its check results to the standby server rather than running every check twice.
When configuring an HA pair, I always try to make the boxes identical. When the failed box comes back up, I don't have it take over and become active automatically; it remains in standby mode.
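As a rough illustration of forwarding results, here is a minimal Python sketch. It assumes the transport is NSCA (the `send_nsca` client on the active box, the NSCA daemon on the standby), which is only one way to do this. The function names and the standby host name are hypothetical. On the active box, something like this would typically be hooked in through Nagios's `ocsp_command` with `obsess_over_services` enabled, and the standby would need to accept passive checks for the same services.

```python
import subprocess

def format_passive_result(host, service, return_code, output):
    """Build one tab-delimited line in the format send_nsca reads on stdin."""
    # An embedded newline would split the record, so flatten the plugin output.
    clean = output.replace("\n", " ").strip()
    return f"{host}\t{service}\t{return_code}\t{clean}\n"

def forward_result(line, standby_host, send_nsca="send_nsca"):
    """Pipe a formatted result to the standby (assumes send_nsca is installed)."""
    subprocess.run([send_nsca, "-H", standby_host], input=line.encode(), check=True)

if __name__ == "__main__":
    # Hypothetical host, service, and standby names, for illustration only.
    line = format_passive_result("web01", "HTTP", 0, "OK - 200")
    print(repr(line))
    # forward_result(line, "nagios-standby")  # uncomment on a real active box
```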
In your case, the box in standby mode would start gathering new stats from that point. I would expect that by the time you need to actually generate alerts, it has a long enough baseline.
If you are talking about long-term archived performance stats, rsync the data between the two boxes.
Have the standby box not gather the data; instead, have the active box periodically rsync the archive to the standby box. When a box boots, have it check whether the other box is active and, if so, rsync the data back the other way.
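The direction logic can be sketched roughly as follows. This is only an illustration: the archive path, the peer name, and the way a box learns whether its peer is active are all assumptions, and the functions only build the rsync argument list rather than running it.

```python
def rsync_command(archive_dir, peer, push):
    """Return the rsync argv that copies the archive to (push) or from (pull) the peer."""
    # A trailing slash makes rsync copy the directory contents, not the directory itself.
    local = archive_dir.rstrip("/") + "/"
    remote = f"{peer}:{local}"
    src, dst = (local, remote) if push else (remote, local)
    return ["rsync", "-a", src, dst]

def boot_sync_command(archive_dir, peer, peer_is_active):
    """On boot, pull the archive back only if the peer is currently the active box."""
    if peer_is_active:
        return rsync_command(archive_dir, peer, push=False)
    return None  # nothing to pull; the peer holds no newer data

if __name__ == "__main__":
    # Hypothetical path and peer name.
    print(rsync_command("/var/lib/nagios/perfdata", "boxB", push=True))
    print(boot_sync_command("/var/lib/nagios/perfdata", "boxA", peer_is_active=True))
```

The periodic push would run from cron (or similar) on the active box only; the pull runs once at boot on the box that is rejoining as standby.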
When a box becomes active, turn on the writing of the archive, along with enabling alerts.
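One way to flip those switches at promotion time is through the Nagios external command file. The sketch below assumes the archive is fed by Nagios's own performance-data processing, so that `ENABLE_PERFORMANCE_DATA` is the right toggle; if a separate collector writes the archive, that part would differ. The command file path is installation-specific and hypothetical here.

```python
import time

def promotion_commands(now=None):
    """Return the external-command lines that turn on perf data and notifications."""
    ts = int(now if now is not None else time.time())
    commands = ("ENABLE_PERFORMANCE_DATA", "ENABLE_NOTIFICATIONS")
    return [f"[{ts}] {cmd}\n" for cmd in commands]

def promote(command_file):
    """Write the promotion commands to the Nagios external command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.writelines(promotion_commands())

if __name__ == "__main__":
    print(promotion_commands(now=1352000000))
    # promote("/usr/local/nagios/var/rw/nagios.cmd")  # path is an assumption
```

Demoting a box would send the matching `DISABLE_PERFORMANCE_DATA` and `DISABLE_NOTIFICATIONS` commands the same way.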
David Lang
_______________________________________________ Discuss mailing list [email protected] https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss This list provided by the League of Professional System Administrators http://lopsa.org/
