If you have shared HA storage, you can put the data there so that whichever server is active is also the one logging. The non-active server(s) then only need to run one check, against the active server, with an event action to take over as the active server should that check fail. You can keep the host/service configs on shared storage too, or manage them through your configuration engine.

On Nov 4, 2012 1:32 PM, <[email protected]> wrote:
> On Sun, 4 Nov 2012, Mike Julian wrote:
>
>> Hey all,
>>
>> I've been racking my brain on how to build a highly-available Nagios
>> infrastructure. Our monitoring is critical to our business, and only
>> having it on one system is certainly a point of failure.
>>
>> The officially-accepted method for doing HA with Nagios is two instances
>> running on the same configuration, both doing the checks, but with
>> notifications turned off on the second one. One downside to this is that
>> it generates double the traffic due to doing all the checks twice.
>>
>> The other downside, which is the more important one: if BoxA goes down and
>> we turn on notifications for BoxB, then when BoxA comes back up, we have
>> to work out some method to bring BoxA's performance data back to current,
>> since there is a gap now.
>>
>> Moving away from Nagios isn't a viable option at this point in time
>> either.
>>
>> Any of you doing a highly-available Nagios environment?
>>
>
> you can have the active nagios server forward the results of the checks to
> the standby server rather than doing all the checks twice.
>
> When configuring a HA pair, I always try to make the boxes identical and
> then when the failed box comes back up, I don't have it take over and
> become active automatically, it remains in standby mode.
>
> In your case, the box in standby mode would start gathering new stats from
> that point. I would expect that by the time you need to actually generate
> alerts, it has a long enough baseline.
>
> If you are talking about long-term archived performance stats, rsync the
> data between the two boxes.
>
> Have the standby box not gather the data, but instead have the active box
> periodically rsync the archive to the standby box. when a box boots, have
> it check if the other box is active and rsync the data back the other way.
>
> When a box becomes active, turn on the writing of the archive, along with
> enabling alerts.
>
> David Lang
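
To make the failover and rsync steps above a bit more concrete, here is a rough Python sketch. It is not anything either of us is running. It assumes the standby's event handler is passed the $SERVICESTATE$ and $SERVICESTATETYPE$ macros for its check of the active box, and that promotion is done by writing standard Nagios external commands to the command file. The archive path and peer hostname are made-up placeholders.

```python
import time

# Hypothetical values for illustration only; adjust to the local install.
ARCHIVE_DIR = "/var/lib/nagios/perfdata/"  # assumed archive location
PEER = "nagios-b.example.com"              # assumed hostname of the other box


def promotion_commands(state, state_type, now=None):
    """Return the external-command lines a standby would write to promote itself.

    state and state_type are meant to be the $SERVICESTATE$ and
    $SERVICESTATETYPE$ macros from the standby's check of the active box.
    """
    if state != "CRITICAL" or state_type != "HARD":
        return []  # soft or non-critical result: remain in standby
    ts = int(now if now is not None else time.time())
    return [
        f"[{ts}] START_EXECUTING_HOST_CHECKS",
        f"[{ts}] START_EXECUTING_SVC_CHECKS",
        f"[{ts}] ENABLE_NOTIFICATIONS",
    ]


def archive_sync_argv(local_is_active, peer=PEER, archive_dir=ARCHIVE_DIR):
    """Build an rsync argv in which the active box is always the source."""
    remote = f"{peer}:{archive_dir}"
    if local_is_active:
        return ["rsync", "-a", archive_dir, remote]  # active pushes to standby
    return ["rsync", "-a", remote, archive_dir]      # booting box pulls from active
```

The event handler would append each returned line to the Nagios command file, and a cron job or boot script would hand the argv to subprocess.run. Keeping both as pure functions means the decision logic can be tested without touching a live Nagios instance.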
_______________________________________________
Discuss mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/
