If you have shared HA storage, you can put the data there so that whichever
server is active is also the one logging. Then the non-active server(s)
don't need to run any checks except against the active server, with an
event handler to become the active server should that check fail. You can
put the host/service configs on shared storage too, or manage them through
your configuration engine.
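
A rough sketch of such an event handler, not a tested setup: the standby
checks the active server, and Nagios calls the handler with the
$SERVICESTATE$ and $SERVICESTATETYPE$ macros. The command file path and the
function names are assumptions for illustration; ENABLE_NOTIFICATIONS is a
standard Nagios external command.

```python
import time

# Assumed path; use the command_file value from your own nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"

# External commands sent on takeover. Add whatever re-enables the full
# check set in your setup; that depends on how the standby disabled it.
TAKEOVER_COMMANDS = ("ENABLE_NOTIFICATIONS",)


def should_take_over(state, state_type):
    """Take over only once the check of the active server is a confirmed failure."""
    return state == "CRITICAL" and state_type == "HARD"


def takeover_lines(now):
    """Format each external command as '[unix_time] COMMAND' plus a newline."""
    return ["[%d] %s\n" % (now, command) for command in TAKEOVER_COMMANDS]


def handle_event(state, state_type, command_file=COMMAND_FILE):
    """Write the takeover commands to the Nagios command file if needed."""
    if not should_take_over(state, state_type):
        return False
    with open(command_file, "w") as pipe:
        pipe.writelines(takeover_lines(int(time.time())))
    return True
```

Waiting for a HARD state means a single missed check does not trigger a
failover.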
On Nov 4, 2012 1:32 PM, <[email protected]> wrote:

> On Sun, 4 Nov 2012, Mike Julian wrote:
>
>> Hey all,
>>
>> I've been racking my brain on how to build a highly-available Nagios
>> infrastructure. Our monitoring is critical to our business, and only
>> having it on one system is certainly a point of failure.
>>
>> The officially-accepted method for doing HA with Nagios is two instances
>> running on the same configuration, both doing the checks, but with
>> notifications turned off on the second one. One downside to this is that
>> it generates double the traffic due to doing all the checks twice.
>>
>> The other downside, which is the more important one: if BoxA goes down
>> and we turn on notifications for BoxB, then when BoxA comes back up, we
>> have to work out some method to bring BoxA's performance data back to
>> current, since there is a gap now.
>>
>> Moving away from Nagios isn't a viable option at this point in time
>> either.
>>
>> Any of you doing a highly-available Nagios environment?
>>
>
> You can have the active Nagios server forward the results of the checks to
> the standby server rather than doing all the checks twice.
>
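
One common way to forward results is for the active server to pipe each one
to send_nsca, so the standby receives it as a passive check. That is an
assumption here; the thread does not name a mechanism. The sketch below
only builds the tab-separated line send_nsca reads for a service check, and
the function name is hypothetical.

```python
VALID_RETURN_CODES = (0, 1, 2, 3)  # OK, WARNING, CRITICAL, UNKNOWN


def nsca_service_line(host, service, return_code, output):
    """Build one send_nsca input line: host, service, code, output."""
    if return_code not in VALID_RETURN_CODES:
        raise ValueError("return_code must be 0, 1, 2 or 3")
    # Tabs and newlines are field and record separators, so flatten them.
    clean_output = output.replace("\t", " ").replace("\n", " ")
    return "%s\t%s\t%d\t%s\n" % (host, service, return_code, clean_output)
```

For example, nsca_service_line("web01", "HTTP", 2, "timeout") gives the
line for a CRITICAL result on a hypothetical host web01.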
> When configuring an HA pair, I always try to make the boxes identical.
> When the failed box comes back up, I don't have it take over and become
> active automatically; it remains in standby mode.
>
> In your case, the box in standby mode would start gathering new stats from
> that point. I would expect that by the time you need to actually generate
> alerts, it has a long enough baseline.
>
> If you are talking about long-term archived performance stats, rsync the
> data between the two boxes.
>
> Have the standby box not gather the data, but instead have the active box
> periodically rsync the archive to the standby box. When a box boots, have
> it check if the other box is active and rsync the data back the other way.
>
> When a box becomes active, turn on the writing of the archive, along with
> enabling alerts.
>
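
The rsync directions described above could be sketched like this. The
function names are hypothetical and the caller supplies the peer hostname
and archive directory; only rsync's standard -a (archive) flag is used.

```python
import subprocess


def rsync_args(direction, peer, archive_dir):
    """Build the rsync command for pushing to or pulling from the peer."""
    # A trailing slash makes rsync copy the directory's contents.
    local = archive_dir.rstrip("/") + "/"
    remote = "%s:%s" % (peer, local)
    if direction == "push":  # active box, run periodically
        return ["rsync", "-a", local, remote]
    if direction == "pull":  # freshly booted box catching up
        return ["rsync", "-a", remote, local]
    raise ValueError("direction must be 'push' or 'pull'")


def boot_direction(peer_is_active):
    """On boot, pull from the peer only if it is currently active."""
    return "pull" if peer_is_active else None


def sync(direction, peer, archive_dir):
    """Run the transfer and raise if rsync exits with an error."""
    subprocess.run(rsync_args(direction, peer, archive_dir), check=True)
```

The active box would call sync("push", ...) from cron, and a booting box
would call sync("pull", ...) only when boot_direction() returns "pull".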
> David Lang
> _______________________________________________
> Discuss mailing list
> [email protected]
> https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
> This list provided by the League of Professional System Administrators
>  http://lopsa.org/
>
