Re: [Nagios-users] distributed monitoring - slave server not that intelligent

Andreas Ericsson Fri, 15 Feb 2008 00:53:12 -0800

mark redding wrote:

Hi all,


I currently have Nagios 2.10 installed on a couple of machines, one of
which is configured as a master and the other as a slave.

I have a script running on the slave which rsync's up the configs from
the master and performs health checks of the master to see that it is
running (and if it is not then it enables service checks/notifications
on the slave until such time as it detects that the master is back up
and running). I also use nsca to pass passive checks to the slave to
ensure that it has up to date information about services. The slave
does not perform any active service checks, nor are notifications
enabled unless the master is down.
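
For illustration only, the failover toggle described above could be sketched like this. It is not the poster's actual script. It assumes the health check has already produced a boolean, and that the resulting lines are written to the slave's external command file (the path varies by install and is not shown). The function name `failover_commands` is hypothetical; the four command names are standard Nagios external commands.

```python
import time


def failover_commands(master_up, now=None):
    """Build the external-command lines the slave should submit.

    master_up: result of whatever health check the slave runs (assumed).
    now: optional Unix timestamp, mainly so the output is reproducible.
    """
    ts = int(time.time()) if now is None else int(now)
    if master_up:
        # Master is healthy: slave goes back to passive-only, silent mode.
        names = ["STOP_EXECUTING_SVC_CHECKS", "DISABLE_NOTIFICATIONS"]
    else:
        # Master is down: slave takes over active checks and notifications.
        names = ["START_EXECUTING_SVC_CHECKS", "ENABLE_NOTIFICATIONS"]
    # Nagios expects one "[timestamp] COMMAND" line per command.
    return ["[%d] %s\n" % (ts, name) for name in names]
```

Each returned line would be appended to the command file as-is, one write per line.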

I do however still have one problem and that is that the slave has no
way of knowing when we've ack'ed a critical, scheduled downtime, or
disabled/enabled notifications/event handlers/checks for a service/host
on the master. What this means is that if we schedule downtime on a
host, then the master goes down, the slave starts bitching about the
host that is down (because it does not know that it's in downtime). A
similar problem occurs if we disable an event handler on the master,
because unless the slave also knows to disable the event handler it
will fire it (regardless of whether or not it is active) as soon as
the passive check result returns a critical.

At present I am getting round this by tailing the nagios log file
through a perl script that looks for specific 'EXTERNAL COMMAND'
entries and then flushes those through to the slave by ssh'ing to the
slave and writing the command string to the nagios pipe file on the
slave.
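
As a rough sketch of the log-tailing approach just described (in Python rather than the Perl the poster used, and not their actual code): a master log entry of the form `[timestamp] EXTERNAL COMMAND: NAME;args` is turned back into the `[timestamp] NAME;args` line that the command file accepts. The whitelist `FORWARDED` and the function name `to_command_line` are assumptions made for the example; the ssh transport is left out.

```python
import re

# Commands worth mirroring to the slave. This whitelist is an assumption;
# extend it with whatever commands matter in your setup.
FORWARDED = {
    "ACKNOWLEDGE_HOST_PROBLEM",
    "ACKNOWLEDGE_SVC_PROBLEM",
    "SCHEDULE_HOST_DOWNTIME",
    "SCHEDULE_SVC_DOWNTIME",
    "ENABLE_SVC_NOTIFICATIONS",
    "DISABLE_SVC_NOTIFICATIONS",
    "ENABLE_SVC_EVENT_HANDLER",
    "DISABLE_SVC_EVENT_HANDLER",
}

# Matches "[1203065592] EXTERNAL COMMAND: NAME;arg1;arg2..."
LOG_RE = re.compile(r"^\[(\d+)\] EXTERNAL COMMAND: ([A-Z_]+)(;.*)?$")


def to_command_line(log_line):
    """Return a command-file line for a log entry, or None to skip it."""
    m = LOG_RE.match(log_line.strip())
    if m is None:
        return None  # not an external-command entry
    timestamp, name, args = m.group(1), m.group(2), m.group(3) or ""
    if name not in FORWARDED:
        return None  # e.g. passive check results already arrive via nsca
    return "[%s] %s%s\n" % (timestamp, name, args)
```

Filtering on a whitelist keeps passive check results, which already reach the slave through nsca, from being submitted twice.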

Is there a better way of doing this ?


You might get lucky using the attached NEB-module. It's not well
documented, and it's not very well tested. It will do what you're
after though. Contact me off-list if you run into problems. I've
been looking for someone to test this for quite some time now, so
I'll be happy to help.

It's written to make the two servers loadbalanced, so the slave
and the master will help each other out doing checks and then
transmit them to one another. External commands are also copied
from one to the other, so scheduled/cancelled downtime etc will
instantly show up on both servers as soon as it's parsed in one.

If you don't want the host/service check syncing you'll have to
either get clever with the config or manually hack that out of
the module.

Like I said, feel free to contact me off-list if you're having
any problems with it.

--
Andreas Ericsson                   [EMAIL PROTECTED]
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

mrm-0.1.tar.gz
Description: GNU Zip compressed data

