Re: [Nagios-users] High Availabilty with Nagios

2013-05-10 Thread Andreas Ericsson
On 2013-05-09 11:19, Steve Shipway wrote:
 Does anyone have an HA setup for Nagios that works?

 I'm thinking of creating a NEB module that will link two Nagios
 setups, and replicate over all status changes, config changes,
 downtime, comments, etc etc and then set the 'standby' Nagios to be
 checks/notifications disabled when in standby mode, and enabled when
 in active mode.  Then put the two behind a failover load balancer
 (F5, Foundry or apache reverse proxy).

 However this would be too much work if someone else has already found
 an equivalent solution.

 I've looked at Merlin but it doesn't seem to do what I'm after (and
 the documentation is practically nonexistent - much the same as the
 NEB API documentation, in fact).  Mod_gearman lets me have redundant
 checks and replicate *active* checks, but not commands, downtime or
 passive checks.


Merlin would do exactly that if you set up one of the nodes as a poller
with all hosts assigned to it. When the poller goes down, the master
takes over its checks by default.

Merlin is actually pretty well documented, but as text files that you
have to read the old-school way. If you find anything lacking in the
HOWTO document or the README, please let me know and I'll amend it.


 Does anyone out there have a workable way to get an active/standby or
 active/active Nagios setup?  Would be interested in hearing all
 ideas...


Well, we have about 800 of them.

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

--
Learn Graph Databases - Download FREE O'Reilly Book
Graph Databases is the definitive new guide to graph databases and 
their applications. This 200-page book is written by three acclaimed 
leaders in the field. The early access version is available now. 
Download your free book today! http://p.sf.net/sfu/neotech_d2d_may
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] High Availabilty with Nagios

2013-05-10 Thread Andreas Ericsson
On 2013-05-09 11:50, Supporto Tecnico - Crazy Network wrote:
 I would be interested too. I'm currently using merlind for this, but
 I would like to avoid, for example, double notifications if a server
 goes down. I do want both Nagios instances set to notify, since if
 one is down (for any reason) the other should still be able to check
 and notify, and vice versa.


Double notifications are a bug, unless you send passive check results to
both masters, in which case it's by design. People usually handle
passive checks by arranging a single target IP or hostname to send to,
then adding peered nodes at that tier as necessary, so that the
monitored machines don't have to send check results to multiple nodes.



Re: [Nagios-users] High Availabilty with Nagios

2013-05-09 Thread William Leibzon
On Thu, May 9, 2013 at 2:19 AM, Steve Shipway s.ship...@auckland.ac.nz wrote:
 Does anyone have an HA setup for Nagios that works?

 I'm thinking of creating a NEB module that will link two Nagios setups, and
 replicate over all status changes, config changes, downtime, comments, etc
 etc and then set the 'standby' Nagios to be checks/notifications disabled
 when in standby mode, and enabled when in active mode.  Then put the two
 behind a failover load balancer (F5, Foundry or apache reverse proxy).

I've thought several times of doing it but never actually got started,
although I have it all planned out, much like you.

In the meantime, my HA setup, which I've done for several customers,
involves config synced using git or svn (a script run by cron checks
whether there is something new and restarts Nagios if the config passes
its tests). Both servers run checks, but the config is such that on one
server all notifications are disabled except for cross-checking of the
other Nagios. This is achieved by having a common template from which
all services are derived; the template lives in a file specific to each
server, so one has notifications disabled and the other enabled.
This is not full HA, in that if one server dies you have to execute a
script that enables notifications on the other server (this can be
automated too, but I prefer to have people do it).
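
A minimal sketch of the cron-driven sync described above, in Python and
assuming git rather than svn. The paths, the `service nagios restart`
call and the function names are assumptions for illustration, not the
actual script; `nagios -v` is the stock config verification run.

```python
import subprocess

# Hypothetical paths; adjust to the local install.
CONFIG_DIR = "/usr/local/nagios/etc"
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
MAIN_CFG = "/usr/local/nagios/etc/nagios.cfg"


def run(cmd):
    """Run a command and return (exit_code, stripped stdout)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def sync_config(runner=run):
    """Pull config from git; restart Nagios only if it changed and verifies."""
    git = ["git", "-C", CONFIG_DIR]
    _, before = runner(git + ["rev-parse", "HEAD"])
    code, _ = runner(git + ["pull", "--ff-only"])
    if code != 0:
        return "pull-failed"
    _, after = runner(git + ["rev-parse", "HEAD"])
    if before == after:
        return "unchanged"
    # Verify the new config before touching the running daemon.
    code, _ = runner([NAGIOS_BIN, "-v", MAIN_CFG])
    if code != 0:
        return "invalid-config"
    # Assumed init-system command; substitute whatever the host uses.
    code, _ = runner(["service", "nagios", "restart"])
    return "restarted" if code == 0 else "restart-failed"
```

Calling `sync_config()` from a cron job on each server would give the
behaviour described: nothing happens when the repository is unchanged,
and a config that fails verification never triggers a restart. Note that
in this sketch the rejected config stays checked out on disk until a
fixed commit arrives.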

 However this would be too much work if someone else has already found an
 equivalent solution.

 I've looked at Merlin but it doesn't seem to do what I'm after (and the
 documentation is practically nonexistent - much the same as the NEB API
 documentation, in fact).  Mod_gearman lets me have redundant checks and
 replicate *active* checks, but not commands, downtime or passive checks.

 Does anyone out there have a workable way to get an active/standby or
 active/active Nagios setup?  Would be interested in hearing all ideas...

 Steve


 Steve Shipway
 University of Auckland ITS
 UNIX Systems Design Lead
 s.ship...@auckland.ac.nz
 Ph: +64 9 373 7599 ext 86487




Re: [Nagios-users] High Availabilty with Nagios

2013-05-09 Thread Edward St Pierre
Hi,

I have done this before using DRBD for block-based replication and
clustering on Red Hat. It could also be done with Pacemaker/Corosync
clusters.

Ed



Re: [Nagios-users] High Availabilty with Nagios

2013-05-09 Thread Andrew Widdersheim
I did a talk at last year's conference that touches on an HA Nagios setup
that uses DRBD and Pacemaker. There were also talks about mod_gearman and
Merlin that might be helpful. The slides (and maybe video?) are available
on nagios.org. Here is a link to my slides:

http://www.slideshare.net/nagiosinc/andrew-widdersheim-nagiosisdownbosswantstosee-you
 


Re: [Nagios-users] High Availabilty with Nagios

2013-05-09 Thread frank
While HA can be a great thing, I've always been of the opinion that a
monitoring setup needs to have as few moving parts as possible. The more
complexity in the monitor, the more chance you'll be chasing monitoring
issues rather than site issues. And everything you add on top of the
monitor also needs to be monitored. So somehow that F5 is going to need an
out-of-band monitor, because if it dies your Nagios host may well not
have a way to contact you about it unless you've dual-homed it, which
brings up a whole other set of issues.


The closest I got to HA at my last gig was creating a CNAME for the active
Nagios host, so in a failover you point the CNAME to the new box and at
least passive checks can still roll in (after the DNS timeout of course,
which I say is better than reconfiguring every NSCA client).
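
To make the point concrete, here is a rough Python sketch of an NSCA
client that only ever knows the alias. The alias name, the config path
and the function names are made up for illustration; the tab-separated
line is the input format send_nsca reads on stdin for service results,
and -H and -c are its host and config-file options.

```python
import subprocess

# Hypothetical alias: a CNAME that always points at the active Nagios host.
NAGIOS_ALIAS = "nagios.example.com"
SEND_NSCA_CFG = "/etc/send_nsca.cfg"  # assumed location


def nsca_line(host, service, code, output):
    """Format one passive service result as host, service, code, output."""
    return "\t".join([host, service, str(code), output]) + "\n"


def send_nsca_argv(target=NAGIOS_ALIAS, config=SEND_NSCA_CFG):
    """Build the send_nsca command line; only the alias names the server."""
    return ["send_nsca", "-H", target, "-c", config]


def submit(host, service, code, output):
    """Pipe one result to send_nsca (requires send_nsca on the client)."""
    line = nsca_line(host, service, code, output)
    subprocess.run(send_nsca_argv(), input=line, text=True, check=True)
```

Because the clients resolve the alias on every run, repointing the CNAME
is the only change needed at failover; nothing on the monitored machines
has to be edited.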


-f


Re: [Nagios-users] High Availabilty with Nagios

2013-05-09 Thread Jim Winkle
On 05/09/13, Steve Shipway  wrote:

 Does anyone have an HA setup for Nagios that works?
 
 I'm thinking of creating a NEB module that will link two Nagios setups, and 
 replicate over all status changes, config changes, downtime, comments, etc 
 etc and then set the 'standby' Nagios to be checks/notifications disabled 
 when in standby mode, and enabled when in active mode. Then put the two 
 behind a failover load balancer (F5, Foundry or apache reverse proxy).

We use rsync (run out of cron every minute) and a floating VIP between two 
hosts. Nagios is running on only one host at a time. It's a trivial (manual) 
process to switch between hosts.

Files which are synced: all Nagios files except logs and transient results. 
Files synced include Nagios configs, binaries and CGIs, helper apps, plugins, 
local plugins and NRPE configs, docs, HTML files, status files, all files in 
~nagios, and the crontab for user nagios. 
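
A rough Python sketch of what such a cron job might run. The peer host
name, install prefix and exclude patterns are assumptions chosen for
illustration, not the actual list in use; -a, --delete and --exclude are
standard rsync options.

```python
import subprocess

# Hypothetical values: standby host, install prefix, and exclude patterns
# standing in for "logs and transient results".
PEER = "nagios-standby.example.com"
NAGIOS_ROOT = "/usr/local/nagios/"
EXCLUDES = [
    "var/nagios.log",
    "var/archives/",
    "var/spool/checkresults/",
    "var/rw/",
]


def rsync_command(src=NAGIOS_ROOT, peer=PEER, excludes=EXCLUDES):
    """Build an rsync argv that mirrors src to the same path on peer."""
    cmd = ["rsync", "-a", "--delete"]
    for pattern in excludes:
        cmd.append("--exclude=" + pattern)
    # The trailing slash on src copies the directory contents, not the dir.
    cmd += [src, "%s:%s" % (peer, src)]
    return cmd


def main():
    """Entry point for cron on the host currently running Nagios."""
    subprocess.run(rsync_command(), check=True)
```

Running `main()` once a minute from the active host would approximate
the setup above; ~nagios and the crontab would need their own source
paths, which this sketch leaves out.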

-- Jim
