Re: [Nagios-users] High Availabilty with Nagios
On 2013-05-09 11:19, Steve Shipway wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios setups, and replicate over all status changes, config changes, downtime, comments, etc. and then set the 'standby' Nagios to have checks/notifications disabled when in standby mode, and enabled when in active mode. Then put the two behind a failover load balancer (F5, Foundry or Apache reverse proxy).
>
> However this would be too much work if someone else has already found an equivalent solution. I've looked at Merlin but it doesn't seem to do what I'm after (and the documentation is practically nonexistent - much the same as the NEB API documentation, in fact). Mod_gearman lets me have redundant checks and replicate *active* checks, but not commands, downtime or passive checks.

Merlin would do exactly that if you set up one of the nodes as a poller with all hosts assigned to it. When the poller goes down, the master will by default take over checks for it.

Merlin is actually pretty well documented, but as text files that you have to read the old-school way. If there's anything you find lacking in the HOWTO document or the README, please let me know and I'll amend it.

> Does anyone out there have a workable way to get an active/standby or active/active Nagios setup? Would be interested in hearing all ideas...

Well, we have about 800 of them.

--
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
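The standby/active switch described in the question (checks and notifications off in standby, on when active) could be driven through the Nagios external command file rather than a NEB module. The following is only a minimal sketch of that idea, not anything proposed in the thread: the command names are standard Nagios external commands, but the grouping into "active" and "standby" sets, the function names, and the command file path are assumptions to adjust for a given install.

```python
import time

# Standard Nagios external commands. Grouping them into two "modes"
# is an assumption made for this sketch.
ACTIVE_COMMANDS = (
    "ENABLE_NOTIFICATIONS",
    "START_EXECUTING_HOST_CHECKS",
    "START_EXECUTING_SVC_CHECKS",
)
STANDBY_COMMANDS = (
    "DISABLE_NOTIFICATIONS",
    "STOP_EXECUTING_HOST_CHECKS",
    "STOP_EXECUTING_SVC_CHECKS",
)


def build_commands(mode, now=None):
    """Return external command lines in the '[timestamp] COMMAND' format."""
    if mode == "active":
        names = ACTIVE_COMMANDS
    elif mode == "standby":
        names = STANDBY_COMMANDS
    else:
        raise ValueError("mode must be 'active' or 'standby'")
    timestamp = int(time.time() if now is None else now)
    return ["[%d] %s\n" % (timestamp, name) for name in names]


def set_mode(mode, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the commands to the command file (a named pipe).

    The default path is an assumption; use the command_file value from
    nagios.cfg. Opening the pipe blocks if Nagios is not reading it.
    """
    with open(command_file, "w") as pipe:
        pipe.writelines(build_commands(mode))
```

A failover script on the standby host would call `set_mode("active")`. This only toggles program-wide flags; it does not replicate downtime, comments or state between the two hosts.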
Re: [Nagios-users] High Availabilty with Nagios
On 2013-05-09 11:50, Supporto Tecnico - Crazy Network wrote:
> I would be interested too. I'm actually using merlind for this right now, but I would like to avoid, for example, double notifications if a server goes down. I do want both Nagios instances set to notify, since if one is down (for any reason) the other one should be able to check and notify, and vice versa.

Double notifications are a bug, unless you send passive check results to both masters, in which case it's by design. Usually people solve passive checks by arranging a single target IP or hostname to send to and then adding peered nodes at that tier as necessary, so that the monitored machines don't all have to send check results to multiple nodes.

--
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Re: [Nagios-users] High Availabilty with Nagios
On Thu, May 9, 2013 at 2:19 AM, Steve Shipway s.ship...@auckland.ac.nz wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios setups, and replicate over all status changes, config changes, downtime, comments, etc. and then set the 'standby' Nagios to have checks/notifications disabled when in standby mode, and enabled when in active mode. Then put the two behind a failover load balancer (F5, Foundry or Apache reverse proxy).

I've thought several times of doing it but never actually got started, although I have it all planned out, kind of like you.

In the meantime, my HA setup, which I've done for several customers, involves config synced using git or svn (a script run by cron checks whether there is something new and then restarts Nagios if the config passes tests). Both servers do checks, but the config is such that on one server all notifications are disabled except for cross-checking of the other Nagios. This is achieved by having a common template from which all services are derived; this template lives in a file specific to each server, so one has notifications disabled and the other enabled.

This is not full HA, in that if one server dies you have to execute a script that enables notifications on the other server (this can be done automatically too, but I prefer people to do it).

> However this would be too much work if someone else has already found an equivalent solution. I've looked at Merlin but it doesn't seem to do what I'm after (and the documentation is practically nonexistent - much the same as the NEB API documentation, in fact). Mod_gearman lets me have redundant checks and replicate *active* checks, but not commands, downtime or passive checks.
>
> Does anyone out there have a workable way to get an active/standby or active/active Nagios setup? Would be interested in hearing all ideas...
>
> Steve
>
> Steve Shipway
> University of Auckland ITS
> UNIX Systems Design Lead
> s.ship...@auckland.ac.nz
> Ph: +64 9 373 7599 ext 86487
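The cron-driven sync described above (pull, check whether anything changed, verify, restart) might look roughly like the sketch below. It is an illustration under assumptions, not the actual script from this message: it assumes a git checkout, and the repository path, main config path and restart command are placeholders supplied by the caller. `nagios -v <main config>` is the standard pre-flight config check.

```python
import subprocess


def run(cmd, cwd=None):
    """Run a command and return (exit_code, combined_output)."""
    proc = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def sync_config(repo_dir, main_cfg, restart_cmd, runner=run):
    """Pull the config repo and restart Nagios only if it changed and verifies.

    repo_dir, main_cfg and restart_cmd are site-specific (hypothetical
    here); runner is injectable so the logic can be tested without git.
    """
    _, before = runner(["git", "rev-parse", "HEAD"], repo_dir)
    code, _ = runner(["git", "pull", "--ff-only"], repo_dir)
    if code != 0:
        return "pull-failed"
    _, after = runner(["git", "rev-parse", "HEAD"], repo_dir)
    if before.strip() == after.strip():
        return "unchanged"  # nothing new, leave the running daemon alone
    # Verify the new config before touching the running Nagios.
    code, _ = runner(["nagios", "-v", main_cfg])
    if code != 0:
        return "invalid-config"
    code, _ = runner(restart_cmd)
    return "restarted" if code == 0 else "restart-failed"
```

A cron entry on each server would call `sync_config(...)` with that server's paths and log the returned status. If verification fails, the running daemon keeps its previously loaded config until someone fixes the repository.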
Re: [Nagios-users] High Availabilty with Nagios
Hi,

I have done this before using DRBD for block-based replication and clustering on Red Hat. This could also be done with Pacemaker/Corosync clusters.

Ed
Re: [Nagios-users] High Availabilty with Nagios
I did a talk at last year's conference that touches on an HA Nagios setup using DRBD and Pacemaker. There were also talks about mod_gearman and Merlin that might be helpful. The slides (and maybe video?) are available on nagios.org.

Here is a link to my slides:
http://www.slideshare.net/nagiosinc/andrew-widdersheim-nagiosisdownbosswantstosee-you
Re: [Nagios-users] High Availabilty with Nagios
While HA can be a great thing, I've always been of the opinion that a monitoring setup needs to have as few moving parts as possible. The more complexity in the monitor, the greater the chance you'll be chasing monitoring issues rather than site issues. And everything you add on top of the monitor also needs to be monitored. So somehow that F5 is going to need an out-of-band monitor, because if it dies your Nagios host may well not have a way to contact you about it unless you've dual-homed it, which brings up a whole other set of issues.

The closest I got to HA at my last gig was creating a CNAME for the active Nagios host, so in a failover you point the CNAME to the new box and at least passive checks can still roll in (after the DNS timeout, of course, which I say is better than reconfiguring every NSCA client).

-f
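To illustrate the CNAME approach from the client side: each monitored machine submits passive results to the alias, never to a specific Nagios box, so a failover only needs a DNS change. This is a rough sketch under assumptions. The alias `nagios-active.example.com`, the config path and the function names are hypothetical; the tab-separated input format and the `-H` and `-c` options are those of the standard `send_nsca` client.

```python
import subprocess


def format_passive_result(host, service, return_code, output):
    """Build one send_nsca input line: host, service, code, plugin output."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0 (OK) to 3 (UNKNOWN)")
    return "%s\t%s\t%d\t%s\n" % (host, service, return_code, output)


def send_passive_result(line,
                        nagios_alias="nagios-active.example.com",
                        send_nsca="send_nsca",
                        config="/etc/send_nsca.cfg"):
    """Pipe the line to send_nsca, addressed to the DNS alias.

    The alias and config path are placeholders for this sketch.
    Returns True if send_nsca exited successfully.
    """
    proc = subprocess.run(
        [send_nsca, "-H", nagios_alias, "-c", config],
        input=line,
        universal_newlines=True,
    )
    return proc.returncode == 0
```

Because the alias is resolved on every run, clients pick up the new host once the old DNS record expires, which is the delay mentioned above.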
Re: [Nagios-users] High Availabilty with Nagios
On 05/09/13, Steve Shipway wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios setups, and replicate over all status changes, config changes, downtime, comments, etc. and then set the 'standby' Nagios to have checks/notifications disabled when in standby mode, and enabled when in active mode. Then put the two behind a failover load balancer (F5, Foundry or Apache reverse proxy).

We use rsync (run out of cron every minute) and a floating VIP between two hosts. Nagios runs on only one host at a time. Switching between hosts is a trivial (manual) process.

Files which are synced: all Nagios files except logs and transient results. That includes Nagios configs, binaries and CGIs, helper apps, plugins, local plugins and NRPE configs, docs, HTML files, status files, all files in ~nagios, and the crontab for user nagios.

--
Jim
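The per-minute rsync job described above could be assembled along these lines. This is only a sketch of the idea, not Jim's actual job: the host name, directory and exclude patterns are hypothetical, chosen to mirror "everything except logs and transient results". `-a` and `--exclude=PATTERN` are standard rsync options.

```python
import subprocess


def build_rsync_command(source_dir, standby_host, dest_dir, excludes):
    """Build an rsync invocation that mirrors source_dir to the standby host."""
    cmd = ["rsync", "-a"]
    for pattern in excludes:
        cmd.append("--exclude=" + pattern)
    # Trailing slashes make rsync copy the directory contents, not the
    # directory itself, so both trees line up.
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append("%s:%s/" % (standby_host, dest_dir.rstrip("/")))
    return cmd


def sync_to_standby(source_dir, standby_host, dest_dir, excludes):
    """Run the sync; returns True if rsync exited with status 0."""
    return subprocess.call(
        build_rsync_command(source_dir, standby_host, dest_dir, excludes)
    ) == 0
```

A cron entry running every minute on the active host might call `sync_to_standby("/usr/local/nagios", "nagios2", "/usr/local/nagios", ["var/nagios.log", "var/archives/", "var/spool/checkresults/"])`, where every argument is an example value rather than a known path from this setup.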