[Nagios-users] Discussion: Nagios
Hi Nagios-Users,

I am currently working with Nagios, which I have integrated with a database platform (remote machines) so that it listens for their alerts and displays them in the Nagios web interface.

Nagios here runs on RHEL. The remote machine sends SNMP trap messages (it's an external device and not a box, so no NRPE/SSH). I've set up SNMPTRAPD on the machine; it captures the SNMP messages from the box and calls a Nagios command to route them to Nagios. For this I have also defined a trap service to manage the incoming traps from the remote machine.

The problem is that only the topmost alert is displayed in Nagios (in the log as well as in the Nagios web UI). Is it the case that until the first one is cleared, the other alerts for the same service don't show up? I need all the alerts sent from the remote machine to appear under one service/host in Nagios.

Any pointers regarding this will be much appreciated.

Thank you.

Regards,
Divya.

--
This SF.net email is sponsored by Windows: Build for Windows Store.
http://p.sf.net/sfu/windows-dev2dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Discussion: Nagios
Hi,

In the past I have configured SNMP traps from network devices to display in the Nagios UI. I defined only one trap service under a network device, and it captures all the traps sent for that device. The service therefore always shows the latest submitted trap/message and sends out an alert depending on whether it is a Warning or Critical trap, as defined in the snmptt config file or in the integration script you use. Since any critical/warning alert logs a ticket in a ticketing system integrated with Nagios, we are not so concerned about seeing all the alerts displayed in one service.

In your case, if you want all incoming traps to be displayed permanently, you may need to define multiple trap services under that host and, in the integration script, map the different traps to the different services you defined. Even then, suppose you have defined and mapped a CPU trap service and a fan problem trap service under a host: the CPU and fan traps will not display in the same service, but I guess a new fan trap will still override the old fan trap in the Nagios UI. With this approach there is also a chance of missing an unknown trap that you have not mapped.

The other way, which we are using: I defined a single trap service under a host and reset it to OK a few seconds or minutes after the trap submission. By default it is always OK; once a trap comes in, it displays the trap, fires an alert, and resets to OK again after a few seconds.

Hope it helps you somewhat :)

On Thu, Jun 13, 2013 at 6:31 AM, Divya Raj divisb...@gmail.com wrote:
> But, the problem is that only the topmost alert is displayed in the Nagios (in the log as well as in the Nagios Web UI). Is that like till the first one gets cleared the other alerts for the same service don't show up? The thing is that I need all the alerts sent from the remote machine to be sent under one service/host to Nagios.

--
Thanks
Manish Kumar
www.manishkr.com
http://in.linkedin.com/in/manishkumar85
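The "reset to OK after a while" approach described in this reply could be sketched with passive check results written to the Nagios external command file. This is only a sketch under assumptions: the command file path, the host name `db-appliance` and the service description `SNMP Traps` are hypothetical and do not come from the thread; only the `PROCESS_SERVICE_CHECK_RESULT` line format is standard Nagios.

```shell
#!/bin/sh
# Sketch only: the command-file path below is an assumption (a common
# source-install default); adjust it to the command_file set in nagios.cfg.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Build one external-command line in the form
# [epoch] PROCESS_SERVICE_CHECK_RESULT;<host>;<service>;<return_code>;<output>
# $1=epoch  $2=host  $3=service  $4=return code (0=OK 1=WARNING 2=CRITICAL)  $5=output
format_passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Append a passive result, stamped with the current time, to the command file.
# $1=host  $2=service  $3=return code  $4=output
submit_passive_result() {
    format_passive_result "$(date +%s)" "$@" >> "$CMD_FILE"
}

# Example with hypothetical names: report a trap, then reset the service later.
# submit_passive_result db-appliance "SNMP Traps" 2 "fan failure trap received"
# sleep 120
# submit_passive_result db-appliance "SNMP Traps" 0 "No recent traps"
```

A trap handler would call `submit_passive_result` once per trap, and a delayed second call with return code 0 would bring the service back to OK, so the next trap is again seen as a fresh problem.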
Re: [Nagios-users] Nagios init script not working on Ubuntu 12.04
Alternatively, do this to install Nagios on Ubuntu 12.04:

    # apt-get install nagios3 nagios-plugins

The packages are quite good, and imho there is nothing quick about that quickstart guide.

Kind regards,
Pall Sigurdsson

- Original Message -
From: Gavin Grieve [DATACOM] gavin.gri...@datacom.co.nz
To: Nagios Users List nagios-users@lists.sourceforge.net
Sent: Wednesday, June 12, 2013 8:58:06 PM
Subject: Re: [Nagios-users] Nagios init script not working on Ubuntu 12.04

You could try replacing the line that mentions functions with:

    . /lib/lsb/init-functions

Note that the full stop at the start is required. I believe this is the Ubuntu equivalent.

--
Gavin Grieve
Systems Management Specialist | Datacom | Datacom House, 68 Jervois Quay, Wellington, 6011, New Zealand
www.datacom.co.nz | PO Box 6376, Marion Square, Wellington, New Zealand 6141

From: Daniel Wittenberg [mailto:dwittenberg2...@gmail.com]
Sent: Thursday, 13 June 2013 7:30 a.m.
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios init script not working on Ubuntu 12.04

Functions is a file on RHEL-based systems that provides common start/stop routines. I don't use Ubuntu myself, but if you figure out how to make it work, let me know. If I get some time I'll see if I still have an Ubuntu test VM to look at.

Dan

On Jun 12, 2013 2:09 PM, Abhinav Upadhyay er.abhinav.upadh...@gmail.com wrote:

Hi,

I just followed the instructions on http://nagios.sourceforge.net/docs/3_0/quickstart-ubuntu.html to install the latest stable release of Nagios (3.5) on a fresh Ubuntu 12.04 machine. Everything went fine, but when I try to start Nagios using /etc/init.d/nagios start, I get the following error:

    /etc/init.d/nagios: 20: .: Can't open /etc/rc.d/init.d/functions

There is no file at /etc/rc.d/init.d/functions. It seems the Makefile could not put the functions file in /etc/rc.d/init.d. Is it a bug, or did I miss something?

Regards
Abhinav
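The suggestion in this thread, sourcing /lib/lsb/init-functions on Ubuntu in place of /etc/rc.d/init.d/functions, can be written so the same script works on both families. A minimal sketch follows; `pick_init_functions` is a hypothetical helper, not part of the Nagios init script, and whether the rest of the script then runs depends on which helper functions it calls, which the thread does not establish.

```shell
#!/bin/sh
# Sketch: print the first candidate path that exists as a regular file.
# pick_init_functions is a hypothetical helper name used for illustration.
pick_init_functions() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1   # none of the candidates exists
}

# Intended use near the top of an init script (left commented out here):
# if functions_file=$(pick_init_functions /etc/rc.d/init.d/functions /lib/lsb/init-functions); then
#     . "$functions_file"
# fi
```

Testing for the file before sourcing it avoids the "Can't open" failure from the original report when neither path is present.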
[Nagios-users] How do I check ALL mount points using check_disk?
With check_disk normally you can specify something like -p /tmp -p /var which will check /tmp and /var. But if I want to check all of the mount points, should I not specify a partition (-p) variable at all?
Re: [Nagios-users] Nagios 3.5.0 segfaulting at midnight
Sven Nierlein wrote:
> On 27.05.2013 09:50, Fournier, Wim wrote:
>> Hi List,
>>
>> I've got 5 nagios installs, all on 3.5.0, and 3 of them seem to segfault exactly at midnight. It's not all of them, but the busiest ones, and not always. Has anyone else seen this?
>>
>> @ DEV what info would you like if I file this as a bug?
>
> Hi Wim,
>
> Afaik there is a bug already. This is a known issue in combination with the livestatus neb module. You could wait for the next release or use the attached patch.
>
> Sven

Does anyone know if/when there will be a 3.5.1 release? It seems that several good fixes have been made since 3.5.0 (including this one).
Re: [Nagios-users] How do I check ALL mount points using check_disk?
You can configure it in nrpe.cfg:

command[check_disk_all]=/usr/libexec/nagios/plugins/check_disk -X nfs -X nfs3 -X nfs4 -X cifs -X none -X tmpfs -w $ARG1$ -c $ARG2$

Here we are not monitoring the nfs, cifs or tmpfs file systems; other than those, we get results for everything.

On Thu, Jun 13, 2013 at 6:38 PM, Bonnie Rush bonnieru...@gmail.com wrote:
> With check_disk normally you can specify something like -p /tmp -p /var which will check /tmp and /var. But if I want to check all of the mount points, should I not specify a partition (-p) variable at all?

--
Regards
Sunil Sankar
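For the server side of the check_disk_all command in this reply, a configuration sketch might look as follows. The command name `check_nrpe_disk_all`, the host name `remote-host`, the `generic-service` template and the 20%/10% thresholds are assumptions for illustration. Passing `$ARG1$`/`$ARG2$` through NRPE also assumes that NRPE was built with command-argument support and that `dont_blame_nrpe=1` is set, neither of which the thread states.

```
# nrpe.cfg on the monitored host (assumption: argument passing is enabled)
dont_blame_nrpe=1
command[check_disk_all]=/usr/libexec/nagios/plugins/check_disk -X nfs -X nfs3 -X nfs4 -X cifs -X none -X tmpfs -w $ARG1$ -c $ARG2$

# Nagios server object configuration (all names below are hypothetical)
define command {
    command_name    check_nrpe_disk_all
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk_all -a $ARG1$ $ARG2$
}

define service {
    use                     generic-service          ; assumed template
    host_name               remote-host              ; hypothetical host
    service_description     Disk usage (all mounts)
    check_command           check_nrpe_disk_all!20%!10%
}
```

Because the check_disk command line carries no -p option, every mounted file system that is not excluded by an -X type is checked in a single service.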
[Nagios-users] Issues with NEB modules breaking after restart
I recently upgraded to the latest 3.5.0 release of nagios-core and added livestatus to my environment. We are trying to replace NDO but currently have the two running at the same time, along with NPCD for perfdata, which as far as I know shouldn't cause any issues.

The first issue I had was that Nagios would segfault every night during its routine log rotation, so I applied the 0007-fix_downtime_struct.dif patch, which seems to have fixed that issue.

I experienced a new issue this morning: when restarting Nagios, none of the NEB modules uninitialized properly. Nagios was able to start and initialized all of the NEB modules, but a few seconds later Nagios uninitialized them again. This isn't like anything I've seen before, and none of the NEB modules worked after this occurred. Here is what the logs looked like:

[Thu Jun 13 09:30:29 2013] Caught SIGTERM, shutting down...
[Thu Jun 13 09:30:30 2013] Successfully shutdown... (PID=14098)
[Thu Jun 13 09:30:31 2013] livestatus: Socket thread has terminated
[Thu Jun 13 09:30:41 2013] Nagios 3.5.0 starting... (PID=481)
[Thu Jun 13 09:30:41 2013] Local time is Thu Jun 13 09:30:41 EDT 2013
[Thu Jun 13 09:30:41 2013] LOG VERSION: 2.0
[Thu Jun 13 09:30:41 2013] livestatus: Livestatus 1.2.2p2 by Mathias Kettner. Socket: '/usr/local/nagios/var/rw/livestatus.sock'
[Thu Jun 13 09:30:41 2013] livestatus: Please visit us at http://mathias-kettner.de/
[Thu Jun 13 09:30:41 2013] livestatus: Hint: please try out OMD - the Open Monitoring Distribution
[Thu Jun 13 09:30:41 2013] livestatus: Please visit OMD at http://omdistro.org
[Thu Jun 13 09:30:41 2013] livestatus: Removed old left over socket file /usr/local/nagios/var/rw/livestatus.sock
[Thu Jun 13 09:30:41 2013] livestatus: archive path /drbd/r1/nagios/archives
[Thu Jun 13 09:30:41 2013] livestatus: Finished initialization. Further log messages go to /drbd/r1/nagios/livestatus.log
[Thu Jun 13 09:30:41 2013] Event broker module '/usr/local/mk-livestatus/livestatus.o' initialized successfully.
[Thu Jun 13 09:30:41 2013] npcdmod: Copyright (c) 2008-2009 Hendrik Baecker (andu...@process-zero.de) - http://www.pnp4nagios.org
[Thu Jun 13 09:30:41 2013] npcdmod: /usr/local/pnp4nagios/etc/npcd.cfg initialized
[Thu Jun 13 09:30:41 2013] npcdmod: spool_dir = '/dev/shm/pnp4nagios/var/spool/'.
[Thu Jun 13 09:30:41 2013] npcdmod: perfdata file '/dev/shm/pnp4nagios/var/perfdata.dump'.
[Thu Jun 13 09:30:41 2013] npcdmod: Ready to run to have some fun!
[Thu Jun 13 09:30:41 2013] livestatus: Timeperiod cache not updated, there are no timeperiods (yet)
[Thu Jun 13 09:30:41 2013] Event broker module '/usr/local/pnp4nagios/lib64/npcdmod.o' initialized successfully.
[Thu Jun 13 09:30:41 2013] ndomod: NDOMOD 1.5.2 (06-08-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[Thu Jun 13 09:30:41 2013] ndomod: Successfully connected to data sink. 0 queued items to flush.
[Thu Jun 13 09:30:41 2013] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[Thu Jun 13 09:30:43 2013] Finished daemonizing... (New PID=482)
[Thu Jun 13 09:30:44 2013] TIMEPERIOD TRANSITION: 24x7;-1;1
[Thu Jun 13 09:30:47 2013] Event broker module '/usr/local/mk-livestatus/livestatus.o' deinitialized successfully.
[Thu Jun 13 09:30:47 2013] npcdmod: If you don't like me, I will go out! Bye.
[Thu Jun 13 09:30:47 2013] Event broker module '/usr/local/pnp4nagios/lib64/npcdmod.o' deinitialized successfully.
[Thu Jun 13 09:30:47 2013] ndomod: Shutdown complete.
[Thu Jun 13 09:30:47 2013] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.

Here is the next restart after this, where things happened as I would expect:

[Thu Jun 13 09:52:25 2013] Successfully shutdown... (PID=482)
[Thu Jun 13 09:52:26 2013] livestatus: Socket thread has terminated
[Thu Jun 13 09:52:26 2013] Event broker module '/usr/local/mk-livestatus/livestatus.o' deinitialized successfully.
[Thu Jun 13 09:52:26 2013] npcdmod: If you don't like me, I will go out! Bye.
[Thu Jun 13 09:52:26 2013] Event broker module '/usr/local/pnp4nagios/lib64/npcdmod.o' deinitialized successfully.
[Thu Jun 13 09:52:26 2013] ndomod: Shutdown complete.
[Thu Jun 13 09:52:26 2013] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[Thu Jun 13 09:52:29 2013] Nagios 3.5.0 starting... (PID=20081)
[Thu Jun 13 09:52:29 2013] Local time is Thu Jun 13 09:52:29 EDT 2013
[Thu Jun 13 09:52:29 2013] LOG VERSION: 2.0
[Thu Jun 13 09:52:29 2013] livestatus: Livestatus 1.2.2p2 by Mathias Kettner. Socket: '/usr/local/nagios/var/rw/livestatus.sock'
[Thu Jun 13 09:52:29 2013] livestatus: Please visit us at http://mathias-kettner.de/
[Thu Jun 13 09:52:29 2013] livestatus: Hint: please try out OMD - the Open Monitoring Distribution
[Thu Jun 13 09:52:29 2013] livestatus: Please visit OMD at http://omdistro.org
[Thu Jun 13 09:52:29 2013] livestatus: archive path /drbd/r1/nagios/archives
[Thu Jun
Re: [Nagios-users] Discussion: Nagios
For SNMP trap services, we have found it useful to set the following:

is_volatile 1            # notify on every failure message, not just when going from OK to failure state
max_check_attempts 1     # notify on the first failure every time
stalking_options o,w,c   # log *all* OK, Warning and Critical messages to the Nagios log even if the state hasn't changed

None of these will show all the traps in the host/service display; however, they will log all the states you have defined to the log file every time the host/service check output is received.

Just a warning though: too many hosts/services being stalked can make the log file very big.

--
Gavin Grieve
Systems Management Specialist | Datacom | Datacom House, 68 Jervois Quay, Wellington, 6011, New Zealand
www.datacom.co.nz | PO Box 6376, Marion Square, Wellington, New Zealand 6141

From: Manish Kumar [mailto:manikuma...@gmail.com]
Sent: Thursday, 13 June 2013 9:09 p.m.
To: Nagios Users List
Subject: Re: [Nagios-users] Discussion: Nagios

> Hi, In past I have configured snmp traps from network devices to display in the nagios UI. I have defined only one trap service under a network device which captures all the traps sent for this device in the service, so it will always show you the latest submitted trap/message and send out an alert based on if it's a Warning/Critical trap as may be defined by you in the snmptt config file or the integration script you used.
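Put together, the three settings Gavin lists would sit in a passive trap service definition roughly like the sketch below. The host name, service description, the `generic-service` template and the `check_dummy` placeholder command are assumptions for illustration; only `is_volatile`, `max_check_attempts` and `stalking_options` come from the message.

```
define service {
    use                     generic-service   ; assumed template name
    host_name               db-appliance      ; hypothetical host
    service_description     SNMP Traps        ; hypothetical service name
    is_volatile             1                 ; notify on every non-OK result
    max_check_attempts      1                 ; go HARD on the first result
    stalking_options        o,w,c             ; log OK/WARNING/CRITICAL output
    active_checks_enabled   0                 ; results arrive passively from the trap handler
    passive_checks_enabled  1
    check_command           check_dummy       ; placeholder; assumed to be defined elsewhere
}
```

With this in place the web UI still shows only the latest trap, but each submitted result should leave its own entry in the Nagios log, which is the closest the thread gets to "seeing all alerts" under one service.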
[Nagios-users] reload appears to cause force of DOWN; SOFT; x to DOWN; HARD; 1
Running 3.4.1: I see this strange anomaly, where a host check is in the middle of doing retries before hitting max_attempts, but after a server reload occurs, the next check is automatically forced to DOWN;HARD;1, as seen here:

[2013-06-04 08:40:21] HOST ALERT: 5gt4;DOWN;SOFT;1;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.
[2013-06-04 08:47:18] HOST ALERT: 5gt4;DOWN;SOFT;2;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.
[2013-06-04 08:54:03] HOST ALERT: 5gt4;DOWN;SOFT;3;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.
(reload happens here)
[2013-06-04 09:00:52] HOST ALERT: 5gt4;DOWN;HARD;1;CRITICAL: Connection timed out to '' after 160 seconds (user 'chk'). Expected prompt not found. Last output was ''.

Why is it skipping the rest of the attempts and going straight to DOWN;HARD after the reload? Seems like a bug to me.
Re: [Nagios-users] reload appears to cause force of DOWN; SOFT; x to DOWN; HARD; 1
Do you have this in nagios.cfg?

retain_state_information=1

On Thu, Jun 13, 2013 at 4:31 PM, Sean McKell mck...@us.ibm.com wrote:
> Running 3.4.1: I see this strange anomaly, where a host check is in the middle of doing retries before hitting max_attempts, but after a server reload occurs, the next check is automatically forced to DOWN;HARD;1
>
> Why is it skipping the rest of the attempts and going straight to DOWN;HARD after the reload ? Seems like a bug to me.