[Nagios-users] nagios future?
Hi,

Has anybody already heard about Icinga (http://www.icinga.org)?

We have planned to install Nagios on a production server. Nagios works very well and we're happy with it. I would just like to have your point of view about this new product and about the future of Nagios.

Best regards,
Jerome

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited royalty-free distribution of the report engine for externally facing server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] nagios future?
Meyer Jerome wrote:
> Hi

Hi there!

> Has anybody already heard about Icinga (http://www.icinga.org)?

Yes. It was discussed quite a lot a few weeks back on the nagios-devel mailing list. Browse the archives for the full discussion.

> We have planned to install Nagios on a production server. Nagios works
> very well and we're happy with it. I would just like to have your point
> of view about this new product and about the future of Nagios.

The future of Nagios is looking quite bright. In all honesty, that is in part thanks to the Icinga fork, which has sparked a flurry of activity within the Nagios developer community.

First of all, we'll be releasing 3.1.1 soon, containing a plethora of bug and performance fixes. Ethan's working on automating the release process so that Ton and I can cut releases without having to update a bunch of webpages, the sourceforge downloads area, documentation, etc. 3.1.1 will be the first live test of that automated process. If it drags out another week or so though, we'll probably just go ahead and do it manually anyway, as 3.1.1 has a lot of important fixes that Nagios users really should get their hands on.

Nagios will get a new GUI, dubbed Ninja, sometime during or after the summer. Ninja is available for download already and is usable, but has some warts and is still incomplete according to Ninja maintainer Per Åsberg. You can find out more about it at http://www.op5.org/community/projects/ninja. This was announced at the Nordic Meet on Nagios, which was held in Stockholm just last week.

Note that it's not necessarily easy to install yet, as it's still a work in progress. Bug reports or enhancement requests are of course very welcome, and documentation patches for the installation procedures even more so.

Hope that answers your questions :-)

--
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
Re: [Nagios-users] Standard Nagios CGI Usage Documentation
So I take it documentation is not a big thing for people on this list ...

I have started some documentation, available at http://www.smartmon.com.au/docs/

The starting page for this specific part of the documentation is:
http://www.smartmon.com.au/docs/tiki-index.php?page=Monitoring%20Operations%20%E2%80%93%20Using%20The%20Nagios%20Web%20Interfacestructure=User%20Guide

I will be continuing to add to it and hope it helps someone. If you have any suggestions or submissions, let me know.

Matthew Jurgens wrote:
> Has anyone ever come across some documentation that is aimed at new
> Nagios users, describes how to use the standard CGI interface, and
> explains concepts such as acknowledgements, downtime, etc.?

--
Smartmon System Monitoring
http://www.smartmon.com.au
[Nagios-users] Recovery notifications after escalations
Hi,

We have a situation here where we would like to notify the on-call group after 60 minutes and the support group after 240 minutes. If services go down and then recover, everyone who has received a notification of a host problem should also receive the recovery notification. See the configuration below.

Now, our problem is that when the second escalation has been activated and the support group has received the notification, only the support group will receive the recovery notification. The on-call group will never see the recovery notification. We do not want to send out multiple notifications to the on-call group for the same issue, since they would then be spammed by Nagios unnecessarily.

define host{
    name                    generic-host
    ...
    contacts                root    ; This will be stored in a local mailbox that no one sees
    notification_interval   60
    notification_options    d,u,r
    register                0
}

# First escalation for on call group (notification after 60 minutes)
define hostescalation{
    host_name               *
    first_notification      2
    last_notification       2
    notification_interval   180
    contact_groups          on-call
}

# Second escalation for support group (notification after 240 minutes)
# Problem: when this escalation has been activated, on-call does not receive recovery notifications anymore
# (we do not want to send multiple notifications about the same problem to on-call)
define hostescalation{
    host_name               *
    first_notification      3
    last_notification       3
    notification_interval   0
    contact_groups          support
}

Is it possible to achieve what we want using escalations?

Best regards,
Ulf Karlsson
Re: [Nagios-users] Check_NRPE
Hi Eduardo,

Is the NRPE daemon installed correctly and running on the host? Have you set the IP address of your Nagios server in nrpe.cfg (allowed_hosts)?

With kind regards,
Sebastian Gosenheimer

Eduardo Barreto wrote:
> Hi,
>
> When I try to check a service on a remote host, this message appears:
>
> CHECK_NRPE: Error receiving data from daemon.
>
> What might it be? Does anybody know what I should do?
>
> Thanks in advance,
> Eduardo
Re: [Nagios-users] nagios future?
Hi,

Andreas Ericsson wrote the following on 09.06.2009 11:10:
> Yes. It was discussed quite a lot a few weeks back on the nagios-devel
> mailing list. Browse the archives for the full discussion.

Try
http://sourceforge.net/mailarchive/message.php?msg_id=E03A84B43BAE443888372020DDBFE0E7%40int.consol.de

If you're interested in reading a bit more, try
http://sourceforge.net/mailarchive/forum.php?forum_name=icinga-users
http://sourceforge.net/mailarchive/forum.php?forum_name=icinga-devel

> The future of Nagios is looking quite bright. In all honesty, that is
> in part thanks to the Icinga fork, which has sparked a flurry of
> activity within the Nagios developer community.

It has also made many community-based sites pop up beside the existing ones. Not that bad, but a bit misleading for new users imho. But let's see how it resolves in a bit.

Hopefully Nagios will be on git soon, so that knowledge from both projects can be merged. I don't know what the plans are concerning NDO and other similar core parts, but I think there's much potential to share ideas and knowledge between Nagios and Icinga.

> First of all, we'll be releasing 3.1.1 soon, containing a plethora of
> bug and performance fixes. Ethan's working on automating the release
> process so that Ton and I can cut releases without having to update a
> bunch of webpages, the sourceforge downloads area, documentation, etc.
> 3.1.1 will be the first live test of that automated process. If it
> drags out another week or so though, we'll probably just go ahead and
> do it manually anyway, as 3.1.1 has a lot of important fixes that
> Nagios users really should get their hands on.

It would be great to mention that all even releases are stable while odd releases remain testing. On nagios.org, 3.1.0 is mentioned only as the latest version, and after clicking the download link it is marked as testing. A bit confusing, but not really a problem for experienced users.

> Nagios will get a new GUI, dubbed Ninja, sometime during or after the
> summer. Ninja is available for download already and is usable, but has
> some warts and is still incomplete according to Ninja maintainer Per
> Åsberg. You can find out more about it at
> http://www.op5.org/community/projects/ninja. This was announced at the
> Nordic Meet on Nagios, which was held in Stockholm just last week.
>
> Note that it's not necessarily easy to install yet, as it's still a
> work in progress. Bug reports or enhancement requests are of course
> very welcome, and documentation patches for the installation
> procedures even more so.

With Ninja announced as the new GUI, rumors have come up about Merlin being used for the database. I've read several posts about that, but my question is: how realistic is that?

As far as I know, Merlin uses libdbi (just like the modified IDOUtils for Icinga), so it would be possible to use different database types. Are there any plans to realize that? :-)

Kind regards,
Michael

--
DI (FH) Michael Friedrich michael.friedr...@univie.ac.at
Tel: +43 1 4277 14359
Vienna University Computer Center
Universitaetsstrasse 7
A-1010 Vienna, Austria
Re: [Nagios-users] nagios -- ndo2db -- centreon
> Thanks for all the help!!! Was able to get everything working.
>
> James

I spoke a little bit too soon. Although a couple of the hosts showed up, that's all I can get to work. I've added more, and the nagios configs get updated, yet nagios doesn't show any changes. So I'm missing something on the nagios/configuration side. If I take a bracket out of one of the configs, then nagios won't restart, so I know it's reading these cfg files.

In the files below, srv-xen02.mydomain.com and srv-xen03.mydomain.com are working, but none of the others are. If I remove one of these hosts (using the centreon frontend), it removes it from the configs and nagios is restarted, but nagios is not updated. It still shows the same two hosts.

Anyone know what I might be missing? Here are some of my configs. The contacts listed in these configs all exist.

Thanks,
James

For example, my config files are at /etc/nagios

# cat hostgroups.cfg
define hostgroup{
    hostgroup_name          Linux_Servers
    alias                   All linux servers
    members                 srv-xen02.mydomain.com, srv-xen03.mydomain.com, srv-xen04.mydomain.com, srv-xen05.mydomain.com
}

define hostgroup{
    hostgroup_name          MY_routers
    alias                   MY routers
    members                 SLW-E11.mydomain.com
}

# cat hosts.cfg
define host{
    name                            generic-host
    alias                           generic-host
    check_command                   check_host_alive
    max_check_attempts              5
    active_checks_enabled           1
    passive_checks_enabled          0
    check_period                    24x7
    contact_groups                  netcool, Supervisors
    notification_interval           0
    notification_period             24x7
    notification_options            d,r
    notifications_enabled           0
    register                        0
}

define host{
    name                            Servers-Linux
    use                             generic-host
    alias                           Linux Servers
    register                        0
}

define host{
    host_name                       srv-xen02.mydomain.com
    use                             Servers-Linux
    alias                           srv-xen02
    address                         192.168.4.152
    hostgroups                      Linux_Servers
    check_command                   check_host_alive
    max_check_attempts              10
    check_interval                  1
    active_checks_enabled           1
    passive_checks_enabled          1
    check_period                    24x7
    obsess_over_host                0
    check_freshness                 0
    flap_detection_enabled          0
    process_perf_data               0
    retain_status_information       0
    retain_nonstatus_information    0
    contact_groups                  netcool
    notification_interval           1
    notification_period             24x7
    notification_options            d,u
    notifications_enabled           1
}

define host{
    host_name                       srv-xen03.mydomain.com
    use                             Servers-Linux
    alias                           srv-xen03
    address                         192.168.4.153
    hostgroups                      Linux_Servers
    check_command                   check_host_alive
    max_check_attempts              10
    check_interval                  1
    active_checks_enabled           0
    passive_checks_enabled          0
    check_period                    24x7
    obsess_over_host                0
    check_freshness                 0
    flap_detection_enabled          0
    process_perf_data               0
    retain_status_information       0
    retain_nonstatus_information    0
    contact_groups                  netcool
    notification_interval           1
    notification_period             24x7
    notification_options            d,u
    notifications_enabled           1
}

define host{
    host_name                       srv-xen04.mydomain.com
    use                             Servers-Linux
    alias                           srv-xen04
    address                         192.168.4.154
    hostgroups                      Linux_Servers
    check_command                   check_host_alive
    max_check_attempts              10
    check_interval
[Nagios-users] disk IO for windows?
Does anyone know of a plug-in or mechanism to log local disk I/O on Windows? My Nagios server currently uses check_nt to connect to Windows hosts via NSClient++. I was hoping that COUNTER perhaps has something buried within it to pull down this info.

TIA
Re: [Nagios-users] nagios future?
Michael Friedrich wrote:
> Hi,
>
> Andreas Ericsson wrote the following on 09.06.2009 11:10:
>> Yes. It was discussed quite a lot a few weeks back on the nagios-devel
>> mailing list. Browse the archives for the full discussion.
>
> Try
> http://sourceforge.net/mailarchive/message.php?msg_id=E03A84B43BAE443888372020DDBFE0E7%40int.consol.de
>
> If you're interested in reading a bit more, try
> http://sourceforge.net/mailarchive/forum.php?forum_name=icinga-users
> http://sourceforge.net/mailarchive/forum.php?forum_name=icinga-devel

Thanks for those links. I'm far too lazy to look them up myself ;-)

>> The future of Nagios is looking quite bright. In all honesty, that is
>> in part thanks to the Icinga fork, which has sparked a flurry of
>> activity within the Nagios developer community.
>
> It has also made many community-based sites pop up beside the existing
> ones. Not that bad, but a bit misleading for new users imho. But let's
> see how it resolves in a bit.
>
> Hopefully Nagios will be on git soon, so that knowledge from both
> projects can be merged. I don't know what the plans are concerning NDO
> and other similar core parts, but I think there's much potential to
> share ideas and knowledge between Nagios and Icinga.

Nagios will move to git when 3.2.0 is out the door. Ethan wants some time to manage patches and stuff like he's used to without having to learn another tool. I'm sure he'll curse himself for not switching sooner when he learns the benefits of git, but at least we're getting there.

One of the annoying things about the Icinga fork though is that they've mainly done a lot of renaming and not so much actual patching. This will of course merge cleanly, but in an unsatisfactory way for Nagios. Messy, but certainly possible to work around.

>> First of all, we'll be releasing 3.1.1 soon, containing a plethora of
>> bug and performance fixes. Ethan's working on automating the release
>> process so that Ton and I can cut releases without having to update a
>> bunch of webpages, the sourceforge downloads area, documentation, etc.
>> 3.1.1 will be the first live test of that automated process. If it
>> drags out another week or so though, we'll probably just go ahead and
>> do it manually anyway, as 3.1.1 has a lot of important fixes that
>> Nagios users really should get their hands on.
>
> It would be great to mention that all even releases are stable while
> odd releases remain testing. On nagios.org, 3.1.0 is mentioned only as
> the latest version, and after clicking the download link it is marked
> as testing. A bit confusing, but not really a problem for experienced
> users.

Oh, right. I'd actually forgotten that.

>> Nagios will get a new GUI, dubbed Ninja, sometime during or after the
>> summer. Ninja is available for download already and is usable, but has
>> some warts and is still incomplete according to Ninja maintainer Per
>> Åsberg. You can find out more about it at
>> http://www.op5.org/community/projects/ninja. This was announced at the
>> Nordic Meet on Nagios, which was held in Stockholm just last week.
>>
>> Note that it's not necessarily easy to install yet, as it's still a
>> work in progress. Bug reports or enhancement requests are of course
>> very welcome, and documentation patches for the installation
>> procedures even more so.
>
> With Ninja announced as the new GUI, rumors have come up about Merlin
> being used for the database. I've read several posts about that, but
> my question is: how realistic is that?

Very realistic. We're already using it for development to that purpose, and it's working just fine. One problem with NDOUtils is that the database schema makes it impossible to write stuff for it that scales linearly. That's totally unacceptable for us, so we had to come up with something new.

Fortunately, Lars Hjemli of the NagVis project has been very friendly and cooperative in helping us add support for the Merlin database schema in NagVis. Given how simple the Merlin schema is, I have no doubt that we'll provide patches to other projects to achieve the same thing.

> As far as I know, Merlin uses libdbi (just like the modified IDOUtils
> for Icinga), so it would be possible to use different database types.
> Are there any plans to realize that? :-)

It's been planned, implemented, tested and available since 2009-03-17. Additional bugfixes happened later, but libdbi has been in use in Merlin for almost three months now.

I'm working (but very slowly) on some patches to address the multiple memory allocations required to use libdbi for quoting strings etc., since it prevents us from using a static arena to do the quoting in, but that will take a while to complete, so we're living with that microscopic deficiency for now.

$ git show 084cdc85
commit 084cdc85d7b0c8a4f721804476979e904e4afe7a
Author: Andreas Ericsson a...@op5.se
Date: Tue Mar 17 10:44:47 2009 +0100

    Use libdbi for database abstraction

    In some ways it's worse, since we're now forced to allocate and deallocate a lot of memory for each request, but in other
[Nagios-users] Problems with a parameter when executing check_procs via check_by_ssh
Hi,

I want to execute check_procs via check_by_ssh with the following command:

./check_by_ssh -H 172.24.1.70 -t 120 -C /usr/local/bin/check_procs -C zeiterf -c 1:1 -a './zeiterf -z'

The result is the following error:

Remote command execution failed: /usr/local/bin/check_procs: option requires an argument -- z

The problem is the parameter -z, because after removing it I get the expected result:

./check_by_ssh -H 172.24.1.70 -t 120 -C /usr/local/bin/check_procs -C zeiterf -c 1:1 -a './zeiterf'
PROCS CRITICAL: 2 processes with command name 'zeiterf', args './zeiterf'

Does anyone know how to include the parameter correctly?

Thanks for your help,
Stefan
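One plausible reading of the error above is that the quoting around './zeiterf -z' is consumed by the local shell, so the remote shell sees -z as a separate word. The sketch below is only a model of that double word-splitting, using Python's shlex module; it does not call check_by_ssh, and the helper name split_twice is made up for illustration.

```python
import shlex


def split_twice(words_as_typed):
    """Split like the local shell, rejoin with spaces, then split like the remote shell."""
    local_words = shlex.split(words_as_typed)   # local shell removes one layer of quotes
    sent_over_ssh = " ".join(local_words)       # the remote side receives one flat string
    return shlex.split(sent_over_ssh)           # remote shell splits that string again


# Quoted only once: the single quotes are gone before the remote shell sees them.
unprotected = "-a './zeiterf -z'"

# Whole remote command as one argument: outer double quotes for the local shell,
# inner single quotes left over for the remote shell.
protected = "\"/usr/local/bin/check_procs -C zeiterf -c 1:1 -a './zeiterf -z'\""

print(split_twice(unprotected))
print(split_twice(protected))
```

If that model matches what happens here, passing the whole remote command as a single -C argument may help, for example (untested): ./check_by_ssh -H 172.24.1.70 -t 120 -C "/usr/local/bin/check_procs -C zeiterf -c 1:1 -a './zeiterf -z'"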
Re: [Nagios-users] disk IO for windows?
dave stern - e-mail.pluribus.unum wrote:
> Does anyone know of a plug-in or mechanism to log local disk I/O on
> Windows? My Nagios server currently uses check_nt to connect to Windows
> hosts via NSClient++. I was hoping that COUNTER perhaps has something
> buried within it to pull down this info.

There are indeed counters for that, but due to Microsoft's stupidity the counter names differ depending on which base language you've used for your Windows servers. I don't know what they're named on English platforms (or any other, for that matter), but you should be able to view them with that thing you can pop up when pressing ctrl-alt-del (task manager or whatever it's called).

--
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
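As a rough illustration of the COUNTER approach discussed above, the sketch below builds a check_nt command line for a disk I/O counter. The counter name is the one used on English-language Windows installs and, as noted, will differ on localized systems; the host name and thresholds are placeholders, and port 12489 is assumed to be the NSClient++ port in use.

```python
import shlex


def check_nt_counter_argv(host, counter, warn, crit, port=12489):
    """Build an argument list for check_nt's COUNTER variable."""
    return [
        "check_nt",
        "-H", host,
        "-p", str(port),
        "-v", "COUNTER",
        "-l", counter,
        "-w", str(warn),
        "-c", str(crit),
    ]


# English counter name (assumption: localized Windows installs use other names).
DISK_BYTES = r"\PhysicalDisk(_Total)\Disk Bytes/sec"

# Placeholder host and thresholds (bytes per second), purely for illustration.
argv = check_nt_counter_argv("winhost.example.com", DISK_BYTES, 50000000, 100000000)
print(shlex.join(argv))  # shell-quoted form, for pasting into a terminal
```

Building the arguments as a list and quoting them in one place keeps the backslashes and the space in the counter name from being mangled by the shell.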
Re: [Nagios-users] Recovery notifications after escalations
On 06/09 12:06, Ulf Karlsson wrote:
> Hi,
>
> We have a situation here where we would like to notify the on-call
> group after 60 minutes and the support group after 240 minutes. If
> services go down and then recover, everyone who has received a
> notification of a host problem should also receive the recovery
> notification. See the configuration below.
>
> Now, our problem is that when the second escalation has been activated
> and the support group has received the notification, only the support
> group will receive the recovery notification. The on-call group will
> never see the recovery notification. We do not want to send out
> multiple notifications to the on-call group for the same issue, since
> they would then be spammed by Nagios unnecessarily.

I don't (at least not yet) have a good answer. But maybe I can put some ideas in your head.

My first thought is that if they want the recovery notification, maybe they would not mind the extra one either. The extra one actually tells them that the issue was escalated, which might be useful information. If they don't want the issue to escalate, they should acknowledge it (sticky).

To make it work the way you ask, I have two suggestions. Neither of them is good.

If you do not have that many contacts, create an additional one for each member of on-call with only recovery alerts and put them in a group, e.g. on-call-recovery, and escalate to that one. They will now get the recovery notification.

Another alternative is to modify your notification command to take notice of the macros $SERVICENOTIFICATIONNUMBER$ and maybe $HOSTNOTIFICATIONNUMBER$ and build the logic you wish. Make sure to do it right so you don't miss important notifications.

But, as I said, I don't like either of the ideas. There are very smart people on this list and someone will probably give you some more advice.

Regards,
/Marcus

--
Marcus Rejås          jabber: mar...@jabber.rejas.se
Rejås Datakonsult     e-mail: mar...@rejas.se
Kaserngatan 1         web: http://www.rejas.se
s-761 46 Norrtälje    gpg-key: http://gpg.rejas.se
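The second suggestion above (filtering inside the notification command) could look roughly like the following. This is only a sketch of the decision logic, not a tested notification command: it assumes the command passes the values of $NOTIFICATIONTYPE$ and $HOSTNOTIFICATIONNUMBER$ to a script, and that the on-call escalation is widened so that on-call is still a contact when the recovery is sent. The function name and the first_wanted parameter are made up for illustration.

```python
def should_deliver(notification_type, notification_number, first_wanted=2):
    """Decide whether the on-call group should actually receive this notification.

    notification_type   -- value of $NOTIFICATIONTYPE$, e.g. "PROBLEM" or "RECOVERY"
    notification_number -- value of $HOSTNOTIFICATIONNUMBER$ as an integer
    first_wanted        -- the single problem notification on-call wants (assumption)
    """
    if notification_type == "RECOVERY":
        return True  # always pass recoveries through
    if notification_type == "PROBLEM":
        # Suppress repeats: only the one escalated problem notification is delivered.
        return notification_number == first_wanted
    # Acknowledgements, flapping, downtime: passed through here (an assumption).
    return True


for kind, number in [("PROBLEM", 2), ("PROBLEM", 3), ("RECOVERY", 3)]:
    print(kind, number, should_deliver(kind, number))
```

As the message warns, logic like this can silently drop notifications if it is wrong, so it would need careful testing before use.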
Re: [Nagios-users] nagios future?
Hi,

Andreas Ericsson wrote the following on 09.06.2009 15:09:
> Thanks for those links. I'm far too lazy to look them up myself ;-)

So am I - Oracle makes me kind of crazy ;-)

> Nagios will move to git when 3.2.0 is out the door. Ethan wants some
> time to manage patches and stuff like he's used to without having to
> learn another tool. I'm sure he'll curse himself for not switching
> sooner when he learns the benefits of git, but at least we're getting
> there.

Well, some common aliases from cvs for git will help too ;-)

I've been looking into git for about 3 weeks and I like to use this cheatsheet a lot:
http://ktown.kde.org/~zrusin/git/git-cheat-sheet-medium.png

> One of the annoying things about the Icinga fork though is that they've
> mainly done a lot of renaming and not so much actual patching. This
> will of course merge cleanly, but in an unsatisfactory way for Nagios.
> Messy, but certainly possible to work around.

Yep, that is true, but in order to say "Hey, it's like Nagios but not the same", all names had to be removed or changed. Concerning merging patches, it shouldn't be that big a problem. Current Nagios patches have been pulled over and merged into the current Icinga source, so it should work in the other direction too.

> Very realistic. We're already using it for development to that
> purpose, and it's working just fine. One problem with NDOUtils is that
> the database schema makes it impossible to write stuff for it that
> scales linearly. That's totally unacceptable for us, so we had to come
> up with something new.
>
> Fortunately, Lars Hjemli of the NagVis project has been very friendly
> and cooperative in helping us add support for the Merlin database
> schema in NagVis. Given how simple the Merlin schema is, I have no
> doubt that we'll provide patches to other projects to achieve the same
> thing.

Yeah, I like that move, because everyone is being held back by the NDO database schema, which is far too normalized and doesn't scale.

And my biggest concern right now: Oracle limits table and column names to a maximum of 30 characters (varchar2(30)). Maybe you'll keep an eye on that while testing your schema.

> It's been planned, implemented, tested and available since 2009-03-17.
> Additional bugfixes happened later, but libdbi has been in use in
> Merlin for almost three months now.

OK, good to hear that. Some query normalizations and other database-specific stuff will pop up for sure.

I've been working with the libdbi driver for Oracle and it seems to work (connecting to a remote Oracle server using IDOUtils). When everything works out, I hope to push the libdbi Oracle source to Icinga IDOUtils soon. Even though IDO and Merlin are different, I hope libdbi knowledge can be shared in this case :)

Kind regards,
Michael

> I'm working (but very slowly) on some patches to address the multiple
> memory allocations required to use libdbi for quoting strings etc.,
> since it prevents us from using a static arena to do the quoting in,
> but that will take a while to complete, so we're living with that
> microscopic deficiency for now.
>
> $ git show 084cdc85
> commit 084cdc85d7b0c8a4f721804476979e904e4afe7a
> Author: Andreas Ericsson a...@op5.se
> Date: Tue Mar 17 10:44:47 2009 +0100
>
>     Use libdbi for database abstraction
>
>     In some ways it's worse, since we're now forced to allocate and
>     deallocate a lot of memory for each request, but in other ways
>     it's pure win as we can now let users use whatever database type
>     they want.
>
>     Signed-off-by: Andreas Ericsson a...@op5.se

--
DI (FH) Michael Friedrich michael.friedr...@univie.ac.at
Tel: +43 1 4277 14359
Vienna University Computer Center
Universitaetsstrasse 7
A-1010 Vienna, Austria
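The 30-character limit on Oracle table and column names mentioned above is easy to check for mechanically. The sketch below flags identifiers that exceed it; the identifier names are invented for illustration and are not taken from the Merlin or NDO schemas.

```python
ORACLE_MAX_IDENTIFIER_LENGTH = 30  # limit quoted in the message above


def too_long(identifiers, limit=ORACLE_MAX_IDENTIFIER_LENGTH):
    """Return the identifiers that exceed the given length limit, in input order."""
    return [name for name in identifiers if len(name) > limit]


# Hypothetical table and column names, for illustration only.
candidates = [
    "host",
    "last_hard_state_change",
    "notification_escalation_contact_members",
]

for name in too_long(candidates):
    print(f"{name} is {len(name)} characters long")
```

Running a check like this against a schema definition before testing on Oracle would catch overlong names early.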
Re: [Nagios-users] Recovery notifications after escalations
Marcus Rejås wrote: On 06/09 12:06, Ulf Karlsson wrote:

Hi, We have a situation here where we would like to notify the on-call group after 60 minutes and the support group after 240 minutes. If services go down and then recover, everyone who has received a notification of a host problem should also receive the recovery notification. See the configuration below. Now, our problem is that when the second escalation has been activated and the support group has received the notification, only the support group will receive the recovery notification - the on-call group will never see the recovery notification. We do not want to send out multiple notifications to the on-call group for the same issue since they then would be spammed by Nagios unnecessarily.

I don't (at least not yet) have a good answer. But maybe I can put some ideas in your head. My first thought is that if they want the recovery notification maybe they would not mind the extra one either. The extra one actually tells them that the issue was escalated and might be useful information. If they don't want the issue to escalate, they should acknowledge it (sticky). In order to make it work the way you ask, I have two suggestions. Neither of them is good. If you do not have that many contacts, create an additional one for each member of the on-call group with only recovery alerts and put them in a group, e.g. on-call-recovery, and escalate to that one. They will now get the recovery notification.

I don't think they will. There are checks to make sure recovery notifications are only sent to contacts who have received the previous problem notification.

Another alternative is to modify your notification command to take notice of the macros $SERVICENOTIFICATIONNUMBER$ and maybe $HOSTNOTIFICATIONNUMBER$ and build the logic you wish. Make sure to do it right so you don't miss important notifications. But, as I said, I don't like any of the ideas. There are very smart people on this list and someone will probably give you some more advice.

Sending a patch to make sure each problem object in the Nagios core contains a concatenated list of normal and escalated contacts would be favourite, since that would mean everyone who received the problem notification will also get the recovery notification. This would best be implemented by building a linked list with only unique elements to operate on. The list should probably contain a marker to mention which contacts were added from the escalation, so the original contacts do not get notified if they don't want to get the escalated notifications.

-- Andreas Ericsson andreas.erics...@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
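The data structure suggested in the reply above (one list of unique contacts per problem, each entry flagged if it came from an escalation) could look roughly like this. It is a Python illustration only, not Nagios core code (which is C); `NotifiedContact`, `ProblemContacts` and the contact names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NotifiedContact:
    name: str
    escalated: bool  # True if this contact was added by an escalation


@dataclass
class ProblemContacts:
    """Unique contacts that have received a problem notification."""
    _by_name: dict = field(default_factory=dict)

    def add(self, name, escalated=False):
        # Keep only the first entry per contact so nobody is listed twice;
        # a contact from the normal list stays marked as non-escalated.
        if name not in self._by_name:
            self._by_name[name] = NotifiedContact(name, escalated)

    def recovery_recipients(self):
        # Everyone who saw the problem gets the recovery, in insertion order.
        return [c.name for c in self._by_name.values()]

    def escalation_recipients(self):
        # Only contacts that were added through an escalation.
        return [c.name for c in self._by_name.values() if c.escalated]


contacts = ProblemContacts()
contacts.add("oncall-anna")                   # notified first
contacts.add("support-bo", escalated=True)    # added by a later escalation
contacts.add("oncall-anna", escalated=True)   # already listed, ignored
```

With a list like this, the recovery goes to `recovery_recipients()`, while repeated escalated problem notifications would only go to `escalation_recipients()`.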
Re: [Nagios-users] nagios future?
Michael Friedrich wrote: Andreas Ericsson wrote the following on 09.06.2009 15:09:

Nagios will move to git when 3.2.0 is out the door. Ethan wants some time to manage patches and stuff like he's used to without having to learn another tool. I'm sure he'll curse himself for not switching sooner when he learns the benefits of git, but at least we're getting there.

Well, some common aliases from cvs for git will help too ;-) I've been looking into git for about 3 weeks and I like to use this cheatsheet a lot: http://ktown.kde.org/~zrusin/git/git-cheat-sheet-medium.png

Ugh. I absolutely hate that, because it just tells you "do this, do that" but doesn't explain *why*. It never mentions why the index is there, or how you can use it when you run into stuff that's actually *hard*, such as an 8-way merge that suddenly went wahoonie-shaped. But to each his own, I guess.

One of the annoying things about the icinga-fork though is that they've mainly done a lot of renaming and not so much actual patching. This will of course merge cleanly but in an unsatisfactory way for Nagios. Messy, but certainly possible to work around.

Yep, that is true, but to say "Hey, it's like Nagios but not the same", all names had to be removed/changed. But merging patches shouldn't be that big a problem. Current Nagios patches have been pulled over and merged into the actual Icinga source. So it should work backwards too.

It has? I'll have to take a look at that, I think. The hard part will be to separate the cruft from the code, so that only the real changes appear in a diff. Some simple sed magic will probably do the trick though.

Very realistic. We're already using it for development to that purpose, and it's working just fine. One problem with NDOUtils is that the database schema makes it impossible to write stuff for it that scales linearly. That's totally unacceptable for us, so we had to come up with something new.
Fortunately, Lars Hjemli of the NagVis project has been very friendly and cooperative in helping us add support for the Merlin database schema in NagVis. Given how simple the Merlin schema is, I have no doubt that we'll provide patches to other projects to achieve the same thing.

Yeah, I like that move because everyone is held back by the NDO DB schema, which is far too normalized and doesn't scale. And my biggest concern right now: Oracle limits table and column names to max 30 characters (varchar2(30)). Maybe you'll keep an eye on that while testing your schema.

I haven't actually thought about it. A quick glance reveals that the serviceescalation_contactgroup junction table is the one with the longest name, weighing in at 31 characters. That can be fixed quite easily though, since junction table names are determined by a function which can easily special-case this particular one.

It's been planned, implemented, tested and available since 2009-03-17. Additional bugfixes happened later, but libdbi has been in use in Merlin almost three months now.

OK, good to hear that - some query normalizations and other database-specific stuff will pop up for sure. I've been working on the libdbi driver for Oracle and it seems to work (connection using the IDOUtils to a remote Oracle server). When everything works out I hope to push the source for libdbi Oracle to Icinga IDOUtils soon. Even though IDO and Merlin are different, I hope libdbi knowledge can be shared in this case :)

Since libdbi provides a database-agnostic api (it would be quite useless if it didn't), a simple thing such as loading the correct driver should suffice to make it work with Merlin as well. Which driver to use can be specified in the Merlin configuration file. However, there's currently no oracle driver for Kohana that I'm aware of, and that means Ninja won't be able to benefit from an Oracle database even if Merlin can write to it.
-- Andreas Ericsson andreas.erics...@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231
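The idea of a naming function that special-cases over-long junction table names can be sketched as below. This is not Merlin's actual code; the function name, the special-case table and the object names in it are all hypothetical, and only the 30-character limit comes from the thread.

```python
ORACLE_MAX_IDENTIFIER = 30  # limit quoted in the thread for Oracle names

# Hypothetical overrides for pairs whose default name would be too long.
SPECIAL_CASES = {
    ("verylongescalationname", "contactgroup"): "vlen_contactgroup",
}


def junction_table_name(left, right, max_len=ORACLE_MAX_IDENTIFIER):
    """Build '<left>_<right>', honouring overrides and the length limit."""
    override = SPECIAL_CASES.get((left, right))
    if override is not None:
        return override
    name = f"{left}_{right}"
    if len(name) > max_len:
        # Fail loudly so a new over-long pair gets an explicit override.
        raise ValueError(f"{name!r} is {len(name)} characters, limit is {max_len}")
    return name
```

Raising on an unknown over-long name, rather than truncating silently, keeps two different pairs from ever collapsing into the same table name.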
Re: [Nagios-users] nagios monitoring for hack
I'd look into the various hardening and monitoring tools available (Bastille, Tripwire, chroot, etc). There are different tools for different purposes, obviously. We chroot all our BIND and Apache stuff. Bastille is great for hardening the environment. Tripwire monitors for changes to key files. Each program has its own logging mechanisms. So once you have your tool in place, you can use Nagios to watch the log file(s) and generate alerts based on keywords (ALERT, WARN, CRIT, etc). You can also dump your logs to an alternate server and have Nagios watch them from there, but in the case of a DDoS attack, your bandwidth may be affected for remote syslog and/or Nagios network checks.

A. Davis Email: ncc...@gmail.com There is no limit to what a man can accomplish if he doesn't care who gets the credit. - Ronald Reagan

shadih rahman wrote: our web sites got hacked and we were subjected to ddos for last few days. I wanted to know what can I do for monitoring to find out if I am hacked or not. By the way, we were hacked by php exploits. Please advise on this. Thanks -- Cordially, Shadhin Rahman
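The log-watching approach from the reply above (alert on keywords such as ALERT, WARN, CRIT) can be sketched as a small plugin. The exit codes are the standard Nagios plugin return codes; the keyword-to-severity mapping and the function names are assumptions made for this example.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping from keyword to severity; adjust to the tool's log format.
SEVERITY = {"WARN": WARNING, "ALERT": CRITICAL, "CRIT": CRITICAL}
PATTERN = re.compile(r"\b(ALERT|WARN|CRIT)\b")
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def scan_lines(lines):
    """Return (exit_code, message) for an iterable of log lines."""
    worst, hits = OK, 0
    for line in lines:
        found = PATTERN.findall(line)
        if found:
            hits += 1
            # A line may carry several keywords; keep the most severe one.
            worst = max(worst, max(SEVERITY[word] for word in found))
    return worst, f"{LABELS[worst]} - {hits} matching log line(s)"


def main(path):
    """Scan one log file, print the status line, return the exit code."""
    try:
        with open(path, errors="replace") as handle:
            code, message = scan_lines(handle)
    except OSError as exc:
        code, message = UNKNOWN, f"UNKNOWN - cannot read {path}: {exc}"
    print(message)
    return code
```

Run as a check, the script would end with `sys.exit(main(sys.argv[1]))`. This sketch rescans the whole file on every run; a production check would remember the last read position so that old entries do not alert forever.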
[Nagios-users] ndo utils question
All, I have been running ndoutils with nagios for a while. When I initially set up my nagios, I played around with a lot of different service checks and changed around a lot of config parameters. Now, I have a solid setup and I have not changed the configuration for a while. When I go into the database and look at the nagios_objects tables, I see all sorts of old objects which do not exist in my current setup. Does ndoutils clean up and throw away old config when we start nagios? Please advise on this. Thanks -- Cordially, Shadhin Rahman
[Nagios-users] check_hpasm (3.5) problem on RHEL3
I have check_hpasm installed successfully on RHEL3 but I am unable to get it to work (I have it working correctly for me on RHEL4 and 5). Here is the problem that I face with RHEL3:

[nag...@mysys nagios]$ /usr/local/nagios/libexec/check_nrpe -H localhost -c check_hpasm
UNKNOWN - insufficient rights to call /usr/sbin/hpacucli
[nag...@mysys nagios]$ sudo /usr/local/nagios/libexec/check_hpasm -v
UNKNOWN - insufficient rights to call /usr/sbin/hpacucli

My /etc/sudoers has

nagios ALL=NOPASSWD:/sbin/hpasmcli,/usr/sbin/hpacucli,/usr/local/nagios/libexec/check_hpasm

Calling /usr/sbin/hpacucli works correctly using sudo:

[nag...@mysys nagios]$ sudo /usr/sbin/hpacucli
HP Array Configuration Utility CLI 7.40.7.0
Detecting Controllers...Done.
Type help for a list of supported commands.
Type exit to close the console.
=
[nag...@mysys nagios]$ sudo /usr/sbin/hpacucli -s help
To enter the ACU CLI console type: hpacucli
Commands can also be executed from outside the ACU CLI console using the syntax: hpacucli target command [param[=value]]
All targets, commands, parameters, and values must be entered in lowercase. The only exceptions to this are user-specified names, such as chassisname.
target command [param[=value]]
target is of format:
[controller all|slot=#|wwn=#|chassisname=AAA]
[array all|id]
[physicaldrive all|#:#:#|allunassigned]
[logicaldrive all|#]
Note: The first # in physicaldrive is only needed for systems that specify port:box:bay. Other physical drive targeting schemes are box:bay and port:id.
Example targets:
controller all
controller slot=5
controller chassisname=Lab C
controller serialnumber=P21DA2322S
controller wwn=500308B300701011
controller slot=1 array all
controller slot=7 array A
ctrl slot=1 pd allunassigned
controller slot=2 logicaldrive all
controller slot=5 ld 5
controller slot=5 physicaldrive 1:5
controller slot=5 physicaldrive 1E:2:3
command can be create,delete,modify,show,rescan
For detailed command information type any of the following:
help add
help create
help delete
help modify
help remove
help shorthand
help show
help target
help rescan

What else could I try? Thanks! Charanbeer
[Nagios-users] DNS down and false alerts...
I've observed an interesting issue with Nagios. Our environment is a mix of UNIX, Linux, Apple, and Windows. The core of the network is Active Directory including two AD servers that are both our primary, internal DNS servers. All non-Windows systems have a resolv.conf that looks like:

nameserver 10.1.1.13
nameserver 10.1.1.14
domain int.our.domain
search int.our.domain

About half of the servers have the nameserver entries inverted (ie: .14 first, .13 second). The issue is that anytime one of the nameservers is rebooted (at least once a month if staying current on patches thanks to Black Tuesdays), whichever hosts have that nameserver listed first in their resolv.conf start throwing the following error:

CRITICAL - Plugin timed out while executing system call.

This occurs for multiple tests for each host. Obviously, there's a name resolution correlation here. If the nameserver with .13 is rebooted, all hosts (about half of them) that list this IP first in their resolv.conf then time out for multiple tests. If the .14 server is rebooted, all the other hosts time out. Interestingly, none of the Windows clients issue errors... only UNIX, Linux, and Macs... only those with an /etc/resolv.conf. The end result is a host of false positives, but more importantly it looks bad on availability reports and causes phones/pagers to go ballistic with unneeded emails. I'm trying to find a solution and I can't find one that I like:

Solution 1) is to cluster the DNS servers. We have lots of clusters here. This isn't good, though, as you don't normally cluster DNS servers... they're meant to be redundant for a reason... one fails and it uses the next one.

Solution 2) is to set up a service/host dependency. My thought would be either a host dependency that says if either .13 or .14 is down, then don't alert for any other host that uses them. Or a service to host dependency... if the DNS service is down, then don't alert on any of these dependent hosts. Honestly, I'm not sure if you can mix host and service dependencies like this... plus... if the DNS server is actually down, then the DNS service is down, so better to use a host dependency. The problem is that now we're not alerting on any dependent hosts which themselves could have a legitimate issue we want to know about. Plus, what happens if the DNS server actually dies and takes a few hours/days to rebuild/restore? At this point, the dependent hosts aren't watched for a very long time.

Solution 3) is to set up a UNIX/Linux DNS server that slaves all zones from the AD servers and have all UNIX/Linux/Apple clients query from this server. This would work except that A) I need two of them to keep redundancy and B) I've now added an extra layer of complication to resolve an application (Nagios)... not exactly good practice.

Solution 4) is to set the timeout value of a host querying a DNS server. Perhaps adjust the client to time out on the first listed nameserver after only 10 seconds, then try the next one? Since most Nagios tests have a minimum timeout value of 30 seconds, if the first DNS query timed out after 10 seconds, it would go to the next one with, hopefully, enough time to respond. The downside is having to adjust every single server.

Has anyone else seen this? Anyone else using Windows AD servers to provide DNS for *nix servers?

-- A. Davis Email: ncc...@gmail.com There is no limit to what a man can accomplish if he doesn't care who gets the credit. - Ronald Reagan
[Nagios-users] Log2ndo Not Placing Historical Logs into DB
My old logs aren't going into the DB. I am running as follows:

./log2ndo -s /usr/local/nagios/var/archives/*.log -d /usr/local/nagios/var/ndo.sock -i default -t unix -p 5668

but nothing is going into the db. I see a successful db connection and a successful disconnect in the log, but nothing is being entered. Running a single instance of nagios, default setup of db. NDOUtils 1.47b and Nagios 3.1.0. -Derek
Re: [Nagios-users] DNS down and false alerts...
Option 5: Install a local caching DNS server on your nagios box, and put 127.0.0.1 at the top of resolv.conf.

Cheers, Phil -- Phil Randal | Networks Engineer Herefordshire Council | Deputy Chief Executive's Office | I.C.T. Services Division Thorn Office Centre, Rotherwas, Hereford, HR2 6JT Tel: 01432 260160 email: pran...@herefordshire.gov.uk

From: Andrew Davis [mailto:ncc...@gmail.com]
Sent: 09 June 2009 16:19
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] DNS down and false alerts...

I've observed an interesting issue with Nagios. Our environment is a mix of UNIX, Linux, Apple, and Windows. The core of the network is Active Directory including two AD servers that are both our primary, internal DNS servers. All non-Windows systems have a resolv.conf that looks like:

nameserver 10.1.1.13
nameserver 10.1.1.14
domain int.our.domain
search int.our.domain

About half of the servers have the nameserver entries inverted (ie: .14 first, .13 second). The issue is that anytime one of the nameservers is rebooted (at least once a month if staying current on patches thanks to Black Tuesdays), whichever hosts have that nameserver listed first in its resolv.conf start throwing the following errors:

CRITICAL - Plugin timed out while executing system call.

This occurs for multiple tests for each host.
[Nagios-users] Synchronizing turning off notifications.
We have two Nagios servers with one acting as a fallback. We run a sync program every time there is an update. This sync copies the config files from mon1 to mon2, stops and restarts the backup. Works well, but the problem we're having is that it is not syncing when we turn notification off on mon1. If it falls back to mon2 it will page for that device. Does anyone know where Nagios stores the notification=off option when it is changed via the web interface? I suspect we're missing the copy of that file. Thanks, Michael Lucker
Re: [Nagios-users] nagios future?
Andreas Ericsson wrote the following on 09.06.2009 16:33:

It has? I'll have to take a look at that, I think. The hard part will be to separate the cruft from the code, so that only the real changes appear in a diff. Some simple sed magic will probably do the trick though.

It would be bad if it hasn't - as long as both projects are mainly compatible they can profit from each other. Git is really nice for that, but you should ask Hendrik how to deal with that :-) https://git.icinga.org/index?p=icinga-core.git;a=summary

I haven't actually thought about it. A quick glance reveals that the serviceescalation_contactgroup junction table is the one with the longest name, weighing in at 31 characters. That can be fixed quite easily though, since junction table names are determined by a function which can easily special-case this particular one.

Mh, thanks for the tip. I need to think about that in more depth.

Since libdbi provides a database-agnostic api (it would be quite useless if it didn't), a simple thing such as loading the correct driver should suffice to make it work with Merlin as well. Which driver to use can be specified in the Merlin configuration file. However, there's currently no oracle driver for Kohana that I'm aware of, and that means Ninja won't be able to benefit from an Oracle database even if Merlin can write to it.

What I am missing in the libdbi implementation is parameter binding, which really is a performance tweak when running lots of queries with different values. Another headache, but maybe I'll hack that and send a patch to the developers. For now it mostly matters for Oracle. About Kohana - I had a short look into the database drivers. Oracle support won't be that big a problem to implement, but that won't be me. Hopefully it will be done, because then Ninja and Merlin combined with Nagios would be an alternative to Icinga with optimized IDO for Oracle (which is my main task right now).

Kind regards, Michael -- DI (FH) Michael Friedrich michael.friedr...@univie.ac.at Tel: +43 1 4277 14359 Vienna University Computer Center Universitaetsstrasse 7 A-1010 Vienna, Austria
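The "sed magic" discussed in this thread, mapping the fork's renamed identifiers back before diffing so that only real changes remain, might look like the following in Python. The rename map is purely hypothetical; the actual set of renames would have to be taken from the fork's history.

```python
import difflib
import re

# Hypothetical rename map (fork spelling -> original spelling).
RENAMES = {"icinga": "nagios", "Icinga": "Nagios", "ICINGA": "NAGIOS"}
_PATTERN = re.compile("|".join(map(re.escape, RENAMES)))


def undo_renames(text):
    """Map renamed identifiers back so they no longer show up in a diff."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], text)


def real_changes(original, forked):
    """Unified diff of original vs. fork with pure renames filtered out."""
    before = original.splitlines()
    after = undo_renames(forked).splitlines()
    return list(difflib.unified_diff(before, after, "nagios", "fork", lineterm=""))
```

A plain substitution like this also rewrites occurrences that were never renames (for example a new URL containing the fork's name), so the resulting diff still needs a human eye.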
Re: [Nagios-users] DNS down and false alerts...
Really the best choice is to use a caching DNS server on the Nagios server. I'd recommend dnsmasq; it just does caching locally without needing to do big zone transfers. It has low overhead and simple configuration as a result. Enjoy.

On Tue, Jun 09, 2009 at 11:19:20AM -0400, Andrew Davis wrote:

I've observed an interesting issue with Nagios. Our environment is a mix of UNIX, Linux, Apple, and Windows. The core of the network is Active Directory including two AD servers that are both our primary, internal DNS servers. All non-Windows systems have a resolv.conf that looks like:

nameserver 10.1.1.13
nameserver 10.1.1.14
domain int.our.domain
search int.our.domain

About half of the servers have the nameserver entries inverted (ie: .14 first, .13 second). The issue is that anytime one of the nameservers is rebooted (at least once a month if staying current on patches thanks to Black Tuesdays), whichever hosts have that nameserver listed first in its resolv.conf start throwing the following errors:

CRITICAL - Plugin timed out while executing system call.

This occurs for multiple tests for each host. Obviously, there's a name resolution correlation here. If the nameserver with .13 is rebooted, all hosts (about half of them) that list this IP first in their resolve.conf then timeout for multiple tests. If the .14 server is rebooted, all the other hosts timeout. Interestingly, none of the Windows clients issue errors... only UNIX, Linux, and Mac's... only those with an /etc/resolv.conf. The end result is a host of false positives, but more importantly it looks bad on availability reports and causes phones/pagers to go ballistic with unneeded emails. I'm trying to find a solution and I can't find one that I like: Solution 1) is to cluster the DNS servers. We have lots of clusters here. This isn't good, though, as you don't normally cluster DNS servers... they're meant to be redundant for a reason... one fails and it uses the next one.
-- Russell Adams rlad...@adamsinfoserv.com PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3
Re: [Nagios-users] Synchronizing turning off notifications.
On Jun 9, 2009, at 10:55 AM, Mike lucker wrote:

We have two Nagios servers with one acting as a fallback. We run a sync program every time there is an update. This sync copies the config files from mon1 to mon2, stops and restarts the backup. Works well, but the problem we're having is that it is not syncing when we turn notification off on mon1. If it falls back to mon2 it will page for that device. Does anyone know where Nagios stores the notification=off option when it is changed via the web interface? I suspect we're missing the copy of that file.

It's stored in memory and periodically written out to the retention file and status file based on your schedule or on shutdown. I haven't tried it but I'd suggest shutting down nagios on the master to ensure that the retention file is up-to-date, shutting down the backup, rsyncing and restarting both. The retention file is the one you want, as the status file is recreated on startup based on config+retention.dat.

-- Marc
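One way to check that the retention file really carried the setting across is to list the objects with notifications switched off on both servers and compare. The sketch below assumes the retention file consists of `type {` ... `}` blocks with `key=value` lines and a `notifications_enabled` field; verify this against your own retention.dat before relying on it. The function name is hypothetical.

```python
def disabled_notifications(retention_text):
    """Return (type, host, service) for blocks with notifications_enabled=0."""
    result = []
    block_type, fields = None, {}
    for raw in retention_text.splitlines():
        line = raw.strip()
        if line.endswith("{"):
            # Start of a block such as "host {" or "service {".
            block_type, fields = line[:-1].strip(), {}
        elif line == "}":
            if (block_type in ("host", "service")
                    and fields.get("notifications_enabled") == "0"):
                result.append((block_type,
                               fields.get("host_name"),
                               fields.get("service_description")))
            block_type = None
        elif block_type and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return result
```

Running this over the file from mon1 and from mon2 after a sync and comparing the two lists would show whether a disabled host is missing on the backup.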
Re: [Nagios-users] Synchronizing turning off notifications.
From: Mike lucker [mailto:mike.luc...@gmail.com]
Sent: Tuesday, June 09, 2009 11:56 AM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Synchronizing turning off notifications.

We have two Nagios servers with one acting as a fallback. We run a sync program every time there is an update. This sync copies the config files from mon1 to mon2, stops and restarts the backup. Works well, but the problem we're having is that it is not syncing when we turn notification off on mon1. If it falls back to mon2 it will page for that device. Does anyone know where Nagios stores the notification=off option when it is changed via the web interface? I suspect we're missing the copy of that file.

Thanks, Michael Lucker

Michael,

This information is stored in the status.dat file as part of the running configuration from mon1. If mon2 is actually running, I'd recommend sending an external command to that running instance to disable notifications. See http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=7

You could then run the enable notifications command at some point when you need it to act as the real server: http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=8

Mark
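A minimal sketch of the external-command approach, assuming the two linked commands are DISABLE_NOTIFICATIONS and ENABLE_NOTIFICATIONS and that the command file lives at the default source-install path (check command_file in nagios.cfg; the path below is an assumption). External commands are single lines of the form "[timestamp] NAME;arg;arg" written to that file.

```python
import time


def format_external_command(name, args=(), now=None):
    """Build one external command line: "[epoch] NAME;arg1;arg2"."""
    timestamp = int(time.time() if now is None else now)
    fields = ";".join([name, *map(str, args)])
    return f"[{timestamp}] {fields}\n"


def submit(command_line,
           command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the line to the Nagios command pipe (assumed default path)."""
    with open(command_file, "w") as pipe:
        pipe.write(command_line)


if __name__ == "__main__":
    # On the standby: silence notifications until it has to take over.
    print(format_external_command("DISABLE_NOTIFICATIONS"), end="")
    # On failover: turn them back on.
    print(format_external_command("ENABLE_NOTIFICATIONS"), end="")
```

The sync job on mon1 could call submit() on mon2 (over ssh, for instance) right after restarting the backup, and the failover procedure could send the enable command.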
Re: [Nagios-users] DNS down and false alerts...
On Jun 9, 2009, at 10:42 AM, Randal, Phil wrote:

Option 5: Install a local caching DNS server on your nagios box, and put 127.0.0.1 at the top of resolv.conf.

My reading of the issue, and I believe that I've seen it in the past as well, is that the problem isn't with DNS resolution on the nagios box but with DNS resolution happening on the target boxes. Installing a caching nameserver on the nagios box isn't going to help any. The target system is trying to do a DNS lookup on the connecting host (nagios). The OP isn't specific on how he's checking these boxes, so it could be xinetd, nrpe, whatever... The default timeout for DNS server failure detection in the resolver libraries is too long, so the plugin times out. I'd personally look at changing that timeout and the rotation between servers in resolv.conf (options timeout:x rotate).

-- Marc
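To make the resolv.conf suggestion concrete, here is a small sketch that renders a file with the options line added. The nameserver addresses and search domain are the ones from the original post; the timeout value of 2 seconds is purely illustrative, and the optional attempts setting is an extra that was not mentioned in the thread.

```python
def render_resolv_conf(nameservers, search, timeout=2, rotate=True,
                       attempts=None):
    """Render resolv.conf text with a shorter per-server timeout.

    timeout is in seconds per query; rotate spreads queries across the
    listed servers; attempts (optional, not from the thread) caps retries.
    """
    options = [f"timeout:{timeout}"]
    if attempts is not None:
        options.append(f"attempts:{attempts}")
    if rotate:
        options.append("rotate")
    lines = [f"nameserver {ip}" for ip in nameservers]
    lines.append(f"search {search}")
    lines.append("options " + " ".join(options))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_resolv_conf(["10.1.1.13", "10.1.1.14"], "int.our.domain"),
          end="")
```

Generating the file from one template also makes it easier to push the same settings to every target host with whatever config management is already in place.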
Re: [Nagios-users] DNS down and false alerts...
I don't know if I'm misreading the OP, but if the plugins start timing out on only the boxes whose primary DNS is being rebooted, would adding a caching DNS server to the Nagios box really make a difference?

I think the root cause of these timeouts is that the Nagios plugin timeout is happening before the connection to the primary DNS on the target machine has a chance to time out and then connect to the secondary DNS. The correct course of action to resolve this would be to either make sure that the DNS connection on the target machines fails more quickly, or that Nagios/the plugin waits longer for a result from the check. The DNS failover is working as designed here, but you're not giving it enough time to kick in.

On Tue, Jun 9, 2009 at 5:37 PM, Russell Adams rlad...@adamsinfoserv.com wrote:

Really the best choice is to use caching DNS on the Nagios server. I'd recommend dnsmasq, it just does caching locally without needing to do big zone transfers. It has low overhead and simple configuration as a result. Enjoy.

On Tue, Jun 09, 2009 at 11:19:20AM -0400, Andrew Davis wrote:

I've observed an interesting issue with Nagios. Our environment is a mix of UNIX, Linux, Apple, and Windows. The core of the network is Active Directory, including two AD servers that are both our primary, internal DNS servers. All non-Windows systems have a resolv.conf that looks like:

    nameserver 10.1.1.13
    nameserver 10.1.1.14
    domain int.our.domain
    search int.our.domain

About half of the servers have the nameserver entries inverted (i.e. .14 first, .13 second). The issue is that any time one of the nameservers is rebooted (at least once a month if staying current on patches, thanks to Black Tuesdays), whichever hosts have that nameserver listed first in their resolv.conf start throwing the following error:

    CRITICAL - Plugin timed out while executing system call.

This occurs for multiple tests for each host. Obviously, there's a name resolution correlation here. If the nameserver with .13 is rebooted, all hosts (about half of them) that list this IP first in their resolv.conf then time out for multiple tests. If the .14 server is rebooted, all the other hosts time out. Interestingly, none of the Windows clients issue errors... only UNIX, Linux, and Macs... only those with an /etc/resolv.conf.

The end result is a host of false positives, but more importantly it looks bad on availability reports and causes phones/pagers to go ballistic with unneeded emails. I'm trying to find a solution and I can't find one that I like:

Solution 1) is to cluster the DNS servers. We have lots of clusters here. This isn't good, though, as you don't normally cluster DNS servers... they're meant to be redundant for a reason... one fails and it uses the next one.

Solution 2) is to set up a service/host dependency. My thought would be either a host dependency that says if either .13 or .14 is down, then don't alert for any other host that uses them. Or a service to host dependency... if the DNS service is down, then don't alert on any of these dependent hosts. Honestly, I'm not sure if you can mix host and service dependencies like this... plus... if the DNS server is actually down, then the DNS service is down, so better to use a host dependency. The problem is that now we're not alerting on any dependent hosts which themselves could have a legitimate issue we want to know about. Plus, what happens if the DNS server actually dies and takes a few hours/days to rebuild/restore? At this point, the dependent hosts aren't watched for a very long time.

Solution 3) is to set up a UNIX/Linux DNS server that slaves all zones from the AD servers and have all UNIX/Linux/Apple clients query from this server. This would work except that A) I need two of them to keep redundancy and B) I've now added an extra layer of complication to resolve an application (Nagios)... not exactly good practice.

Solution 4) is to set the timeout value of a host querying a DNS server. Perhaps adjust the client to time out on the first listed nameserver after only 10 seconds, then try the next one? Since most Nagios tests have a minimum timeout value of 30 seconds, if the first DNS query timed out after 10 seconds, it would go to the next one with, hopefully, enough time to respond. The downside is having to adjust every single server.

Has anyone else seen this? Anyone else using Windows AD servers to provide DNS for *nix servers?

-- A. Davis Email: ncc...@gmail.com
There is no limit to what a man can accomplish if he doesn't care who gets the credit. - Ronald Reagan
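The timing argument behind Solution 4 can be checked with a rough back-of-the-envelope model. This is a deliberate simplification, not how any particular resolver is guaranteed to behave: it assumes every lookup on the target waits one full resolver timeout on the dead first nameserver before the second one answers, and that lookups happen one after another.

```python
def check_survives(plugin_timeout, resolver_timeout, lookups=1):
    """Return True if the DNS delay alone stays under the plugin timeout.

    Simplified model: each lookup loses resolver_timeout seconds waiting
    on the unreachable first nameserver before the second one answers.
    """
    delay = resolver_timeout * lookups
    return delay < plugin_timeout


if __name__ == "__main__":
    # Figures from the post: 30 second plugin timeout, 10 second DNS timeout.
    for lookups in (1, 2, 3):
        print(lookups, check_survives(30, 10, lookups))
```

Under this model a single lookup fits comfortably in a 30 second budget, but a check that triggers several sequential lookups can still eat the whole budget, which is an argument for making the resolver timeout shorter than 10 seconds.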
Re: [Nagios-users] nagios future?
Michael Friedrich wrote:

Andreas Ericsson wrote the following on 09.06.2009 16:33:

It has? I'll have to take a look at that, I think. The hard part will be to separate the cruft from the code, so that only the real changes appear in a diff. Some simple sed magic will probably do the trick though.

Would be bad if it hasn't - as long as both projects are mainly compatible they can profit from each other.

Right. I'll have to revisit it and see what's new.

GIT is really nice for that

I know. I helped write it after all :p

but you should ask Hendrik instead how to deal with that :-)

I think I'll find a way. Thanks for the tip though.

I haven't actually thought about it. A quick glance reveals that the serviceescalation_contactgroup junction table is the one with the longest name, weighing in at 31 characters. That can be fixed quite easily though, since junction table names are determined by a function which can easily special-case this particular one.

Mh, thanks for the tip. I need to think about that in more depth.

Well, it'll be ambiguous even if one char is stripped from it, so just cutting the name at 30 chars might be worthwhile if we're on Oracle.

Since libdbi provides a database-agnostic API (it would be quite useless if it didn't), a simple thing such as loading the correct driver should suffice to make it work with Merlin as well. Which driver to use can be specified in the Merlin configuration file. However, there's currently no Oracle driver for Kohana that I'm aware of, and that means Ninja won't be able to benefit from an Oracle database even if Merlin can write to it.

The thing which I am missing in the libdbi implementation is parameter binding, which really is a performance tweak with lots of queries with different values. Another headache, but maybe I'll hack that and send a patch to the developers. Mostly it is important for Oracle meanwhile. About Kohana - I had a short look into the database drivers. Oracle support won't be that big a problem to implement, but that won't be me. Hopefully it will be done, because then Ninja and Merlin combined with Nagios would be an alternative to Icinga with optimized IDO for Oracle (which is my main task right now).

Well, unless I'm mistaken IDO will have the same database layout as NDO, so it will still suck for performance, and writing good queries for it will still be a major headache. When the storage model of the algorithm is broken, tweaking it doesn't really help. Only a rewrite can save you then.

It would be neat to see Merlin adapted to Oracle though, so if you're interested in working on that we'd sure help as much as we can.

-- Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
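The idea of cutting table names at 30 characters can be sketched as below. This is not Merlin's actual naming function (that lives in its own code base); it is a hypothetical helper that truncates each name and refuses to continue if two different names collapse to the same short form. The second table name in the usage example is made up for illustration.

```python
def truncate_identifiers(names, limit=30):
    """Map each name to its first `limit` characters, rejecting collisions.

    limit=30 reflects the identifier length cap discussed in the thread.
    """
    seen = {}      # truncated name -> original name
    mapping = {}   # original name -> truncated name
    for name in names:
        short = name[:limit]
        if short in seen and seen[short] != name:
            raise ValueError(
                f"{name!r} and {seen[short]!r} both truncate to {short!r}")
        seen[short] = name
        mapping[name] = short
    return mapping


if __name__ == "__main__":
    tables = ["serviceescalation_contactgroup",
              "serviceescalation_contact"]  # second name is hypothetical
    for original, short in truncate_identifiers(tables).items():
        print(original, "->", short)
```

Failing loudly on a collision is a design choice: a silent clash between two junction tables would be much harder to track down later than an error at schema creation time.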
Re: [Nagios-users] ndo utils question
shadih rahman wrote:

All, I have been running ndoutils with nagios for a while. When I initially set up my nagios, I played around with a lot of different service checks and changed a lot of config parameters. Now I have a solid setup and I have not changed the configuration for a while. When I go into the database and look at the nagios_objects table, I see all sorts of old objects which do not exist in my current setup. Does ndoutils clean up and throw away old config when we start nagios?

No, but it marks them as inactive (or some such).

-- Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
[Nagios-users] passive service check where display_name larger then 128 characters
Hello,

The issue is that if I define a Nagios service where the service description is longer than 128 characters, everything seems to work properly, except that when I run a script which sends a passive service check via send_nsca, the service in the Nagios GUI is not updated, although send_nsca says it was successfully sent. Looking in the Nagios log, I see that the passive check has arrived but that the service description is chopped at 128 chars.

I wonder if anyone has fixed this problem already? It looks to me as if the following line in include/common.h causes the issue:

    #define MAX_DESCRIPTION_LENGTH 128

I assume that I need to recompile nsca (for the server) and send_nsca (for the clients where I need to use a service description longer than 128). The problem is that as soon as I use the new nsca binary on the server, I expect problems with all the servers which are still using the original send_nsca.

Anybody any ideas, suggestions or solutions?

I am using the latest nsca and send_nsca, 2.7.2. nsca is running on SUSE 10.2, send_nsca on different Unix and Linux flavours.

Thanks in advance,

-- Groetjes, Paul
Re: [Nagios-users] DNS down and false alerts...
Hey... I'm the OP. We're using a mix of client tools. For Windows systems (which aren't affected by this) we use nsclient++. For our Linux servers, NRPE... for UNIX (Solaris) and OS X we're using check_by_ssh. Both the NRPE and check_by_ssh clients are affected by this.

I'm willing to give the caching nameserver on the server a try, but as others have noted, I don't think it will make a difference, as it's the local test on the client that's failing to resolve. I surely cannot do a caching nameserver setup on all clients...

A. Davis Email: ncc...@gmail.com
There is no limit to what a man can accomplish if he doesn't care who gets the credit. - Ronald Reagan
Re: [Nagios-users] Recovery notifications after escalations
On 06/09 15:47, Andreas Ericsson wrote:

Marcus Rejås wrote:

If you do not have that many contacts, create an additional one for each member in the on-call with only recovery-alerts and put them in a group, e.g. on-call-recovery and escalate to that one. They will now get the recovery notification.

I don't think they will. There are checks to make sure recovery notifications are only sent to contacts who have received the previous problem notification.

You are absolutely right (as always, however this time I took the time to test and prove myself wrong...).

I am, and was, aware of the checks you are referring to, and they really do make sense in most places, e.g. leaving and entering timeperiods. But in this context they are confusing. If I set up a contact with:

    host_notifications_enabled      1
    service_notifications_enabled   1
    service_notification_period     24x7
    host_notification_period        24x7
    host_notification_options       r
    service_notification_options    r

It will never ever get any notifications. To be honest, I cannot see any practical use for this contact, but until I tested it just now I would have said that this contact should get only recovery notifications. This is not something I need to see fixed, but it might be good to point this out under host_notification_options and service_notification_options in the manual.

Sending a patch to make sure each problem object in the Nagios core contains a concatenated list of normal and escalated contacts would be favourite, since that would mean everyone who received the problem notification will also get the recovery notification. This would best be implemented by building a linked list with only unique elements to operate on. The list should probably contain a marker to mention which contacts were added from the escalation, so the original contacts do not get notified if they don't want to get the escalated notifications.

I agree :-)

-- Marcus Rejås
Rejås Datakonsult
Kaserngatan 1
s-761 46 Norrtälje
jabber: mar...@jabber.rejas.se
e-mail: mar...@rejas.se
web: http://www.rejas.se
gpg-key: http://gpg.rejas.se
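The data structure proposed above (one list of unique contacts, each carrying a marker saying whether it came from an escalation) might look roughly like this. It is only an illustration of the shape of the list in Python; the Nagios core is written in C and no such patch is implied to exist. The contact names are made up.

```python
from typing import NamedTuple


class Recipient(NamedTuple):
    name: str
    from_escalation: bool  # True if the contact was added by an escalation


def merge_recipients(normal, escalated):
    """Concatenate normal and escalated contacts, keeping each name once.

    A contact present in both lists keeps its "normal" entry, so the
    marker is only set for contacts that appear solely via escalation.
    """
    merged = []
    seen = set()
    for from_escalation, group in ((False, normal), (True, escalated)):
        for name in group:
            if name not in seen:
                seen.add(name)
                merged.append(Recipient(name, from_escalation))
    return merged


if __name__ == "__main__":
    for recipient in merge_recipients(["alice", "bob"], ["bob", "carol"]):
        print(recipient)
```

With such a list attached to the problem, the recovery notification could simply walk it, and the marker would let the core skip escalation-only handling for the original contacts.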
Re: [Nagios-users] passive service check where display_name larger then 128 characters
Hi Paul,

On 9 Jun 2009, at 18:13, Paul Vaes wrote:

The issue is that if I define a Nagios service where the service description is longer than 128 characters, everything seems to work properly, except that when I run a script which sends a passive service check via send_nsca, the service in the Nagios GUI is not updated, although send_nsca says it was successfully sent. Looking in the Nagios log, I see that the passive check has arrived but that the service description is chopped at 128 chars.

I wonder if anyone has fixed this problem already? It looks to me as if the following line in include/common.h causes the issue:

    #define MAX_DESCRIPTION_LENGTH 128

I assume that I need to recompile nsca (for the server) and send_nsca (for the clients where I need to use a service description longer than 128). The problem is that as soon as I use the new nsca binary on the server, I expect problems with all the servers which are still using the original send_nsca.

Anybody any ideas, suggestions or solutions?

I am using the latest nsca and send_nsca, 2.7.2. nsca is running on SUSE 10.2, send_nsca on different Unix and Linux flavours.

Yes, we've spotted this too. There is a limitation in NSCA where the hostname is limited to 63 characters, the service description is limited to 127 characters and the output is limited to 511 bytes. The overall NSCA packet size is 716 bytes.

We've been looking into making this packet size variable while still maintaining compatibility with existing send_nsca clients (we've done something similar with NRPE: http://opsview-blog.opsera.com/dotorg/2008/08/enhancing-nrpe.html).

Contact me off list if you are interested in sponsoring Opsera to develop this functionality.

Ton
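Until the packet format changes, one workaround is to catch oversized fields on the sending side before they are silently chopped. The sketch below uses the limits quoted above (63, 127 and 511) and the default tab-separated input format of send_nsca; the helper names are made up, and the host and service values are placeholders.

```python
# Field limits as quoted in this thread for stock NSCA.
LIMITS = {"host": 63, "service": 127, "output": 511}


def oversized_fields(host, service, output):
    """Return the names of the fields that exceed the NSCA limits."""
    values = {"host": host, "service": service, "output": output}
    return [field for field, value in values.items()
            if len(value.encode("utf-8")) > LIMITS[field]]


def send_nsca_line(host, service, return_code, output):
    """Format one passive service result for send_nsca's standard input."""
    return f"{host}\t{service}\t{return_code}\t{output}\n"


if __name__ == "__main__":
    host, service, output = "web1", "x" * 128, "OK - all good"
    problems = oversized_fields(host, service, output)
    if problems:
        print("refusing to send, too long:", ", ".join(problems))
    else:
        print(send_nsca_line(host, service, 0, output), end="")
```

A wrapper script could pipe the formatted line into send_nsca only when the check comes back empty, which at least turns a silent non-update into a visible error.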
Re: [Nagios-users] disk IO for windows?
That is partially right. The Ctrl-Alt-Delete thing gets you to Task Manager, but only on newer versions of Windows does it give access to some counter names. The best place to go is Performance Monitor, since that's on all versions of Windows since 2000:

Control Panel -> Administrative Tools -> Computer Management -> then Performance Counter

On newer systems, from Computer Management -> Reliability and Performance -> Monitoring Tools -> Performance Monitor

Once you find Performance Monitor, click the green + to get into the Add Counters dialog. Tick the checkbox to show the counter description, then click around till you find what you need. Look for Disks for drive stuff.

Tony (Author of NC_NEt)

On Tue, Jun 9, 2009 at 9:24 AM, Andreas Ericsson a...@op5.se wrote:

dave stern - e-mail.pluribus.unum wrote:

Anyone know of a plug-in or mechanism to log local disk I/O on windows? My nagios server is currently using check_nt to connect to windows hosts via nsclient++. I was hoping perhaps COUNTER has something buried within it to pull down this info.

There are indeed counters for that, but due to Microsoft's stupidity the counter names are different depending on which base language you've used for your Windows servers. I don't know what they're named for English platforms (or any other for that matter), but you should be able to view them with that thing you can pop up when pressing ctrl-alt-del (task manager or whatever it's called).

-- Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Re: [Nagios-users] disk IO for windows?
I use Disk Idle time as an indicator. Not an original idea :( I was told to mimic the monitoring built into Windows SBS.

Curtis LaMasters
http://www.curtis-lamasters.com
http://www.builtnetworks.com
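For anyone wanting to try the idle-time approach with check_nt, a sketch of the command line follows. The counter name is an assumption: it is what an English-language Windows install typically shows in Performance Monitor, and, as noted earlier in the thread, it will be named differently on other base languages. The port is also an assumption and must match whatever NSClient++ is configured to listen on.

```python
# Assumed counter name on an English-language Windows host; verify it in
# Performance Monitor before relying on it.
DISK_IDLE_COUNTER = r"\PhysicalDisk(_Total)\% Idle Time"


def check_nt_counter_argv(host, counter, port=12489,
                          plugin="/usr/local/nagios/libexec/check_nt"):
    """Build the argv for a check_nt COUNTER query (nothing is executed)."""
    return [plugin, "-H", host, "-p", str(port),
            "-v", "COUNTER", "-l", counter]


if __name__ == "__main__":
    # "winhost" is a placeholder host name.
    print(check_nt_counter_argv("winhost", DISK_IDLE_COUNTER))
```

Passing the list to subprocess.run avoids shell quoting problems with the backslashes and the percent sign in the counter name.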
Re: [Nagios-users] Error while configuring NRPE on solaris
This problem is still unresolved. I have tried everything I can think of, but to no avail. I still get an error when I run:

    /usr/local/nagios/libexec/check_nrpe -H localhost
    CHECK_NRPE: Error - Could not complete SSL handshake.

The errors I see when I run dmesg:

    svc:/network/nrpe/tcp:default (chdir: No such file or directory)
    Jun 10 10:15:53 unknown inetd[7268]: [ID 702911 daemon.error] Failed to set credentials for the inetd_start method of instance svc:/network/nrpe/tcp:default (chdir: No such file or directory)
    Jun 10 10:15:59 unknown inetd[7276]: [ID 702911 daemon.error] Failed to set credentials for the inetd_start method of instance svc:/network/nrpe/tcp:default (chdir: No such file or directory)

I am using SunOS 5.10 Generic_120012-14 i86pc i386 i86pc

Thanks, Nilesh

Luc I. Suryo l...@suryo.com
05/29/2009 10:07 PM
Please respond to Luc I. Suryo l...@suryo.com
To: Eric Pearce epea...@amberpoint.com
cc: N Patil n.pa...@lntinfotech.com, Nagios Users Mailinglist nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] Error while configuring NRPE on solaris

fyi

I have been using nagios and nrpe for 9-10 years now, on sparc and x86, starting back with Solaris 7 and now Solaris 10, with zero errors, in a mix of Solaris, AIX, HP-UX and Linux. The server has always been Solaris (sparc or x86), using inetd/xinetd/daemon mode, again with zero errors.

The one 'problem' I have seen people complain about is SSL and nrpe. Read the manual and it should be pretty clear what to do; 99.9% of the time it is the user not having done the RTFM thingy :)

The other one is tcp-wrapper and nrpe. nrpe has access control built in, so I never understood why one would need to use tcp-wrapper :)

-ls

From: N Patil
To: Eric Pearce
Cc: Nagios Users Mailinglist
Sent: Thursday, May 28, 2009 9:13 PM
Subject: Re: [Nagios-users] Error while configuring NRPE on solaris

Thanks Eric, I have followed the same article but it didn't help. This problem is something which occurred at the end, I mean while testing connectivity.

Thanks, Nilesh

    May 28 19:15:27 solaris10.remotehost.com inetd[24241]: [ID 702911 daemon.error] Failed to set credentials for the inetd_start method of instance svc:/network/nrpe/tcp:default (chdir: No such file or directory)

I'm just guessing, but do you have a home directory for the nagios user (with owner and group set to nagios)? The chdir error might come from this.

-e
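The home-directory guess is quick to verify on the affected box. A minimal sketch, assuming the NRPE service runs as a user called nagios (check the inetd/SMF service definition for the real user name):

```python
import os
import pwd


def diagnose_home(username, home, exists):
    """Describe whether the user's home directory is present."""
    if exists:
        return f"{username}: home directory {home} exists"
    return (f"{username}: home directory {home} is missing "
            "(would explain 'chdir: No such file or directory')")


def check_user(username="nagios"):
    """Look up the user's home in the passwd database and test it."""
    entry = pwd.getpwnam(username)  # raises KeyError if the user is unknown
    return diagnose_home(username, entry.pw_dir, os.path.isdir(entry.pw_dir))


if __name__ == "__main__":
    try:
        print(check_user("nagios"))
    except KeyError:
        print("no such user: nagios")
```

If the directory turns out to be missing, creating it with the ownership suggested above and re-enabling the service is the obvious next thing to try before digging further into the SSL handshake error.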