[Nagios-users] How to route data from multiple nagios core nodes to a nagiosxi node?
I have about ten Nagios Core machines that I currently monitor collectively using MNTOS. Is there a way to feed the data from my Nagios Core machines to a Nagios XI machine, so that I can use Nagios XI features such as visualization, dashboards and reporting for the services and hosts monitored by the Nagios Core nodes?

Basically, I want to replace MNTOS with a Nagios XI machine that acts as a dashboard and reporting node for all the data I receive at each Nagios Core server: a hub and spoke setup with the Nagios XI machine at the hub and all my Nagios Core boxes as spokes.

I looked into DNX, but it looks like it distributes the checks in a different way. Any ideas?

Thanks,
Ben

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] check_dig not working as expected (plugins 1.4.15)
Hello,

Well, unless I'm very wrong, check_dns doesn't let you choose which type of record you want to query (e.g. using the -q=MX nslookup command line option).

Kind regards,
Benjamin

On 25 Apr 2011, at 03:37, Yueh-Hung Liu wrote:

use check_dns

On Sun, Apr 24, 2011 at 9:45 PM, Benjamin KRAFT b...@dotnul.com wrote:

[quoted original message snipped]
[Nagios-users] check_dig not working as expected (plugins 1.4.15)
Hello,

I just wanted to point out, in case nobody already has, that check_dig doesn't behave as it should on my CentOS machine.

I was trying to validate that my authoritative server at 80.92.90.249 resolves my website to the corresponding IP address, like this:

    [root@nagg ns1]# /usr/local/nagios/libexec/check_dig -H 80.92.90.249 -l www.dotnul.com -a 78.46.96.219 -A +tcp
    DNS OK - 0.009 seconds response time (www.dotnul.com. 3600 IN A 78.46.96.219)|time=0.009035s;;;0.00

That's OK. But when I mispasted the expected return IP address, I noticed that the check also returns OK if I give the address of the queried server as the expected result:

    [root@nagg ns1]# /usr/local/nagios/libexec/check_dig -H 80.92.90.249 -l www.dotnul.com -a 80.92.90.249 -A +tcp
    DNS OK - 0.006 seconds response time (;; SERVER: 80.92.90.249#53(80.92.90.249))|time=0.006368s;;;0.00

The output of the dig query is:

    [root@nagg ns1]# dig www.dotnul.com @80.92.90.249 +tcp

    ; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> www.dotnul.com @80.92.90.249 +tcp
    ;; global options: printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1842
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;www.dotnul.com.            IN      A

    ;; ANSWER SECTION:
    www.dotnul.com.     3600    IN      A       78.46.96.219

    ;; AUTHORITY SECTION:
    dotnul.com.         86400   IN      NS      ether.dotnul.org.
    dotnul.com.         86400   IN      NS      hypnos.dotnul.org.
    dotnul.com.         86400   IN      NS      tanathos.dotnul.org.

    ;; Query time: 1 msec
    ;; SERVER: 80.92.90.249#53(80.92.90.249)
    ;; WHEN: Sun Apr 24 13:44:12 2011
    ;; MSG SIZE rcvd: 122

Looking at nagios-plugins-1.4.15/plugins/check_dig.c, the following code seems to be involved:

    for(i = 0; i < chld_out.lines; i++) {
      /* the server is responding, we just got the host name... */
      if (strstr (chld_out.line[i], ";; ANSWER SECTION:")) {

        /* loop through the whole 'ANSWER SECTION' */
        for(; i < chld_out.lines; i++) {
          /* get the host address */
          if (verbose)
            printf ("%s\n", chld_out.line[i]);

          if (strstr (chld_out.line[i], (expected_address == NULL ? query_address : expected_address)) != NULL) {
            msg = chld_out.line[i];
            result = STATE_OK;

            /* Translate output TAB -> SPACE */
            t = msg;
            while ((t = strchr(t, '\t')) != NULL) *t = ' ';
            break;
          }
        }

        if (result == STATE_UNKNOWN) {
          msg = (char *)_("Server not found in ANSWER SECTION");
          result = STATE_WARNING;
        }

        /* we found the answer section, so break out of the loop */
        break;
      }
    }

The inner loop keeps scanning to the end of chld_out, so the expected address also matches the ";; SERVER:" line in the trailer. Shouldn't the command stop reading at the next line beginning with ";;" after finding the ANSWER section, instead of searching for a match in everything until the end of chld_out?

Kind regards,
Benjamin
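The behaviour asked for above can be sketched outside the plugin. The following is only an illustration in Python of the parsing logic, not the check_dig C code and not a tested patch; the function name find_in_answer_section and the DIG_OUTPUT sample are made up for this example, and it assumes the ANSWER SECTION ends at the first blank line or at the next line starting with ";;".

```python
# Hypothetical helper, not part of nagios-plugins: look for an expected
# address only inside the ANSWER SECTION of dig output.
def find_in_answer_section(lines, needle):
    """Return the first ANSWER SECTION record containing needle, else None."""
    in_answer = False
    for line in lines:
        if not in_answer:
            # Skip everything until the ANSWER SECTION header is seen.
            if ";; ANSWER SECTION:" in line:
                in_answer = True
            continue
        # Assumption: a blank line or the next ';;' header ends the section.
        if not line.strip() or line.startswith(";;"):
            break
        if needle in line:
            # Same cosmetic step as the plugin: translate TAB -> SPACE.
            return line.replace("\t", " ")
    return None


# Abridged sample modelled on the dig output quoted in the post.
DIG_OUTPUT = """\
;; QUESTION SECTION:
;www.dotnul.com.\t\tIN\tA

;; ANSWER SECTION:
www.dotnul.com.\t3600\tIN\tA\t78.46.96.219

;; AUTHORITY SECTION:
dotnul.com.\t86400\tIN\tNS\tether.dotnul.org.

;; Query time: 1 msec
;; SERVER: 80.92.90.249#53(80.92.90.249)
""".splitlines()

print(find_in_answer_section(DIG_OUTPUT, "78.46.96.219"))
print(find_in_answer_section(DIG_OUTPUT, "80.92.90.249"))
```

With this sample the real record is found and the address of the queried server is not, because the ";; SERVER:" trailer is never scanned. Note that this is still a plain substring match, so a prefix of the address would also match; tightening that is a separate question.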
[Nagios-users] bug in check_logfiles? existing file is reported as not found
Hi list,

I tried unsuccessfully to find a support forum for check_logfiles, so maybe someone here can help me ;-)

I have check_logfiles running on a Win2k3 machine, where it checks a file whose name is built from a date pattern (see the config snippet below). This morning Nagios gave an error saying that the logfile with today's date (i.e. 2009-12-30.log) did not exist. But it did exist, and this has been working for some weeks without problems until now.

Replacing the date pattern with the actual file name gave the same result: check_logfiles claims the file does not exist. Trying any other file in the same location (e.g. 2009-12-29.log or 2009-12-31.log) gives no problem.

Any ideas how to debug this further? Otherwise it looks like the problem will disappear tomorrow, as the file I created with tomorrow's date pattern works OK.

Thanks
Mirco

Check_logfiles.cfg:

    @searches = (
      {
        logfile => 'C:\STEP2CIFileMover\logs\$CL_DATE_$-$CL_DATE_MM$-$CL_DATE_DD$.log',
        criticalpatterns => 'ERROR',
        options => 'noprotocol,perfdata,nocase,sticky=28800'
      },
    );

Mirco Drick | Systems Administrator
Stibo Systems A/S
[Nagios-users] Really slow log2ndo
Am I the only one who is experiencing very slow performance importing archived log files into MySQL with log2ndo? I'm averaging ~1.5 log entries per second being inserted. This seems slow to me.

I've tried splitting the parsing and the MySQL service across 2 servers, as well as moving all the parsing/inserting onto a single server, remote to Nagios, which is much beefier than our actual Nagios server. My insertion speed remains fairly consistent.

I'm currently using the Unix socket option. Is this the wrong one? Would TCP be faster (doesn't seem so)? Should I process the logs to flat text files first, then push those to ndo2db?

At this rate it's going to take FOREVER! I have ~10 million archived lines to parse and insert. Maybe I'm expecting too much?

Benjamin Krein
Systems Administrator
AWeber Communications, Inc.
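To put the figures above in perspective, here is a back-of-the-envelope estimate. It is only a sketch: it assumes the ~1.5 entries per second stays constant for the whole import, and the function names are made up for this example.

```python
SECONDS_PER_DAY = 86_400


def estimated_days(total_lines, lines_per_second):
    """Days needed to insert total_lines at a constant rate."""
    return total_lines / lines_per_second / SECONDS_PER_DAY


def required_rate(total_lines, days):
    """Lines per second needed to finish within the given number of days."""
    return total_lines / (days * SECONDS_PER_DAY)


# Figures from the post: ~10 million archived lines at ~1.5 inserts/s.
days = estimated_days(10_000_000, 1.5)
rate = required_rate(10_000_000, 1)
print(f"{days:.1f} days at 1.5 lines/s")
print(f"{rate:.0f} lines/s needed to finish in one day")
```

So at the observed rate the import would run for about eleven weeks, and finishing within a day would need an insert rate roughly two orders of magnitude higher.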
Re: [Nagios-users] Nagios Latency
Hi Andreas,

Thanks for answering.

I have more or less followed the documentation on how to tune Nagios:

    service_reaper_frequency=2
    max_concurrent_checks=0
    service_interleave_factor=s
    host_inter_check_delay_method=s

The host check command is check_icmp.
We use Nagios 2.7.1.
The Nagios server is not overloaded: 4 GB RAM and a dual-core 3 GHz CPU.

But still the Nagios latency is increasing for no particular reason; see the attached graph.

Cheers
Ben

Andreas Ericsson wrote:

Benjamin Cleyet-Marrel wrote:

[quoted original message snipped]

Three things to check:

    service_reaper_frequency
    check_interleave_factor
    max_parallell_something_something

I haven't done my morning routine yet, so you'll have to figure out the real variable names in case my memory is a bit off (which it no doubt is).

Other than that: what Nagios version are you running, and how is your host check command defined?

--
Benjamin Cleyet-Marrel
Consultant
OSS: Open Systems Specialists Ltd
phone: 09-984-3000
fax: 09-984-3001
mobile: 021-721-869
email: [EMAIL PROTECTED]
web: www.oss.co.nz
post: P.O. Box 8833, Auckland

[attachment: nagios-latency.png]
[Nagios-users] Nagios Latency
Hi,

I am experiencing problems with Nagios latency that I can't understand.

Upon Nagios startup or restart, the latency is below 0.1 second, which is fine. After a couple of days, without any change, and even though the execution time and the Nagios server load remain the same, the latency slowly increases. The growth seems to be exponential. All the checks kick in later and later, without exception. Usually after a week the latency is up to 60 seconds, and after 2 weeks up to 5 minutes.

I have currently set up a Nagios restart every Monday, but I would like to find a better solution and understand why the latency keeps increasing.

We monitor over 50 hosts and have over 1400 checks every 5 minutes. The Nagios server is quite big and is not really loaded.

Any idea would be appreciated.

Cheers
Ben

--
Benjamin Cleyet-Marrel
Consultant
OSS: Open Systems Specialists Ltd
phone: 09-984-3000
fax: 09-984-3001
mobile: 021-721-869
email: [EMAIL PROTECTED]
web: www.oss.co.nz
post: P.O. Box 8833, Auckland
Re: [Nagios-users] Forcing recovery notifications to email only
For the sake of others who may care...

I fixed this by making the *-recovery contacts the default contacts which get added to the admins contactgroup. These contacts get notifications for [d|c],[u|r] for hosts/services. In addition, I only make paging contacts for those who are on pager duty and also add them to the admins contactgroup. The paging contacts only get notifications for [d|c] for hosts/services.

Ben

Benjamin Krein wrote:

[quoted original message snipped]
[Nagios-users] Forcing recovery notifications to email only
I have set up my contacts file such that each contact has 3 definitions. The entire contacts file is generated by an external script that pulls from a DB so that we can rotate pager duty. The following contact definitions are set up for each contact:

1. Default contact (may be either email-only or email + epager depending on where the person is in the rotation)

    define contact{
            use                             tpl_dev_contacts
            contact_name                    benk
            alias                           Ben Krein
            contactgroups                   admins
            service_notification_commands   notify-by-email, notify-by-epager
            host_notification_commands      host-notify-by-email, host-notify-by-epager
            host_notification_options       d
            service_notification_options    c
            email                           [EMAIL PROTECTED]
            pager                           [EMAIL PROTECTED]
    }

2. Contact for email-only notifications

    define contact{
            use                             tpl_dev_contacts
            contact_name                    benk-email
            alias                           Ben Krein (email)
            contactgroups                   admins-email
            service_notification_commands   notify-by-email
            host_notification_commands      host-notify-by-email
            host_notification_options       d
            service_notification_options    c
            email                           [EMAIL PROTECTED]
    }

3. Recovery contact (recoveries should only go to email)

    define contact{
            use                             tpl_dev_contacts
            contact_name                    benk-recovery
            alias                           Ben Krein (recovery)
            contactgroups                   admins, admins-email
            host_notification_options       r
            service_notification_options    r
            service_notification_commands   notify-by-email
            host_notification_commands      host-notify-by-email
            email                           [EMAIL PROTECTED]
    }

However, when a service or host recovers, the following is displayed in nagios.log and no notifications are sent for the recovery:

    [1185479942] Warning: Host recovery notification option for contact 'benk-recovery' doesn't make any sense - specify down and/or unreachable options as well
    [1185479942] Warning: Service recovery notification option for contact 'benk-recovery' doesn't make any sense - specify critical and/or warning options as well

How can I make Nagios only send recoveries to email?

Ben
[Nagios-users] Strange contact group behavior in 2.0b3
I'm running Nagios v2.0b3 and have a strange problem with contact groups. Here is a breakdown of the issue:

- I create 2 contacts, one called 'brutter' and another 'tward'
- I then create a contact group called 'TRMStaff' and add both users
- Any service I set the contact group of 'TRMStaff' to works flawlessly
- Any host I set the contact group of 'TRMStaff' to works flawlessly

So at this point, all is well... then the strange part:

- I create a new contact called 'lgroup'
- I then create a contact group called 'LG' and add the 'lgroup' user
- Any service I set the contact group of 'LG' to works fine AS LONG AS the HOST of that service is set to 'TRMStaff' as the contact group.

If I have a host with contact group 'TRMStaff' and a service of that host with contact group 'LG', everything works as it should. As soon as I change the contact group on the host to 'LG', that host disappears from the list of hosts in the GUI, and alerts are not generated for the host OR the services related to that host.

I have tried creating multiple groups with all sorts of different names and assigning different users, to no avail. I need to be able to have the 'LG' contact group assigned to both HOSTS and their corresponding services. I have checked the syntax up and down a hundred times and everything seems to be on point.

Is there anything illegal in the group or contact names that I've chosen? Any suggestions out there??? This is killing me...