Re: [Nagios-users] NDO update delay
Hi Lei.

I'm using the MyISAM engine.

Thanks,
Rodney.

On Tue, Mar 31, 2009 at 4:59 AM, lei chen clo...@gmail.com wrote:
> The primary reason may be low performance of your MySQL server with the InnoDB engine. Two solutions:
> 1. Tune your my.cnf to improve MySQL performance.
> 2. Use MyISAM instead of InnoDB as the engine for your NDOUtils database.
>
> 2009/3/26 Rodney Ramos rodne...@gmail.com:
>> Hi everybody.
>> I've installed NDOUtils (ndoutils-1.4b7) with Nagios (nagios-3.0.6) and it's working. However, I've noticed that it's taking more than 30 minutes to update the MySQL tables. I detected the problem by querying the last_check field of the nagios_hoststatus table.
>> Can anyone help me? Is there a parameter to make the update process faster?
>> Thanks,
>> Rodney.
>
> --
> Thanks,
> Chenlei 石头++
> MSN Messenger: c...@163.com

--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] NDO update delay
Sorry, Jim, but my problem is not with Nagios startup. The problem is that NDOUtils does not update the MySQL tables quickly: it takes more than 30 minutes to make an update. As I said, I noticed this by querying the nagios_hoststatus table.

Does anybody have a similar problem? Has anybody queried this table and compared the last_check value with the Last Check value shown by Nagios?

Thanks,
Rodney.

On Thu, Mar 26, 2009 at 4:51 PM, Jim Avery j...@jimavery.me.uk wrote:
> 2009/3/26 Rodney Ramos rodne...@gmail.com:
>> Hi everybody.
>> I've installed NDOUtils (ndoutils-1.4b7) with Nagios (nagios-3.0.6) and it's working. However, I've noticed that it's taking more than 30 minutes to update the MySQL tables. I detected the problem by querying the last_check field of the nagios_hoststatus table.
>> Can anyone help me? Is there a parameter to make the update process faster?
>
> You can try reducing the number of days of data you keep and adding some indexes. The solution I think helps most, though, was contributed by Marc DeTrano in a thread here on 3rd/4th March, which I summarised thus:
>
> I simply added a file nagios.cnf under /etc/mysql/conf.d like so:
>
>     [mysqld]
>     innodb_flush_log_at_trx_commit = 2
>
> and it reduced my Nagios startup time from more than 3 minutes to only 30 seconds!
>
> I already had quite a few of the data_processing_options disabled in ndomod.cfg and had reduced all of the max_*_age parameters in ndo2db.cfg to 24 hours (before those changes my startup time was more than 5 minutes).
[Nagios-users] NDO update delay
Hi everybody.

I've installed NDOUtils (ndoutils-1.4b7) with Nagios (nagios-3.0.6) and it's working. However, I've noticed that it's taking more than 30 minutes to update the MySQL tables. I detected the problem by querying the last_check field of the nagios_hoststatus table.

Can anyone help me? Is there a parameter to make the update process faster?

Thanks,
Rodney.
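The check described in this message (reading last_check from nagios_hoststatus and seeing how far behind it is) could be sketched roughly as follows. This is only an illustration: the function name stale_hosts, the max_lag threshold, the sample rows, and the host_object_id column in the query comment are assumptions, not something given in the thread. The rows are passed in as plain tuples so the sketch does not depend on any particular MySQL driver.

```python
from datetime import datetime, timedelta

# Assumed query (column names other than last_check are an assumption):
#   SELECT host_object_id, last_check FROM nagios_hoststatus;


def stale_hosts(rows, now, max_lag=timedelta(minutes=5)):
    """Return (host_id, lag) pairs whose last_check is older than max_lag.

    rows is an iterable of (host_id, last_check) tuples, where last_check
    is a datetime as read from the database. The result is sorted with the
    most delayed host first.
    """
    lagging = [
        (host_id, now - last_check)
        for host_id, last_check in rows
        if now - last_check > max_lag
    ]
    return sorted(lagging, key=lambda item: item[1], reverse=True)


# Made-up sample rows, for illustration only.
sample_now = datetime(2009, 3, 26, 12, 0, 0)
sample_rows = [
    (1, datetime(2009, 3, 26, 11, 58, 0)),  # 2 minutes behind
    (2, datetime(2009, 3, 26, 11, 25, 0)),  # 35 minutes behind
    (3, datetime(2009, 3, 26, 11, 40, 0)),  # 20 minutes behind
]
sample_result = stale_hosts(sample_rows, sample_now)
```

The threshold should be larger than the normal host check interval, otherwise hosts that are simply waiting for their next scheduled check would be reported as delayed.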
[Nagios-users] Check attempt number do not reset
I'm noticing that, sometimes, the check attempt number does not reset to 1 after a recovery state, as we can see below:

[1236254782] HOST ALERT: sani6as05;DOWN;SOFT;1;No route to host
[1236254852] HOST ALERT: sani6as05;UP;SOFT;2;TCP OK - 0.519 second response time on port 135
[1236255172] HOST ALERT: sani6as05;DOWN;SOFT;2;CRITICAL - Socket timeout after 10 seconds
    Should be SOFT:1, not SOFT:2
[1236255242] HOST ALERT: sani6as05;DOWN;SOFT;3;Network is unreachable
[1236255312] HOST ALERT: sani6as05;DOWN;HARD;4;Network is unreachable

Is this normal behaviour or is it a Nagios problem?

Thanks in advance,
Rodney.
Re: [Nagios-users] Check attempt number do not reset
Hi, Marc.

Thank you for your answer. Do you think the check number should be reset to 1 only after a HARD UP? I don't think so. Look at these other log messages for the same host:

[1236260301] HOST ALERT: sani6as05;DOWN;SOFT;1;Network is unreachable
[1236260361] HOST ALERT: sani6as05;UP;SOFT;2;TCP OK - 0.538 second response time on port 135
[1236260681] HOST ALERT: sani6as05;DOWN;SOFT;1;Network is unreachable
[1236260751] HOST ALERT: sani6as05;DOWN;SOFT;2;Network is unreachable
[1236260821] HOST ALERT: sani6as05;DOWN;SOFT;3;Network is unreachable
[1236260891] HOST ALERT: sani6as05;DOWN;HARD;4;Network is unreachable

In this example, after the UP;SOFT;2 we had a DOWN;SOFT;1. That is my question: sometimes Nagios resets the check number, sometimes it doesn't.

Thank you again.
Rodney.

On Thu, Mar 5, 2009 at 12:22 PM, Marc Powell m...@ena.com wrote:
> On Mar 5, 2009, at 8:53 AM, Rodney Ramos wrote:
>> I'm noticing that, sometimes, the check attempt number does not reset to 1 after a recovery state, as we can see below:
>>
>> [1236254782] HOST ALERT: sani6as05;DOWN;SOFT;1;No route to host
>> [1236254852] HOST ALERT: sani6as05;UP;SOFT;2;TCP OK - 0.519 second response time on port 135
>
> ^ Wasn't in a HARD OK state yet so hadn't _really_ recovered.
>
>> [1236255172] HOST ALERT: sani6as05;DOWN;SOFT;2;CRITICAL - Socket timeout after 10 seconds
>>     Should be SOFT:1, not SOFT:2
>> [1236255242] HOST ALERT: sani6as05;DOWN;SOFT;3;Network is unreachable
>> [1236255312] HOST ALERT: sani6as05;DOWN;HARD;4;Network is unreachable
>>
>> Is this normal behaviour or is it a Nagios problem?
>
> Doesn't appear to be a bug, IMHO.
>
> --
> Marc
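To compare the two log excerpts in this thread side by side, the HOST ALERT lines can be split into their fields and the attempt number of each DOWN alert that directly follows an UP alert can be listed. The sketch below is only an illustration: the function names and the regular expression are my own, and the field layout (host;state;state type;attempt;output) is inferred from the lines quoted in the thread.

```python
import re

# Field layout inferred from the quoted lines:
#   [timestamp] HOST ALERT: host;state;state_type;attempt;plugin output
ALERT_RE = re.compile(r"\[(\d+)\] HOST ALERT: ([^;]+);([^;]+);([^;]+);(\d+);(.*)")


def parse_host_alert(line):
    """Parse one HOST ALERT log line into a dict, or return None if it does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    timestamp, host, state, state_type, attempt, output = match.groups()
    return {
        "timestamp": int(timestamp),
        "host": host,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }


def attempts_after_recovery(lines):
    """List (timestamp, attempt) for every DOWN alert that directly follows an UP alert."""
    alerts = [a for a in (parse_host_alert(line) for line in lines) if a is not None]
    return [
        (cur["timestamp"], cur["attempt"])
        for prev, cur in zip(alerts, alerts[1:])
        if prev["state"] == "UP" and cur["state"] == "DOWN"
    ]


# The two excerpts quoted in the thread.
first_log = [
    "[1236254782] HOST ALERT: sani6as05;DOWN;SOFT;1;No route to host",
    "[1236254852] HOST ALERT: sani6as05;UP;SOFT;2;TCP OK - 0.519 second response time on port 135",
    "[1236255172] HOST ALERT: sani6as05;DOWN;SOFT;2;CRITICAL - Socket timeout after 10 seconds",
    "[1236255242] HOST ALERT: sani6as05;DOWN;SOFT;3;Network is unreachable",
    "[1236255312] HOST ALERT: sani6as05;DOWN;HARD;4;Network is unreachable",
]
second_log = [
    "[1236260301] HOST ALERT: sani6as05;DOWN;SOFT;1;Network is unreachable",
    "[1236260361] HOST ALERT: sani6as05;UP;SOFT;2;TCP OK - 0.538 second response time on port 135",
    "[1236260681] HOST ALERT: sani6as05;DOWN;SOFT;1;Network is unreachable",
    "[1236260751] HOST ALERT: sani6as05;DOWN;SOFT;2;Network is unreachable",
    "[1236260821] HOST ALERT: sani6as05;DOWN;SOFT;3;Network is unreachable",
    "[1236260891] HOST ALERT: sani6as05;DOWN;HARD;4;Network is unreachable",
]
```

Calling attempts_after_recovery on each list reports attempt 2 for the first excerpt and attempt 1 for the second, which is the inconsistency being asked about. The sketch only surfaces the difference; it does not explain it.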
[Nagios-users] NRPE Problems
I'm testing Nagios NRPE and I'm finding several problems. I'm working with:

- Nagios 2.9 on Solaris 9 (latest CVS patch).
- NRPE 2.8.1 on Solaris 8 (latest CVS patch).
- gcc 3.4.6

1) When I try to start a daemon using NRPE, I receive the message "CHECK_NRPE: Socket timeout after 10 seconds.", although the service is started.

    check_nrpe -H remote_machine -n -p 5666 -c start_daemon
    CHECK_NRPE: Socket timeout after 10 seconds.

On the remote machine I have in nrpe.cfg:

    command[start_daemon]=/usr/local/bin/sudo /usr/local/nagios/bin/teste_daemon

My teste_daemon is:

    #!/usr/bin/perl
    use POSIX;

    chroot("/usr/local/nagios/bin") or die "Couldn't chroot: $!";

    $pid = fork;
    if ($pid) {
        print "OK\n";
        exit 0;
    }
    die "Couldn't fork: $!" unless defined($pid);

    POSIX::setsid() or die "Can't start a new session: $!";

    while (1) {
        sleep 10;
    }
    exit 0;

It seems that NRPE waits for an output message from teste_daemon that never comes (?!).

2) When I check a process using the check_procs plugin via NRPE, the reported number of running processes is one too high. Example:

    check_nrpe -H remote_machine -n -p 5666 -c check_daemon
    PROCS CRITICAL: 2 processes with args 'teste_daemon'

In nrpe.cfg, I have:

    command[check_daemon]=/usr/local/nagios/libexec/check_procs -c 1:1 -a teste_daemon

But when I run check_procs -c 1:1 -a teste_daemon directly on the remote machine, I get:

    PROCS OK: 1 process with args 'teste_daemon'

The workaround was to change the command to check_procs -c 1:1 -p 1 -a teste_daemon, but that isn't what I'm looking for.

3) I couldn't configure NRPE to run under inetd. It always answers with an SSL error message, even with the -n flag on both sides.

So I have found several problems with NRPE that are making my job difficult. I was intending to use Nagios to monitor more than 3500 machines, but after these problems I don't know if the other people in my group will feel comfortable using this tool.

That is a pity, because I find Nagios an excellent monitoring tool, very flexible, but I don't know if other people will take on the fight to replace our current tool with Nagios.
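On problem 1, one possible explanation (an assumption, not confirmed anywhere in this thread) is that NRPE reads the command's output through a pipe until end-of-file, and the background process created by fork keeps the inherited stdout and stderr open, so end-of-file never arrives while the daemon is alive. The Python sketch below reproduces that general pipe behaviour without NRPE: a launcher captures the output of a small script that forks a background process, once with the inherited descriptors left open ("keep") and once with them redirected to /dev/null ("detach"). The names CHILD and time_launch and the 3-second sleep are illustrative only, and the sketch needs a Unix-like system because it uses os.fork.

```python
import subprocess
import sys
import time

# Small script run as a separate process. It forks; the parent prints OK and
# exits, the child becomes a short-lived "daemon" that sleeps for 3 seconds.
CHILD = r'''
import os, sys, time
pid = os.fork()
if pid:
    print("OK")
    sys.exit(0)
os.setsid()
if sys.argv[1] == "detach":
    # Point stdin, stdout and stderr at /dev/null so the launcher's pipes
    # are no longer held open by the background process.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
time.sleep(3)
'''


def time_launch(mode):
    """Run CHILD with captured output; return (stdout text, seconds until the launcher got EOF)."""
    start = time.monotonic()
    result = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.stdout.decode().strip(), time.monotonic() - start
```

Calling time_launch("keep") returns only after the background process has exited, while time_launch("detach") returns as soon as the parent has printed OK. If the same mechanism applies here, redirecting the three standard descriptors in teste_daemon after the fork (for example by reopening them on /dev/null) would be worth trying.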
[Nagios-users] Nagios SIGHUP problem
I'm having a problem when I try to reload Nagios 2.9. When I run kill -HUP nagios_pid, it ends up with SIGEXIT. In nagios.log we have:

    Caught SIGEXIT, shutting down

I'm running Nagios 2.9 on Solaris 9, built with gcc 3.4.6. I've read that this is a problem with the char *sigs[] definition in the nagios.c file.

Does anyone know how to solve this problem?

Thanks.