[Nagios-users] Nagios stopped working : process running, no errors, but nothing in the web server
Hi,

I've been running Nagios for months without any problem. Today, I tried to restart it, as I do each time I change the config files. But it does not want to work anymore!

- The web server always displays "Error: Could not read host and service status information", the same as when the Nagios process is not running.
- The Nagios process is running: ps -A | grep nagios shows a process (and only one).
- It responds correctly to the start and stop commands (/etc/rc.d/nagios start|stop|status|restart on OpenSuse).
- Restarting the server did not change anything.
- Of course, there are no errors in my config. The config check is successful, and there's nothing special in /var/log/nagios/config.err.

Moreover, nagios.log does not display any error. Here's what the latest entries look like:

  Informational Message[03-17-2009 12:05:30] Finished daemonizing... (New PID=2874)
  Informational Message[03-17-2009 12:05:29] Event broker module '/usr/lib/nagios/brokers/ndomod.o' initialized successfully.
  Informational Message[03-17-2009 12:05:29] ndomod: Successfully connected to data sink. 0 queued items to flush.
  Informational Message[03-17-2009 12:05:29] ndomod: NDOMOD 1.4b7 (10-31-2007) Copyright (c) 2005-2007 Ethan Galstad (nag...@nagios.org)
  Informational Message[03-17-2009 12:05:29] LOG VERSION: 2.0
  Informational Message[03-17-2009 12:05:29] Local time is Tue Mar 17 12:05:29 CET 2009
  Program Start[03-17-2009 12:05:29] Nagios 3.0.1 starting... (PID=2815)

I tried some Google searches and browsed the list archives, but most of these errors are caused by the Nagios process not starting due to a configuration mistake. That's not the case for me, because the process starts, and it does not complain about configuration errors...

Any idea? Are there any other log or debug files where I could have a look to see what's going wrong?

Thank you in advance.
Kind regards -- *Toussaint OTTAVI* *MEDI INFORMATIQUE* -- Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are powering Web 2.0 with engaging, cross-platform capabilities. Quickly and easily build your RIAs with Flex Builder, the Eclipse(TM)based development software that enables intelligent coding and step-through debugging. Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com___ Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Nagios stopped working : process running, no errors, but nothing in the web server
2009/3/17 Toussaint OTTAVI t.ott...@medi.fr:
> Any idea? Are there any other log or debug files where I could have a look to see what's going wrong?

I guess it could be there's a problem with your MySQL database.
I'd check the MySQL logs and maybe try disabling the event broker temporarily (event_broker_options and broker_module directives in your nagios.cfg) to make sure Nagios will work okay without ndomod - if it does then your problem lies somewhere in the ndomod config or MySQL database.

hth,
Jim
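The temporary change suggested above can be sketched as a small text transformation. This is only an illustration: it assumes that setting event_broker_options to 0 and commenting out every broker_module line is enough to start Nagios without ndomod. The helper name disable_event_broker and the config_file path in the sample are hypothetical, not taken from this thread. Work on a backup copy of nagios.cfg before trying anything like this.

```python
import re


def disable_event_broker(cfg_text: str) -> str:
    """Return nagios.cfg text with the event broker switched off.

    Active broker_module lines are commented out, and
    event_broker_options is set to 0 (appended if it was missing).
    Lines that are already comments are left untouched.
    """
    out = []
    saw_options = False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if re.match(r"broker_module\s*=", stripped):
            out.append("#" + line)
        elif re.match(r"event_broker_options\s*=", stripped):
            out.append("event_broker_options=0")
            saw_options = True
        else:
            out.append(line)
    if not saw_options:
        out.append("event_broker_options=0")
    return "\n".join(out) + "\n"


# Sample input; the config_file path is an assumption made for illustration.
sample = """\
log_file=/var/log/nagios/nagios.log
event_broker_options=-1
broker_module=/usr/lib/nagios/brokers/ndomod.o config_file=/etc/nagios/ndomod.cfg
"""

print(disable_event_broker(sample))
```

If Nagios comes up promptly with the rewritten file, the delay most likely sits in ndomod or the database behind it. Restore the original lines to re-enable NDO.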
Re: [Nagios-users] Nagios stopped working : process running, no errors, but nothing in the web server
Marc Powell wrote:
> On Mar 17, 2009, at 6:49 AM, Toussaint OTTAVI wrote:
>> I've been running Nagios for months, without any problem. Today, I tried to restart it, as I do each time I do a change in the config files. But it does not want to work anymore !
>
> Instinct tells me it's NDO updating the database. Do you see the mysql process busy during this time? If your Nagios database is large it may take a few minutes for Nagios to get going. Either wait and see, or disable NDO and see if that helps.

There are approx. 70 hosts and 500 services. It usually takes 2-3 minutes to start. But this morning, it still hadn't started after 10-15 minutes... Then I went to lunch, and one hour later, it was up and working! So you were right, it's just a delay problem...

As far as I remember, I installed MySQL and NDOUtils to test the NagVis extension. But it was just a test, and I don't use it. So maybe I could disable NDO completely. Are there any good reasons for me to use NDO (other than writing my own software to query the MySQL database directly)?

Kind regards
--
*Toussaint OTTAVI*
*MEDI INFORMATIQUE*
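One way to tell "still starting" from "broken" during such a delay is to watch the status file that the web interface reads. The sketch below assumes the CGI error appears because that file is missing or has not been written yet. The path /var/lib/nagios/status.dat is only a guess (the real location is whatever the status_file directive in nagios.cfg says), and the function names and the 120-second threshold are hypothetical.

```python
import os
import time


def status_file_age(path, now=None):
    """Return the age of the file in seconds, or None if it cannot be read."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return None
    if now is None:
        now = time.time()
    return now - mtime


def describe(path, max_age=120.0):
    """Classify the status file as 'missing', 'fresh' or 'stale'."""
    age = status_file_age(path)
    if age is None:
        return "missing"
    return "fresh" if age <= max_age else "stale"


# Assumed path; check the status_file directive in your own nagios.cfg.
print(describe("/var/lib/nagios/status.dat"))
```

A result of "missing" or "stale" while the process is running and MySQL is busy would be consistent with Nagios still waiting on NDO rather than with a configuration error.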
Re: [Nagios-users] Nagios stopped working : process running, no errors, but nothing in the web server
On Mar 17, 2009, at 8:31 AM, Toussaint OTTAVI wrote:
> Are there any good reasons for me to use NDO (Other than writing my own software to query MySQL databases directly) ?

No, unless you find other third party software that uses it.

--
Marc
Re: [Nagios-users] Nagios stopped working : process running, no errors, but nothing in the web server
2009/3/17 Toussaint OTTAVI t.ott...@medi.fr:
> So, maybe I could disable NDO completely. Are there any good reasons for me to use NDO (Other than writing my own software to query MySQL databases directly) ?

I only use NDO so that I can run NagVis (and jolly good it is too!). If you don't need to query MySQL for anything, then no, I wouldn't bother running NDO.

Cheers,
Jim
Re: [Nagios-users] Nagios stopped working : process running, no errors, but nothing in the web server
Jim Avery wrote:
> I only use NDO so that I can run NagVis (and jolly good it is too!). If you don't need to query MySQL for anything, then no I wouldn't bother running NDO.

NagVis is not a priority here. No time to play with nice graphics :-)

On the other hand, having all the Nagios information in a MySQL database suggests a simple way to solve some other problems for which I haven't found a solution yet (for example, filtering out wrong service check results when a host is unreachable due to a WAN failure).

So I'll investigate further how NDO and MySQL work... And maybe I'll treat my Nagios to a new server with decent performance (to replace the good old 8-year-old HP ProLiant...)

Thank you Marc for the answers.

Kind regards
--
*Toussaint OTTAVI*
*MEDI INFORMATIQUE*
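The filtering idea mentioned above can be illustrated without any database at all. The sketch below is not based on the NDO schema: the dictionaries, state names as plain strings, host names and the function name are all made up for illustration. It only shows the logic of ignoring service problems on hosts that are not UP.

```python
def filter_service_problems(hosts, services):
    """Return service problems only for hosts that are currently UP.

    hosts:    dict mapping host name -> "UP", "DOWN" or "UNREACHABLE"
    services: list of dicts with "host", "name" and "state" keys
    """
    up_hosts = {name for name, state in hosts.items() if state == "UP"}
    return [
        svc for svc in services
        if svc["state"] != "OK" and svc["host"] in up_hosts
    ]


# Hypothetical data: "branch-router" is behind a failed WAN link.
hosts = {"hq-server": "UP", "branch-router": "UNREACHABLE"}
services = [
    {"host": "hq-server", "name": "HTTP", "state": "CRITICAL"},
    {"host": "hq-server", "name": "SSH", "state": "OK"},
    {"host": "branch-router", "name": "PING", "state": "CRITICAL"},
]

for svc in filter_service_problems(hosts, services):
    print(svc["host"], svc["name"], svc["state"])
```

With real NDO data, the two inputs would come from queries against the host and service status tables, whose names and columns should be checked against the installed NDOUtils version.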
Re: [Nagios-users] Nagios stopped working : process running, no errors, but nothing in the web server
2009/3/17 Toussaint OTTAVI t.ott...@medi.fr:
> Then, I'll investigate further how NDO and MySQL work... And, maybe, I'll offer a new server with decent performance to my Nagios (in replacement of the good old hp Proliant of 8 years old...)

Marc DeTrano from Gridshield recently contributed an excellent solution to the slow startup problem, which I summarised like so:

I simply added a file nagios.cnf under /etc/mysql/conf.d like so:

  [mysqld]
  innodb_flush_log_at_trx_commit = 2

and it reduced my Nagios startup time from more than 3 minutes to only 30 seconds! Note that this change can mean you might lose a second or two of data in the event of a database crash, but for me that's no problem.

A few other solutions have been discussed in this email list recently, including adding indexes and limiting the amount of data held. It would probably benefit you to do a search for MySQL in the email list archives.

Cheers,
Jim