I've noticed we get this problem when more than one or two hosts are down. Because Nagios (we use 1.2) runs host checks first, and sequentially, a host check that times out can hold up everything else (we have >3000 checks to run every 5 minutes).
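To put rough numbers on the point above, here is a back-of-the-envelope sketch in Python. It is not how Nagios actually schedules anything; it only assumes that each down host blocks the scheduler for one full timeout per attempt. The 3000 checks per 5 minutes comes from the message above, while the 10-second timeout and the single attempt are made-up illustration values, and the function names are hypothetical.

```python
def blocked_seconds(hosts_down, host_check_timeout_s, attempts=1):
    """Seconds the scheduler is stalled if every down host is checked
    one at a time and each attempt runs to its full timeout.
    (Assumption: simplified model, not Nagios's real scheduler.)"""
    return hosts_down * host_check_timeout_s * attempts


def checks_falling_behind(blocked_s, total_checks=3000, window_s=300):
    """Number of scheduled checks that come due while the scheduler is
    stalled, given total_checks spread evenly over window_s seconds."""
    checks_per_second = total_checks / window_s
    return int(blocked_s * checks_per_second)


# Illustration only: 3 hosts down, assumed 10 s timeout, 1 attempt each.
stall = blocked_seconds(hosts_down=3, host_check_timeout_s=10)
backlog = checks_falling_behind(stall)
print(f"stalled {stall} s, about {backlog} checks fall behind")
```

Under these assumptions, even a handful of down hosts puts a few hundred checks behind schedule per pass, which would be consistent with latency climbing rather than settling.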
I have no hosts down 95% of the time, including right now, though I can see how that would be an issue. I have turned off all logging, state retention, and performance data handling, and backed all timing parameters off to their defaults (or to even less aggressive values). In a separate test, I changed only command_check_interval, from -1 (check as often as possible) to 10 seconds. Neither change has had any apparent effect. At this point, the 2 main servers I am looking at have been running for 30 minutes, and latencies are up to 540 seconds on the "bad" one and 48 seconds on the other. My next step will be to recompile with the latest Nagios and try that. If that doesn't show an improvement, I'll try without perlcache. Lastly, I'll try without the embedded Perl interpreter at all.
