We have two Nagios servers, each monitoring a different network. The production network has over 1200 service checks, and its average host check execution time is around 4 seconds:
Host Check Execution Time: 4.03 / 4.15 / 4.039 sec

The UAT network has only 120 checks. For some reason, starting yesterday we have seen a huge spike in the average Host Check Execution Time there:

Host Check Execution Time: 4.03 / 24.09 / 16.236 sec

This is causing all sorts of false alarms. I logged onto the server and ran some checks from the command line, and indeed the check_ping plugin runs very slowly. The odd thing is that a standard "ping hostname" is nice and fast. We have not changed or updated anything on this Nagios server, nor are we seeing any kind of elevated CPU usage.

Has anyone else experienced anything like this? I'm not sure where to start troubleshooting the problem.
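To put numbers on the check_ping versus plain ping difference, a small timing script along these lines might help. It is only a sketch under assumptions: the plugin path /usr/local/nagios/libexec/check_ping, the warning and critical thresholds, the packet count of 5, and the helper name time_command are all placeholders, not taken from this server's configuration.

```python
import subprocess
import sys
import time


def time_command(argv, timeout=60):
    """Run argv once and return (elapsed_seconds, return_code).

    return_code is None if the command timed out or was not found.
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        rc = proc.returncode
    except (subprocess.TimeoutExpired, FileNotFoundError):
        rc = None
    return time.perf_counter() - start, rc


if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"

    # Assumed plugin location and thresholds; adjust to the local install.
    check_ping = [
        "/usr/local/nagios/libexec/check_ping",
        "-H", host,
        "-w", "100.0,20%",
        "-c", "500.0,60%",
        "-p", "5",
    ]
    # Plain ICMP ping with the same packet count, for comparison.
    plain_ping = ["ping", "-c", "5", host]

    for label, argv in (("check_ping", check_ping), ("ping", plain_ping)):
        elapsed, rc = time_command(argv)
        print(f"{label}: {elapsed:.2f}s (exit {rc})")
```

Running it with a UAT hostname as the only argument, once on each Nagios server, would show whether the slowdown is specific to the plugin on the UAT box.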