> -----Original Message-----
> From: Jason Qualkenbush [mailto:[EMAIL PROTECTED]
> Sent: 25 April 2007 11:55
>
> Wheeler, JF (Jonathan) wrote:
> > As I have reported in the past I have 2 slave servers and a master
> > server; all checks should be run from the slave servers and passed back
> > to the master server. I have been recently trying to understand why
> > the master server still has kernel "Out of memory" problems such that
> > the kernel starts killing active processes and, in some cases, panics
> > because there are no more processes to kill (this happens perhaps once
> > or twice per week usually around 4:50 - 5:10 in the morning). As part
> > of my investigations I have noticed that for a typical host 40% of tests
> > are reported from the slave and 60% are run by the master. I can tell
> > this because 40% of messages for this typical host in /var/log/nagios on
> > the master server begin "EXTERNAL_COMMAND" and 60% of messages begin
> > "Warning:". My question is why this should be? Here is a copy of
> > nagios.log from the master server for one test of one host for today (so
> > far):
>
> Sounds like this has to do more with the freshness of the passive
> check. If the master server thinks the check isn't fresh, it will then
> run an active check to see for itself. I'd tune in the freshness, and
> keep in mind the scheduling of the checks. If you configure your
> freshness to expire at five minutes, and the slave server schedules that
> check for once every six minutes, you are going to get behaviour like you
> mentioned.

Thanks for your reply. However, the tests are scheduled to run every 30 minutes on both master and slave servers (confirmed by checking the retention.dat file). If you look at the original message you will see that the master server is correctly running the command through freshness checking ("Warning" messages) every 30 minutes, but the slave results ("EXTERNAL" messages) arrive at longer intervals, though roughly at some multiple of 30 minutes.

What are the possibilities for results from commands issued by the slave getting lost? Why are OK results not recorded in the slave server logs?

Jonathan Wheeler
e-Science Centre
Rutherford Appleton Laboratory

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
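
To put numbers on the intervals described above, a small script can tally the two kinds of message for one host and show the gaps between them. This is only a rough sketch: it assumes each line of nagios.log has the shape `[<epoch seconds>] <message>`, it matches the host by plain substring, and the function names (`classify`, `tally`, `gaps`) are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each log line looks like "[<epoch seconds>] <message>".
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def classify(message):
    """Return 'passive' for results submitted as external commands,
    'stale' for freshness warnings, or None for anything else."""
    if re.match(r"EXTERNAL[ _]COMMAND", message):
        return "passive"
    if message.startswith("Warning:"):
        return "stale"
    return None


def tally(lines, host):
    """Count passive results and freshness warnings that mention `host`.

    Returns (counts, times): counts is a Counter keyed by kind, times maps
    each kind to the list of timestamps at which it was logged.
    """
    counts = Counter()
    times = {"passive": [], "stale": []}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed shape
        timestamp, message = int(match.group(1)), match.group(2)
        if host not in message:  # crude substring match on the host name
            continue
        kind = classify(message)
        if kind is None:
            continue
        counts[kind] += 1
        times[kind].append(timestamp)
    return counts, times


def gaps(timestamps):
    """Seconds between consecutive timestamps, in log order."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
```

Running `tally(open("nagios.log"), "somehost")` on the master and then `gaps(times["passive"])` should show whether the slave results really arrive at multiples of 30 minutes, which would suggest that individual submissions are being dropped rather than delayed.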