Hello all,

I'm trying to set up a distributed monitoring system. At the start everything looked fine to me, but now I have a problem: I'm not receiving all passive checks from the other hosts.
The machine is an Intel(R) Xeon(TM) CPU 2.40GHz system with 512 MB RAM. The load is minimal. The only strange thing I can see is the memory settings:

    nagios:/etc/nagios # cat /proc/meminfo
    MemTotal:        514264 kB
    MemFree:          30192 kB
    Buffers:          44568 kB
    Cached:          328004 kB
    SwapCached:           8 kB
    Active:          264908 kB
    Inactive:        137824 kB
    HighTotal:            0 kB
    HighFree:             0 kB
    LowTotal:        514264 kB
    LowFree:          30192 kB
    SwapTotal:      1028120 kB
    SwapFree:       1028020 kB
    Dirty:              780 kB
    Writeback:            0 kB
    Mapped:           46188 kB
    Slab:             75556 kB
    Committed_AS:    100992 kB
    PageTables:        1104 kB
    VmallocTotal:    507896 kB
    VmallocUsed:       7264 kB
    VmallocChunk:    499760 kB
    HugePages_Total:      0
    HugePages_Free:       0
    Hugepagesize:      4096 kB

The process info tells me this:

    Time Frame             Checks Completed
    <= 1 minute:           51 (16.6%)
    <= 5 minutes:          221 (71.8%)
    <= 15 minutes:         255 (82.8%)
    <= 1 hour:             260 (84.4%)
    Since program start:   261 (84.7%)

So it's receiving less than 85% of all checks :(

More passive checks will be sent to this Nagios server later. Do we need other hardware? Where do I need to look to solve this problem? The machines sending the passive check info are not too busy doing this; the checks are spread over three different servers.

One example. This is /var/log/nagios/nagios.log:

    [1135162484] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cat29-w11-backup;PING;0;PING OK - Packet loss = 0%, RTA = 0.89 ms
    [1135162491] SERVICE ALERT: cat29-w11-backup;PING;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 0.89 ms
    [1135162491] SERVICE NOTIFICATION: nagios;cat29-w11-backup;PING;OK;notify-by-epager;PING OK - Packet loss = 0%, RTA = 0.89 ms
    [1135162491] SERVICE NOTIFICATION: nagios;cat29-w11-backup;PING;OK;notify-by-email;PING OK - Packet loss = 0%, RTA = 0.89 ms
    [1135162941] Warning: The results of service 'PING' on host 'cat29-w11-backup' are stale by 32 seconds (threshold=425 seconds). I'm forcing an immediate check of the service.
    [1135162951] SERVICE ALERT: cat29-w11-backup;PING;CRITICAL;SOFT;1;CRITICAL: Service results are stale!
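To check the timing in the log excerpt above, the timestamps can be compared directly. The following is only a minimal sketch in Python, assuming the log line formats are exactly as shown in the excerpt; the function name stale_gaps and the two regular expressions are made up for this illustration and are not part of Nagios.

```python
import re

# Two lines copied from the nagios.log excerpt above.
LOG = """\
[1135162484] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cat29-w11-backup;PING;0;PING OK - Packet loss = 0%, RTA = 0.89 ms
[1135162941] Warning: The results of service 'PING' on host 'cat29-w11-backup' are stale by 32 seconds (threshold=425 seconds). I'm forcing an immediate check of the service.
"""

# Hypothetical patterns, written only against the formats seen in the excerpt.
RESULT_RE = re.compile(
    r"^\[(\d+)\] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;([^;]+);([^;]+);"
)
STALE_RE = re.compile(
    r"^\[(\d+)\] Warning: The results of service '([^']+)' on host '([^']+)' "
    r"are stale by (\d+) seconds \(threshold=(\d+) seconds\)"
)


def stale_gaps(log_text):
    """Return (host, service, gap, stale_by, threshold) for each stale warning.

    gap is the number of seconds between the last passive result logged for
    that host/service and the stale warning.
    """
    last_result = {}  # (host, service) -> timestamp of last passive result
    gaps = []
    for line in log_text.splitlines():
        m = RESULT_RE.match(line)
        if m:
            ts, host, service = m.groups()
            last_result[(host, service)] = int(ts)
            continue
        m = STALE_RE.match(line)
        if m:
            ts, service, host, stale_by, threshold = m.groups()
            key = (host, service)
            if key in last_result:
                gap = int(ts) - last_result[key]
                gaps.append((host, service, gap, int(stale_by), int(threshold)))
    return gaps


for host, service, gap, stale_by, threshold in stale_gaps(LOG):
    print(f"{host}/{service}: {gap}s since last passive result, "
          f"threshold {threshold}s + stale by {stale_by}s = {threshold + stale_by}s")
```

For the excerpt this gives a gap of 457 seconds, which is exactly the 425 second threshold plus the 32 seconds reported in the warning. So, going only by what is in the log, no passive result for this service was logged for more than seven minutes before the warning.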
It looks like it's stale again too fast? Can somebody please help me :)

Best regards,
Rob Hassing