Hi list,

I recently ran into problems with service scheduling in my Nagios installation and I think I need some help. I'm running Nagios 2.5 with roughly 300 hosts and 2800 services on a 4-CPU Xeon machine with 2 GB RAM. Services are actively monitored.
The thing is: I changed the configuration last week to better fit our needs. That included a lot of renaming of services, contacts, contactgroups and escalations. After I finished, I restarted the Nagios daemon yesterday morning at about 9am. Result: the process doesn't start monitoring. I looked into the scheduling queue, and it told me monitoring would start at 5pm.

Over the day, I tried to analyse the problem. I reviewed the config, although Nagios verifies it as valid, and found nothing. Restarting the process has no effect; the scheduling queue doesn't change. I tried the old config (praise svn!), and the process starts as usual, generating a new scheduling queue and beginning the monitoring. Since the only file that influences the scheduling queue is the main config file, and although I had not changed it, I copied it again from the old config to the new one. That had no effect, but at least it suggests the error lives in the host/service/escalation area of my config.

When I restarted the Nagios daemon at 4:50pm and waited until 5pm, it began monitoring as expected. That worked until this morning, when at around 9am (again!) the scheduling queue showed the 5pm thing again.

I recompiled Nagios with DEBUG1-3 to get some more information. After validating the config, it shows the following:

[...]
Completed service verification checks
Completed host verification checks
Completed hostgroup verification checks
Completed servicegroup verification checks
Completed contact verification checks
Completed contact group verification checks
Completed service escalation checks
Completed service dependency checks
Completed host escalation checks
Completed host dependency checks
Completed command checks
Completed command checks
Completed extended host info checks
Completed extended service info checks
Completed circular path checks
Completed circular host and service dependency checks
Completed global event handler command checks
Completed obsessive compulsive processor command checks
$0: Cannot enter daemon mode with DEBUG option(s) enabled. We'll run as a foreground process instead...
COMMAND FILE THREAD: 1077427120
Preferred Time: 1156839911 --> Tue Aug 29 10:25:11 2006
Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
Preferred Time: 1156839911 --> Tue Aug 29 10:25:11 2006
Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
[...]
Host 'AP001' should not be scheduled
Host 'AP002' should not be scheduled
Host 'AP003' should not be scheduled
Host 'AP004' should not be scheduled
Host 'AP005' should not be scheduled
[...]
Total scheduled services: 2837
Service Interleave factor: 1
Total service interleave blocks: 2837
Service inter-check delay: 1.0
Current Interleave Block: 0
Service 'Network: Ping' on host 'AP001'
CIB: 0, IBI: 1, TIB: 2837, SIF: 1
Mult factor: 2837
Preferred Check Time: 1156842748 --> Tue Aug 29 11:12:28 2006
Preferred Time: 1156842748 --> Tue Aug 29 11:12:28 2006
Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
Actual Check Time: 1156863600 --> Tue Aug 29 17:00:00 2006
[...]

As you can see, I also changed the service interleaving from smart to dumb, with an interleave factor of 1, to circumvent the scheduling logic. In vain, I guess... :-(

Now, for my questions:

Has anyone seen such behaviour already?
Where does that "Next Valid Time" in the debug output come from? How is it generated?

Is there any tool besides the daemon itself to validate the config files?

Thanks for reading all the way down here, and please excuse any language errors.

Regards,
Marcus Fleige

--
EOF

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
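
For anyone who wants to cross-check the numbers in the debug output above, here is a minimal Python sketch. It is not taken from the Nagios source. The formula in `mult_factor` is an assumption that happens to reproduce the logged "Mult factor: 2837" from the CIB, IBI and TIB values, and the UTC+2 offset is an assumption that makes the epoch values match the printed local times. The function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the debug output in the post.
START = 1156839911          # first "Preferred Time" (10:25:11)
NEXT_VALID = 1156863600     # "Next Valid Time" (17:00:00)
INTER_CHECK_DELAY = 1.0     # "Service inter-check delay"
CIB, IBI, TIB = 0, 1, 2837  # from the 'Network: Ping' on 'AP001' entry

# Assumption: the log prints local time at UTC+2.
LOCAL = timezone(timedelta(hours=2))


def mult_factor(cib, ibi, tib):
    """Assumed formula; it reproduces the logged 'Mult factor: 2837'."""
    return cib + ibi * tib


def preferred_check_time(start, factor, delay):
    """Start time plus one inter-check delay per scheduling slot."""
    return int(start + factor * delay)


def fmt(ts):
    """Format an epoch value the way the debug output prints it."""
    return datetime.fromtimestamp(ts, LOCAL).strftime("%a %b %d %H:%M:%S %Y")


factor = mult_factor(CIB, IBI, TIB)
preferred = preferred_check_time(START, factor, INTER_CHECK_DELAY)
gap = timedelta(seconds=NEXT_VALID - preferred)

print("Mult factor:         ", factor)
print("Preferred Check Time:", preferred, "-->", fmt(preferred))
print("Next Valid Time:     ", NEXT_VALID, "-->", fmt(NEXT_VALID))
print("Pushed back by:      ", gap)
```

Under these assumptions the preferred check time itself looks sane (start time plus 2837 one-second slots); only the step from "Preferred Time" to "Next Valid Time" moves the check to 17:00:00.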