Was your Nagios interface still showing whatever service status it had from the last time the checks ran, as though that status were current?
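
One way to answer that without trusting the colours in the web interface is to look at the last_check timestamps Nagios writes to its status file. Below is a minimal Python sketch, not a tested tool. It assumes the Nagios 3 status.dat layout, where each service appears as a "servicestatus { ... }" block of key=value lines, and the function names (parse_status_blocks, stale_services) are made up for illustration.

```python
import time


def parse_status_blocks(text, block_type="servicestatus"):
    """Return one dict of key=value fields per `block_type { ... }` block.

    Assumes the Nagios 3 status.dat layout: a header line such as
    "servicestatus {", indented key=value lines, then a closing "}".
    """
    blocks = []
    current = None  # dict being filled while inside a matching block
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            if line == block_type + " {":
                current = {}
        elif line == "}":
            blocks.append(current)
            current = None
        else:
            key, sep, value = line.partition("=")
            if sep:
                current[key] = value
    return blocks


def stale_services(text, max_age_seconds, now=None):
    """List (host, service, last_check) for services not checked recently."""
    now = time.time() if now is None else now
    stale = []
    for svc in parse_status_blocks(text):
        last_check = int(svc.get("last_check", 0))
        if now - last_check > max_age_seconds:
            stale.append(
                (svc.get("host_name"), svc.get("service_description"), last_check)
            )
    return stale
```

To use it, read the file named by the status_file setting in nagios.cfg (the exact path depends on your install, so treat any path as an assumption) and call stale_services(contents, max_age_seconds=3600). With a 5 or 10 minute check interval, anything it returns has gone well past its schedule even if the interface still shows green.
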

I just joined this list over the same problem with 3.2.0. Nagios was showing all green (95 services across 12 hosts). Then, when I went to demo it to my boss, who had just gotten back from vacation, I killed a box and was surprised to see Nagios continue showing green. When I looked in detail, I saw that the checks had not actually been running in days...

I restarted Nagios a number of times to no avail. I finally echoed a force check into the nagios.cmd file for a host, and got a lot of messages about how the services seemed orphaned. I do have the check for orphans config option enabled (99% of my nagios.cfg is default).

I was finally able to get Nagios fixed by stopping it, removing retention.dat, and starting it again. But I don't really want to disable retention unless I have to...

I was thinking this might have something to do with my failover setup (though I don't see why). I have two boxes in a Heartbeat+DRBD configuration, with Nagios (and all its configuration, /var files, etc.) on the DRBD partition. It /seems/ to fail over just fine. After seeing this, I tested failing over back and forth about a dozen times and Nagios did not get hung up in the same way, so I don't understand what caused this.

Maybe we have the same problem?

On Nov 2, 2009, at 9:08 PM, Les Fenison wrote:

> I had nagios working great. Checking 6 hosts and about 85
> services. Then suddenly, all services on all hosts except one
> stopped checking. The next scheduled check is about 24 hours from
> the last check. I had been checking every 5 minutes.
>
> Restarting nagios didn't help. I am using a gui NagioSQL to edit
> my configuration files so I suspect it did something to me but I
> have no clue where to look except where I have already looked.
>
> What can cause nagios to just stop checking everything like that or
> to randomly switch to every 24 hours rather than the configured
> every 5 minutes?
>
> I am having to manually do force checks to get it to check.
>
> Here are some things I have checked...
>
> Hosts check_interval is 5, retry_interval is 1
> Services check_interval is 10, retry_interval is 2
>
> So where could Nagios be getting the idea that it is suppose to be
> every 24 hours?

--
Casey Allen Shobe
ca...@shobe.info

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null