Sorry for reviving this old thread, but we ran into this problem last week as well. We have a freshly patched 3.7.1, and after we added a bunch of checks, one slave ran into trouble and all ~1300 checks generated problems. Changing max_concurrent_checks to 250 and restarting the services on the slave solved the problem. As we keep adding checks, this is bound to happen again. Are there any signs that we're approaching some limit, or is it trial and error? (Yes, we plan to set the slaves up in a cluster, but it would still be good to know when we're approaching a limit.)
So it seems this is not solved in the most current release. I've seen that the fix does what it's supposed to do: reschedule the checks at a later time. However, the system was apparently so heavily loaded that this didn't help either; it just kept rescheduling the checks, which resulted in all results becoming stale.

regards,
arthur