Sorry for reviving this old thread, but we ran into this problem last
week as well. We have a freshly patched 3.7.1, and after adding a
bunch of checks, one slave ran into trouble and all ~1300 checks reported
problems. Changing max_concurrent_checks to 250 and restarting the
services on the slave solved the problem. As we keep adding more
checks, this is bound to happen again. Are there any signs that we're
approaching some limit, or is it trial and error? (Yes, we plan to
set up the slaves as a cluster, but it would still be good to
know when we're approaching a limit.)
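
One rough way to reason about such a limit is a back-of-the-envelope
estimate using Little's law: the average number of checks in flight is the
check arrival rate times the average check runtime. The sketch below is
only an illustration under assumed figures (a 5-minute interval and a
10-second average runtime are hypothetical, not measured on our slave),
and the function names are made up for this example; it is not how Opsview
or Nagios computes anything.

```python
import math


def estimated_concurrency(num_checks, interval_s, avg_exec_s):
    """Little's law: average checks in flight = arrival rate * average runtime."""
    return num_checks / interval_s * avg_exec_s


def headroom(num_checks, interval_s, avg_exec_s, max_concurrent):
    """Slots left under max_concurrent_checks; negative means the limit is too low."""
    needed = math.ceil(estimated_concurrency(num_checks, interval_s, avg_exec_s))
    return max_concurrent - needed


if __name__ == "__main__":
    # Hypothetical figures: 1300 checks, 300 s interval, 10 s average runtime.
    needed = estimated_concurrency(1300, 300, 10)
    print(f"about {needed:.1f} checks in flight on average")
    print("headroom with a limit of 250:", headroom(1300, 300, 10, 250))
```

This is an average only; checks that bunch up or hit timeouts can push the
peak well above it, so a shrinking headroom is a warning sign rather than a
hard threshold.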

So it seems this is not solved in the most current release. I've
seen that the fix does what it's supposed to do: reschedule the checks
at a later time. However, the system seems to have been so heavily
loaded that this didn't help either; it just kept rescheduling the
checks, which caused all results to go stale.

regards,
arthur
_______________________________________________
Opsview-users mailing list
Opsview-users@lists.opsview.org
http://lists.opsview.org/lists/listinfo/opsview-users
