Re: [opsview-users] Max concurrent service checks have been reached

Ton Voon Tue, 25 May 2010 01:24:34 -0700


On 21 May 2010, at 18:03, Andrew Hall wrote:

We keep seeing a lot of stale data in the master GUI for services
checked by a remote slave.

When I check the nagios log on this slave I see lots of these...

"Max concurrent service checks (50) has been reached.  Delaying
further checks until previous checks are complete..."

I've been through the hosts and can't spot anything which seems amiss.

It's monitoring 47 hosts and 510 services, and none of those service
checks has a check interval with a frequency below 5 minutes.

Can anyone advise how I could begin to troubleshoot this ?

This was a bug in Nagios which we saw around Opsview 3.1. The fix has
been pushed back upstream into Nagios already - Nagios 3.2.0 from
memory.

It looks like we've applied the patch to the 3.0 branch, but as we're
not maintaining that anymore, that's not likely to get released. To be
honest, we're moving at a cracking pace to keep adding features that
people want to see in Opsview and we can't afford to maintain older
versions. However, if you take out a subscription with us, then we can
make some exceptions :)

Or - if the box isn't particularly overloaded - how I could increase
this value ?

The specific bug is that when max_concurrent_checks is reached, Nagios
just schedules everything again at the same time at the next check
interval. Our fix to Nagios was to push the check a random number of
seconds ahead, to get more spread in the timings. As your Nagios
doesn't have this extra logic, you can just disable the
max_concurrent_checks limit by setting it to 0.


http://docs.opsview.org/doku.php?id=opsview-community:configuration_files#overrides
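
To illustrate why the random offset helps, here is a minimal Python
sketch of the two rescheduling strategies described above. It is not
the actual Nagios patch: the function names, the 60-second jitter
window and the count of 460 deferred checks are assumptions chosen
purely for illustration (only the 5-minute interval comes from this
thread).

```python
import random
from collections import Counter

CHECK_INTERVAL = 300  # seconds; the 5-minute interval mentioned in the thread
MAX_JITTER = 60       # hypothetical spread window, not a real Nagios setting


def reschedule_same_time(now, deferred):
    """Buggy behaviour: every deferred check lands on the same second."""
    return [now + CHECK_INTERVAL for _ in range(deferred)]


def reschedule_with_jitter(now, deferred, rng):
    """Sketch of the fix: push each check a random number of seconds ahead."""
    return [now + CHECK_INTERVAL + rng.randint(0, MAX_JITTER)
            for _ in range(deferred)]


def peak_per_second(times):
    """Largest number of checks that fall due in any single second."""
    return max(Counter(times).values())


rng = random.Random(42)   # seeded so repeated runs give the same spread
deferred = 460            # hypothetical number of checks delayed at once

same = reschedule_same_time(0, deferred)
spread = reschedule_with_jitter(0, deferred, rng)

print("peak without jitter:", peak_per_second(same))
print("peak with jitter:   ", peak_per_second(spread))
```

Without the offset, every delayed check falls due in the same second
again, so the limit is hit again at the next interval. With the offset
the same checks are spread over the jitter window, which lowers the
peak.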

Ton

_______________________________________________
Opsview-users mailing list
Opsview-users@lists.opsview.org
http://lists.opsview.org/lists/listinfo/opsview-users
