On Mon, Feb 9, 2009 at 3:58 PM, Max <[email protected]> wrote:

> Rahul,
>
> On Mon, Feb 9, 2009 at 2:53 PM, Rahul Nabar <[email protected]> wrote:
> > Thanks Marc. I have: max_concurrent_checks=0
>
> Our experience has been that with max_concurrent_checks set to 0 and
> the inter-check delay and Nagios sleep time set very low, we get high
> reported service check latencies, because we are basically asking
> Nagios to run everything as soon as possible (1000s of checks over a
> few seconds, in essence), which it can't do. As far as 'real life'
> negative impact goes, the high latency in this particular case hasn't
> meant much. It initially really worried me, until I realized that the
> high service latency is just happening because we are basically
> telling Nagios to pause / sleep / wait for as little time as possible
> and run things as quickly as possible. We have around a 146 second
> service check latency, but from our detailed Nagios metrics we see
> that check runs are completing in right around 4 minutes, under our
> 5 minute hard ceiling (around 6000 checks). Our PNP performance graphs
> confirm our suspicions: our reporting server receives 6000 metrics in
> 4 minutes or less, and we have no gaps in our graphs or major under-
> or over-sampling problems with the data we retrieve from our remote
> agents.
>
> I only bring that up because if you not only have
> max_concurrent_checks set to 0 but have also tuned the inter-check
> delay settings and sleep time way down, you might be encountering
> the same situation, and the high latency might not be something to
> worry about ... but only IF you have all your delays tuned very low
> and no ceiling on max checks. For any other situation it is
> definitely something to investigate.



Thanks Max. That is a pretty intricate issue that I had no idea about! I'm
still trying to figure out the exact implications of what you describe.
Maybe I need to visit the Nagios manual again and re-read Nagios's scheduling
logic. It's especially important to me now that I also have PNP collecting
performance stats.

Meanwhile, here is a dump of the relevant parameters you mention. I don't
recall changing any of them from their defaults.
Maybe I ought to, in light of what you mentioned?

service_inter_check_delay_method=s
host_inter_check_delay_method=s
sleep_time=0.25

#Timeouts:
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
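
To make sure I follow the numbers you quote (around 6000 checks finishing
in about 4 minutes, against a 5 minute ceiling), here is a rough
back-of-the-envelope sketch in Python. The function name
required_check_rate is something I made up for illustration; it is not part
of Nagios, and the result is only an average rate. It says nothing about how
Nagios actually spreads checks out.

```python
# Rough arithmetic on the figures quoted above (6000 checks, ~4 min runs,
# 5 min ceiling). required_check_rate is a hypothetical helper, not a
# Nagios API; it only computes an average rate.

def required_check_rate(total_checks: int, window_seconds: float) -> float:
    """Average checks per second needed to finish within the window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return total_checks / window_seconds

checks = 6000
observed_window = 4 * 60   # check runs complete in about 4 minutes
ceiling_window = 5 * 60    # 5 minute hard ceiling

observed_rate = required_check_rate(checks, observed_window)
minimum_rate = required_check_rate(checks, ceiling_window)
headroom = ceiling_window - observed_window

print(f"observed average rate: {observed_rate:.1f} checks/s")
print(f"minimum rate to meet ceiling: {minimum_rate:.1f} checks/s")
print(f"headroom per cycle: {headroom} s")
```

If I read it right, that is an average of about 25 checks per second against
a minimum of 20 per second needed to stay under the ceiling, which would
explain why the reported latency can be high while every check still lands
inside its interval.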

-- 
Rahul