On Mar 18, 2010, at 9:53 AM, David Dyer-Bennet wrote:

> I'm monitoring some far-away remote hosts, that we connect to via the
> public internet (well, there's an encrypted VPN involved).  I'm trying not
> to send notifications until an outage persists for a while.
> 
> In an example I looked at this morning, I see that it was repeating the
> host check every 10 seconds until it hit the retry count.

With nagios-2, when a host enters a non-OK state, nagios switches to a serial 
mode where the host check is retried until max_check_attempts is reached. The 
10-second interval comes from the check you are performing, not from any nagios 
timing setting. That's apparently about how long each check run takes to 
complete (are you issuing 10 pings, perhaps?).

> Where does that 10 seconds time come from?  The manual is remarkably vague
> about host check scheduling; about all it says is that it does them on
> demand,

With nagios-2, host checks are only run when a service on the host fails. 
Nagios will then run the host's check_command, up to max_check_attempts, to 
determine if the host is down or just the service.

> and "If the first host check returns a non-OK state, Nagios will
> keep pounding out checks of the host until either (a) the maximum number
> of host checks (specified by the max_attempts option in the host
> definition) is reached or (b) a host check results in an OK state."

Yup, the host check_command is run, one immediately after the previous, until 
max_check_attempts is reached. During this time nagios is doing *nothing* else 
besides checking this host.
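
To make the serial behaviour concrete, here is a rough Python sketch of the 
retry loop as I've described it. This is not nagios source code; the function 
name and the `check` callable are made up for illustration, and `check` stands 
in for whatever your host check_command does.

```python
from typing import Callable, Tuple


def run_host_check_serially(check: Callable[[], bool],
                            max_check_attempts: int) -> Tuple[bool, int]:
    """Retry a host check back to back, as described for nagios-2.

    `check` is a hypothetical stand-in for the host check_command and
    returns True for an OK result. Returns (host_is_up, attempts_used).
    """
    for attempt in range(1, max_check_attempts + 1):
        # Each call blocks until the check finishes; nothing else runs
        # in the meantime, which is the point being illustrated.
        if check():
            return True, attempt
    # Every attempt came back non-OK: the host is considered down.
    return False, max_check_attempts
```

The loop ends early on the first OK result; otherwise it runs all 
max_check_attempts checks, each one starting as soon as the previous returns.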

> Does this mean I have no control over the timing?  

Depends on your version of nagios. With nagios-2, yes, you have no control over 
the timing.

> Can I treat the 10 second observed delay as real (and then control total time 
> delay by
> setting max_attempts high)?

Raising max_check_attempts is one way to deal with that, but if you're using 
nagios-2, you're stopping *all* other checks and processing until 
max_check_attempts is reached. If you set it to 6, then for about the next 60 
seconds nagios is doing nothing besides checking this single host. That may 
matter to you if you have lots of other checks running.
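
The arithmetic is simple enough to sketch in Python. The function names are 
made up for illustration, and the 10 seconds per check is only what you 
observed, not a nagios setting, so treat the results as rough estimates.

```python
import math


def blocking_window_seconds(max_check_attempts: int,
                            seconds_per_check: float) -> float:
    """Worst-case time spent retrying one host (it stays down throughout)."""
    return max_check_attempts * seconds_per_check


def attempts_for_delay(desired_delay_seconds: float,
                       seconds_per_check: float) -> int:
    """Smallest max_check_attempts whose retries cover the desired delay."""
    return math.ceil(desired_delay_seconds / seconds_per_check)


# 6 attempts at roughly 10 seconds each is about a minute of blocking.
print(blocking_window_seconds(6, 10))   # 60
# Holding off notification for 5 minutes would need about 30 attempts.
print(attempts_for_delay(300, 10))      # 30
```

Keep in mind that on nagios-2 the whole window is time during which no other 
checks are processed, so a large value buys the delay at that cost.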

If you need more control over that, then I'd suggest upgrading to nagios-3. 
Host check logic was greatly improved there and is more in line with how 
service checks are done.

--
Marc


_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users