I had a similar problem with passive checks and freshness
checking when I upgraded to 2.0. I am guessing (though no one ever responded
when I asked the list before) that they changed the logic. Before, a stale check
would trigger an active check in all cases, regardless of the check period. Now
it appears that check_period overrides that.
i.e., whereas
check_period = none
active_checks_enabled = 1
used to work, now you have to use
check_period = 24x7
active_checks_enabled = 0
Which is a little annoying, in that the GUI now puts the
"pasv" icon next to all your passive checks, and the tac CGI won't consider a
failed passive check to be an unhandled problem by default. This is a bug in my
opinion, though no one seems to agree with me (assuming silence indicates
disagreement).
Anyway, below is the config I use:
define service{
name audit-service-tmpl ; template for passive commands (like Replication Check Services)
is_volatile 1 ; notify with every failure (with freshness checks)
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
retain_status_information 1
retain_nonstatus_information 1 ;
passive_checks_enabled 1 ; these are all passive checks
active_checks_enabled 0 ; these are all passive checks
flap_detection_enabled 0 ; don't want flaps detected in these cases.
notification_interval 31536000
notification_period 24x7
notification_options w,u,c,r
check_freshness 1
register 0
}
define service{
service_description rhost_check
host_name audit
contact_groups unix-admins
check_command check-freshness-stale!"rhost audit did not report on time" ; don't forget to give script details
freshness_threshold 90000 ; 25 hours, in seconds; set this a couple of hours after the next job should complete
use audit-service-tmpl
}
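The check-freshness-stale command referenced above is not shown in this message. A minimal sketch of what such a command might look like, assuming the stock check_dummy plugin is installed in the $USER1$ plugin directory (the actual script behind this command may well differ):

```
define command{
command_name check-freshness-stale
command_line $USER1$/check_dummy 2 $ARG1$ ; assumption: check_dummy returns state 2 (CRITICAL) and echoes the text passed as $ARG1$
}
```

With a definition along these lines, a service whose passive result goes stale would be forced CRITICAL with the quoted text from check_command as its plugin output.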
From: Cott Lang [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 03, 2006 10:30 PM
To: [email protected]
Subject: [Nagios-users] 2.0 upgrade, passive checks problem
I've had passive checks working for several years in 1.x, with never a hitch.
Unfortunately, that all changed when I upgraded to 2.0. Suddenly, freshness checks never occurred. Ever.
I've re-read the docs several times, my config seems okay. My first problem seems to be that my "check_period" under 1.x was always "none", which worked fine.
If I change it to "24x7", I start getting freshness checks. However, they seem to totally ignore freshness_threshold and use the normal_check_interval. If I comment out freshness_threshold and define normal_check_interval to what I want, I seem to get random values.
i.e., a service set to 10 minutes tells me this:
[1136344804] Warning: The results of service 'x' on host y are stale by 12 seconds (threshold=821 seconds). I'm forcing an immediate check of the service.
Where'd 821 seconds come from?
Worse, I have other services with nearly identical definitions that don't indicate they are stale or that a freshness check is being scheduled, but suddenly go critical:
[1136344634] SERVICE ALERT: host;service;CRITICAL;HARD;1;CRITICAL: service success not reported
The normal_check_interval is set to 2 hours, but it seems to go critical every ~10-15 minutes.
I'm at a loss at this point, I can only "kinda" get passive checks working. It seems like I must be missing something obvious here in the 2.0 upgrade, but I'm befuddled. I've been using a template for all my passive services like this:
define service {
name passive-service
active_checks_enabled 0 ; Active service checks are disabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 1 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
max_check_attempts 1
normal_check_interval 1560 ; 26 hours
retry_check_interval 1
is_volatile 0
check_period 24x7
notification_interval 15
notification_period 24x7
notification_options w,c,r
; freshness_threshold 93600 ; 26 hours appears useless!
register 0
}
Any help would be appreciated!
thanks!
