I had a similar problem with passive checks and freshness checking when I upgraded to 2.0. I am guessing (though no one ever responded when I asked the list before) that they changed the logic. Before, a stale result would trigger an active check in all cases, regardless of the check period. Now it appears that check_period overrides that.
 
i.e., whereas
check_period = none
active_checks_enabled = 1
 
used to work, now you have to use
 
check_period = 24x7
active_checks_enabled = 0
 
Which is a little annoying, in that the GUI now puts the "pasv" icon next to all your passive checks, and the tac CGI won't consider a failed passive check to be an unhandled problem by default. This is a bug in my opinion, though no one seems to agree with me (assuming silence indicates disagreement).
 
Anyway, below is the config I use:
 
define service{
            name                           audit-service-tmpl   ; template for passive commands (like Replication Check Services)
            is_volatile                    1                    ; notify with every failure (with freshness checks)
            check_period                   24x7                
            max_check_attempts             1
            normal_check_interval          1
            retry_check_interval           1
            retain_status_information      1
            retain_nonstatus_information   1                    ;
            passive_checks_enabled         1                    ; these are all passive checks
            active_checks_enabled          0                    ; these are all passive checks
            flap_detection_enabled         0                    ; don't want flaps detected in these cases.
            notification_interval          31536000
            notification_period            24x7
            notification_options           w,u,c,r
            check_freshness                1
            register                       0
}
define service{
        service_description          rhost_check
        host_name                    audit
        contact_groups               unix-admins
        check_command                check-freshness-stale!"rhost audit did not report on time"     ; don't forget to give script details
        freshness_threshold          90000     ; 25 hours (in seconds); set this a couple of hours after the next job should complete
        use                          audit-service-tmpl
}
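
The service above refers to a check-freshness-stale command that is not shown. A minimal sketch of what such a definition could look like follows. It assumes the check_dummy plugin from the standard Nagios plugins is installed and that $USER1$ points at the plugin directory (the usual resource.cfg convention). Neither detail comes from my actual setup, so adjust to taste. Because active checks are disabled, this command should only run when the freshness check fires, so it simply returns CRITICAL with whatever text the service passes in.

define command{
        command_name    check-freshness-stale
        ; assumption: check_dummy <state> [text]; state 2 = CRITICAL
        ; $ARG1$ is left unquoted because the service definition already wraps the message in double quotes
        command_line    $USER1$/check_dummy 2 $ARG1$
}

With that in place, a stale rhost_check should go CRITICAL with the message "rhost audit did not report on time".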


From: Cott Lang [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 03, 2006 10:30 PM
To: [email protected]
Subject: [Nagios-users] 2.0 upgrade, passive checks problem

I've had passive checks working for several years in 1.x, with never a hitch.

Unfortunately, that all changed when I upgraded to 2.0.   Suddenly, freshness checks never occurred. Ever.

I've re-read the docs several times, my config seems okay. My first problem seems to be that my "check_period" under 1.x was always "none", which worked fine.

If I change it to "24x7", I start getting freshness checks. However, they seem to totally ignore freshness_threshold and use the normal_check_interval.  If I comment out freshness_threshold and define normal_check_interval to what I want, I seem to get random values.

i.e., a service set to 10 minutes tells me this:

[1136344804] Warning: The results of service 'x' on host y are stale by 12 seconds (threshold=821 seconds).  I'm forcing an immediate check of the service.

Where'd 821 seconds come from?

Worse, I have other services with nearly identical definitions that don't indicate they are stale or that a freshness check is being scheduled, but suddenly go critical:

[1136344634] SERVICE ALERT: host;service;CRITICAL;HARD;1;CRITICAL: service success not reported

The normal_check_interval is set to 2 hours, but it seems to go critical every ~10-15 minutes.

I'm at a loss at this point, I can only "kinda" get passive checks working. It seems like I must be missing something obvious here in the 2.0 upgrade, but I'm befuddled.  I've been using a template for all my passive services like this:


define service {
  name                          passive-service
  active_checks_enabled         0       ; Active service checks are disabled
  passive_checks_enabled        1       ; Passive service checks are enabled/accepted
  parallelize_check             1       ; Active service checks should be parallelized
  obsess_over_service           1       ; We should obsess over this service (if necessary)
  check_freshness               1       ; Default is to NOT check service 'freshness'
  notifications_enabled         1       ; Service notifications are enabled
  event_handler_enabled         1       ; Service event handler is enabled
  flap_detection_enabled        1       ; Flap detection is enabled
  process_perf_data             1       ; Process performance data
  retain_status_information     1       ; Retain status information across program restarts
  retain_nonstatus_information  1       ; Retain non-status information across program restarts
  max_check_attempts            1
  normal_check_interval         1560    ; 26 hours
  retry_check_interval          1
  is_volatile                   0
  check_period                  24x7
  notification_interval         15
  notification_period           24x7
  notification_options          w,c,r
  ; freshness_threshold         93600   ; 26 hours  appears useless!
  register                      0
}

Any help would be appreciated!

thanks!




