Hi,
I have a problem with service dependencies. I configured a service
dependency between two services on the same host. When the master service
is CRITICAL, UNKNOWN or WARNING, Icinga should not check the dependent
service, but it still checks the dependent service when the master is in
one of those states.
Checking the debug output, I see an EXTERNAL COMMAND:
SCHEDULE_FORCED_SVC_CHECK for the service that should not have been
checked, although I did not configure any scheduled forced check.
In icinga.cfg I have this configuration:
soft_state_dependencies=1
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
Here are the service dependency, host and service configuration, followed
by the debug output and logs:
define servicedependency {
execution_failure_criteria w,c,u
notification_failure_criteria w,c,u
dependency_period 24x7
service_description check_ping
dependent_service_description check_ssh
host_name wal_rt1
dependent_host_name wal_rt1
}
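The behaviour expected from this definition can be sketched in a few lines of Python. This is only an illustration of the expectation described above, not Icinga's actual implementation; the names `STATE_LETTERS`, `parse_criteria` and `execution_expected` are made up for this sketch.

```python
# Hypothetical helper names; this mirrors the expectation stated in this post,
# not Icinga's source code.
STATE_LETTERS = {
    "o": "OK",
    "w": "WARNING",
    "c": "CRITICAL",
    "u": "UNKNOWN",
    "p": "PENDING",
}


def parse_criteria(criteria):
    """Turn a criteria string such as 'w,c,u' into a set of state names."""
    letters = [part.strip() for part in criteria.split(",") if part.strip()]
    if letters == ["n"]:
        # 'n' (none) means the dependency never fails.
        return set()
    return {STATE_LETTERS[letter] for letter in letters}


def execution_expected(master_state, execution_failure_criteria):
    """Return True if a regular check of the dependent service is expected to run."""
    return master_state not in parse_criteria(execution_failure_criteria)


# With execution_failure_criteria w,c,u, check_ssh should only be checked
# while check_ping is OK.
for state in ("OK", "WARNING", "CRITICAL", "UNKNOWN"):
    print(state, execution_expected(state, "w,c,u"))
```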
In Icinga Web I can see this:
Service: check_ssh
Service Dependencies: check_ping on wal_rt1
define host {
host_name wal_rt1
alias Router 1
address 192.168.1.1
check_command check-host-alive
active_checks_enabled 1
passive_checks_enabled 1
notifications_enabled 1
check_period 24x7
notification_period 24x7
contact_groups NOC-MAJOR,admins
use bts-host
}
define service {
service_description check_ping
check_command check_ping!20,80%!30,90%
host_name wal_rt1
contact_groups admins,NOC-MAJOR
active_checks_enabled 1
passive_checks_enabled 1
notifications_enabled 1
check_freshness 0
use noc-service
}
define service {
service_description check_ssh
check_command check_ssh!
host_name wal_rt1
contact_groups admins,NOC-MAJOR
active_checks_enabled 1
passive_checks_enabled 1
notifications_enabled 1
check_freshness 0
use noc-service
}
DEBUG
[1321010091.644650] [016.1] [pid=580] Checking host 'wal_rt1' for flapping...
[1321010091.661693] [016.1] [pid=580] Checking service 'check_cisco_cpu' on host 'wal_rt1' for flapping...
[1321010091.661782] [016.1] [pid=580] Checking service 'check_ping' on host 'wal_rt1' for flapping...
[1321010091.661865] [016.1] [pid=580] Checking service 'check_ssh' on host 'wal_rt1' for flapping...
[1321010091.661944] [016.1] [pid=580] Checking service 'check_temperature' on host 'wal_rt1' for flapping...
[1321010132.062541] [016.0] [pid=580] Attempting to run scheduled check of service 'check_ssh' on host 'wal_rt1': check options=0, latency=0.062000
[1321010132.062593] [016.0] [pid=580] Scheduling a non-forced, active check of service 'check_ssh' on host 'wal_rt1' @ Fri Nov 11 12:16:32 2011
[1321010192.053859] [016.0] [pid=580] Attempting to run scheduled check of service 'check_ssh' on host 'wal_rt1': check options=0, latency=0.053000
[1321010192.053906] [016.0] [pid=580] Scheduling a non-forced, active check of service 'check_ssh' on host 'wal_rt1' @ Fri Nov 11 12:17:32 2011
[1321010218.176479] [024.1] [pid=580] Run a few checks before executing a host check for 'wal_rt1'.
[1321010218.176534] [016.0] [pid=580] Attempting to run scheduled check of host 'wal_rt1': check options=0, latency=0.176000
[1321010218.176549] [016.0] [pid=580] ** Running async check of host 'wal_rt1'...
[1321010218.176577] [016.0] [pid=580] Checking host 'wal_rt1'...
[1321010231.032296] [016.1] [pid=580] Handling check result for host 'wal_rt1'...
[1321010231.032305] [016.1] [pid=580] ** Handling async check result for host 'wal_rt1'...
[1321010231.032356] [016.1] [pid=580] HOST: wal_rt1, ATTEMPT=1/10, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=0, NEW STATE=0
[1321010231.032381] [016.1] [pid=580] Pre-handle_host_state() Host: wal_rt1, Attempt=1/10, Type=HARD, Final State=0
[1321010231.032389] [016.1] [pid=580] Post-handle_host_state() Host: wal_rt1, Attempt=1/10, Type=HARD, Final State=0
[1321010231.032397] [016.1] [pid=580] Checking host 'wal_rt1' for flapping...
[1321010231.032443] [016.0] [pid=580] Scheduling a non-forced, active check of host 'wal_rt1' @ Fri Nov 11 12:22:11 2011
[1321010231.032485] [016.1] [pid=580] ** Async check result for host 'wal_rt1' handled: new state=0
LOGS
[1320966000] CURRENT HOST STATE: wal_rt1;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 47.22 ms
[1320966000] CURRENT SERVICE STATE: wal_rt1;check_cisco_cpu;UNKNOWN;HARD;3;/usr/local/icinga/libexec/check_snmp: option requires an argument -- 'C'
[1320966000] CURRENT SERVICE STATE: wal_rt1;check_ping;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 0%, RTA = 47.27 ms
[1320966000] CURRENT SERVICE STATE: wal_rt1;check_ssh;CRITICAL;SOFT;2;Connection refused
[1320966000] CURRENT SERVICE STATE: wal_rt1;check_temperature;UNKNOWN;HARD;3;Usage:
[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_temperature;1321000633
[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ping;1321000633
[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_cisco_cpu;1321000633
[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ssh;1321000633
[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;wal_rt1;1321000633
[1321000641] SERVICE ALERT: wal_rt1;check_ssh;CRITICAL;HARD;3;Connection refused
[1321000687] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;wal_rt1;check_ping;0;ok|
[1321000687] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;wal_rt1;check_ssh;0;ok|
[1321000691] PASSIVE SERVICE CHECK: wal_rt1;check_ping;0;ok
[1321000691] SERVICE ALERT: wal_rt1;check_ping;OK;HARD;3;ok
[1321000691] PASSIVE SERVICE CHECK: wal_rt1;check_ssh;0;ok
[1321000691] SERVICE ALERT: wal_rt1;check_ssh;OK;HARD;3;ok
[1321000761] Warning: Service 'check_cisco_cpu' on host 'wal_rt1' has no check time period defined!
[1321000761] Warning: Service 'check_ping' on host 'wal_rt1' has no check time period defined!
[1321000761] Warning: Service 'check_ssh' on host 'wal_rt1' has no check time period defined!
[1321000761] Warning: Service 'check_temperature' on host 'wal_rt1' has no check time period defined!
[1321001542] SERVICE ALERT: wal_rt1;check_ssh;CRITICAL;SOFT;1;Connection refused
[1321001542] SERVICE ALERT: wal_rt1;check_ping;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 0%, RTA = 45.50 ms
[1321001602] SERVICE ALERT: wal_rt1;check_ssh;CRITICAL;SOFT;2;Connection refused
[1321001602] SERVICE ALERT: wal_rt1;check_ping;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 0%, RTA = 47.12 ms
[1321001662] SERVICE ALERT: wal_rt1;check_ssh;CRITICAL;HARD;3;Connection refused
[1321001662] SERVICE FLAPPING ALERT: wal_rt1;check_ssh;STARTED; Service appears to have started flapping (24.2% change >= 20.0% threshold)
[1321001662] SERVICE ALERT: wal_rt1;check_ping;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 0%, RTA = 48.17 ms
[1321001721] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_cisco_cpu;1321001719
[1321001721] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ssh;1321001719
[1321001721] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ping;1321001719
[1321001722] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_temperature;1321001719
[1321001722] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;wal_rt1;1321001719
[1321001852] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ssh;1321001851
[1321001852] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ping;1321001851
[1321001852] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_cisco_cpu;1321001851
[1321001852] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_temperature;1321001851
[1321001853] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;wal_rt1;1321001851
[1321001868] Warning: Service 'check_cisco_cpu' on host 'wal_rt1' has no check time period defined!
[1321001868] Warning: Service 'check_ping' on host 'wal_rt1' has no check time period defined!
[1321001868] Warning: Service 'check_ssh' on host 'wal_rt1' has no check time period defined!
[1321001868] Warning: Service 'check_temperature' on host 'wal_rt1' has no check time period defined!
[1321001868] SERVICE FLAPPING ALERT: wal_rt1;check_ssh;STARTED; Service appears to have started flapping (23.2% change >= 20.0% threshold)
[1321001913] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;wal_rt1;check_ping;0;OK|OK
[1321001913] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;wal_rt1;check_ssh;0;OK|OK
[1321001918] PASSIVE SERVICE CHECK: wal_rt1;check_ping;0;OK
[1321001918] SERVICE ALERT: wal_rt1;check_ping;OK;HARD;3;OK
[1321001918] PASSIVE SERVICE CHECK: wal_rt1;check_ssh;0;OK
[1321001918] SERVICE ALERT: wal_rt1;check_ssh;OK;HARD;3;OK
[1321002758] SERVICE ALERT: wal_rt1;check_ssh;CRITICAL;SOFT;1;Connection refused
[1321002758] SERVICE ALERT: wal_rt1;check_ping;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 0%, RTA = 62.12 ms
[1321002818] SERVICE ALERT: wal_rt1;check_ping;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 0%, RTA = 46.04 ms
[1321002878] SERVICE ALERT: wal_rt1;check_ping;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 0%, RTA = 47.47 ms
[1321002878] SERVICE FLAPPING ALERT: wal_rt1;check_ping;STARTED; Service appears to have started flapping (23.7% change >= 20.0% threshold)
[1321010091] Warning: Service 'check_cisco_cpu' on host 'wal_rt1' has no check time period defined!
[1321010091] Warning: Service 'check_ping' on host 'wal_rt1' has no check time period defined!
[1321010091] Warning: Service 'check_ssh' on host 'wal_rt1' has no check time period defined!
[1321010091] Warning: Service 'check_temperature' on host 'wal_rt1' has no check time period defined!
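To see at a glance which services received forced checks, the SCHEDULE_FORCED_SVC_CHECK entries can be pulled out of the log with a small script. The following is a minimal Python sketch, assuming each log entry sits on a single line as it does in the log file itself; the names `FORCED_SVC_RE` and `forced_service_checks` are made up for this example, and the sample lines are copied from the log in this post.

```python
import re
from collections import Counter

# Matches entries such as:
# [1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ssh;1321000633
FORCED_SVC_RE = re.compile(
    r"^\[(\d+)\] EXTERNAL COMMAND: "
    r"SCHEDULE_FORCED_SVC_CHECK;([^;]+);([^;]+);(\d+)\s*$"
)


def forced_service_checks(lines):
    """Return (logged_at, host, service, scheduled_for) for each forced service check."""
    hits = []
    for line in lines:
        match = FORCED_SVC_RE.match(line)
        if match:
            logged_at, host, service, scheduled_for = match.groups()
            hits.append((int(logged_at), host, service, int(scheduled_for)))
    return hits


sample_log = [
    "[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ping;1321000633",
    "[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;wal_rt1;check_ssh;1321000633",
    "[1321000635] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;wal_rt1;1321000633",
    "[1321000641] SERVICE ALERT: wal_rt1;check_ssh;CRITICAL;HARD;3;Connection refused",
]

hits = forced_service_checks(sample_log)

# Count forced checks per (host, service) pair.
per_service = Counter((host, service) for _, host, service, _ in hits)
for (host, service), count in sorted(per_service.items()):
    print(f"{host};{service}: {count} forced check(s)")
```

To run it against a real log, pass an open file object instead of `sample_log`; the log path depends on the installation.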
Please help me with this problem.
Thanks and Regards,
Daniel