Hi everybody,

I'm looking at an issue with notifications and I'm unsure whether this
is working as designed or not.

I'm getting a service notification when a service that has been in
WARNING;HARD for a long time changes to CRITICAL;HARD after a single
check because of a check timeout.
Three seconds later, the host check returns DOWN;SOFT, but only once,
so the host never reaches DOWN;HARD.

I thought that if the host is down, no service notifications would be
sent.  http://docs.icinga.org/latest/en/checkscheduling.html#hostcheckscheduling
actually states that "when Icinga is check [sic!] the status of a
host, it holds off on doing anything else"  - so I would expect it to
also not send the service notification I'm seeing until it's sure what
the host status is :/

The log, with my comments, is here:

# 1. The status of db_server;Disk_E is WARNING;HARD and has been for a
while (also see the last line in this log).


Dec 11 23:05:34 icinga_server icinga: SERVICE ALERT:
db_server;Disk_E;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10
seconds
# 2. When we get a CRITICAL for Disk_E because of the timeout, the
status goes to CRITICAL;HARD, which conforms to
http://docs.icinga.org/latest/en/statetypes.html - 5.8.4 and 5.8.5

# 3. If I understand
http://docs.icinga.org/latest/en/checkscheduling.html#hostcheckscheduling
correctly, on every service state change Icinga checks the host to see
whether its status changed as well. So in this case, a host check
should be performed before any further action is taken. What actually
happens is that the result is processed and a service notification is
sent out immediately:
Dec 11 23:05:34 icinga_server icinga: SERVICE NOTIFICATION:
prio1;db_server;Disk_E;CRITICAL;notify_service_email_24x7;CRITICAL -
Socket timeout after 10 seconds

# 4. Only a few seconds afterwards does Icinga show new results for
the host state, but they are still SOFT.
Dec 11 23:05:37 icinga_server icinga: HOST ALERT:
db_server;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.28.132)

# 5. The host is reachable again.
Dec 11 23:08:44 icinga_server icinga: HOST ALERT:
db_server;UP;SOFT;2;PING OK - Packet loss = 0%, RTA = 45.00 ms

# 6. Service status goes back to Warning.
Dec 11 23:20:24 icinga_server icinga: SERVICE ALERT:
db_server;Disk_E;WARNING;HARD;3;e:\ - total: 180.00 Gb - used: 163.08
Gb (91%) - free 16.91 Gb (9%)
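
To make the timing in the log above easier to check, here is a rough
Python sketch that looks for service notifications followed shortly by
a host DOWN alert for the same host. It is only an illustration: the
function names, the 30 second window and the placeholder year are my
own assumptions, the field order is simply what the entries above
show, and it expects each log entry on a single unwrapped line.

```python
import re
from datetime import datetime, timedelta

# Matches only the two entry types needed here; everything else is skipped.
ENTRY = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ icinga: "
    r"(?P<kind>SERVICE NOTIFICATION|HOST ALERT): (?P<rest>.*)$"
)


def parse(lines, year=2000):
    """Return (notifications, host_alerts) from unwrapped syslog lines.

    syslog timestamps carry no year, so `year` is a placeholder assumption.
    """
    notifications, host_alerts = [], []
    for line in lines:
        m = ENTRY.match(line.strip())
        if m is None:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        fields = m["rest"].split(";")
        if m["kind"] == "SERVICE NOTIFICATION":
            # Order seen in the log: contact;host;service;state;command;output
            notifications.append((ts, fields[1], fields[2], fields[3]))
        else:
            # Order seen in the log: host;state;state type;attempt;output
            host_alerts.append((ts, fields[0], fields[1], fields[2]))
    return notifications, host_alerts


def notified_before_host_down(notifications, host_alerts,
                              window=timedelta(seconds=30)):
    """List notifications followed within `window` by a DOWN alert for the host."""
    hits = []
    for n_ts, host, service, state in notifications:
        for h_ts, h_host, h_state, h_type in host_alerts:
            delay = h_ts - n_ts
            if (h_host == host and h_state == "DOWN"
                    and timedelta(0) <= delay <= window):
                hits.append((host, service, state, h_type,
                             delay.total_seconds()))
    return hits


# The entries from this mail, joined back onto single lines.
LOG = [
    "Dec 11 23:05:34 icinga_server icinga: SERVICE ALERT: "
    "db_server;Disk_E;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds",
    "Dec 11 23:05:34 icinga_server icinga: SERVICE NOTIFICATION: "
    "prio1;db_server;Disk_E;CRITICAL;notify_service_email_24x7;"
    "CRITICAL - Socket timeout after 10 seconds",
    "Dec 11 23:05:37 icinga_server icinga: HOST ALERT: "
    "db_server;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.28.132)",
    "Dec 11 23:08:44 icinga_server icinga: HOST ALERT: "
    "db_server;UP;SOFT;2;PING OK - Packet loss = 0%, RTA = 45.00 ms",
]

hits = notified_before_host_down(*parse(LOG))
print(hits)
```

For the entries above this reports the Disk_E notification together
with the DOWN;SOFT host alert that arrived 3 seconds after it.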


So I'm wondering: on the change from WARNING to CRITICAL described above,
a) is sending the service notification the correct behavior, or
b) should Icinga not send this service notification, because the host
is DOWN and the service state therefore cannot be determined? (That
would mean that Icinga would wait until it had confirmed the host as
down, not send any service notifications, but send host notifications
if the host went to DOWN;HARD. In the case above, where the host
comes back before that, no notifications would be sent at all.)
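
To spell out what I mean by b), here is a tiny Python sketch of the
decision rule I had in mind. It is not taken from Icinga and does not
claim to describe how Icinga works; the function name, the return
strings and the attempt limit of 3 are made up for illustration (the
limit stands in for whatever the host's max_check_attempts is).

```python
def option_b(host_results, max_attempts=3):
    """Decide what to send after a service hard state change (behavior b).

    host_results: results of successive host checks, each "UP" or "DOWN".
    max_attempts: assumed number of failed checks before DOWN;HARD.
    """
    failures = 0
    for result in host_results:
        if result == "UP":
            # Host fine from the start: the service problem is genuine.
            # Host recovered after a SOFT failure: stay silent.
            return "service notification" if failures == 0 else "nothing"
        failures += 1
        if failures >= max_attempts:
            # Host confirmed as DOWN;HARD: notify for the host only.
            return "host notification"
    return "pending"  # host state not yet confirmed either way


print(option_b(["DOWN", "UP"]))  # the case from the log above
```

With the sequence from my log (one DOWN, then UP) this rule would send
nothing at all, whereas what I actually observed was a service
notification going out before the first host result came in.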

Cheers,
Gerd
