Hi everybody, I'm looking at an issue with notifications and I'm unsure whether this is working as designed or not.
I'm getting service notifications when a service that has been in WARNING;HARD for a long time changes to CRITICAL;HARD after a single check, because that check timed out. Three seconds later, the host check returns DOWN;SOFT, but only once, so the host never reaches DOWN;HARD.

I thought that if the host is down, no service notifications would be sent. http://docs.icinga.org/latest/en/checkscheduling.html#hostcheckscheduling actually states that "when Icinga is check [sic] the status of a host, it holds off on doing anything else" - so I would expect it to also hold back the service notification I'm seeing until it is sure what the host status is :/

The log with comments is here:

# 1. The status of db_server;Disk_E is WARNING;HARD and has been so for a while (also see the last line in this log).
Dec 11 23:05:34 icinga_server icinga: SERVICE ALERT: db_server;Disk_E;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds

# 2. When we get a CRITICAL for Disk_E because of the timeout, the status goes to CRITICAL;HARD, which conforms to http://docs.icinga.org/latest/en/statetypes.html - 5.8.4 and 5.8.5.

# 3. If I understand http://docs.icinga.org/latest/en/checkscheduling.html#hostcheckscheduling correctly, on every service state change Icinga checks the host to see whether its status has changed as well. So in this case, a host check should be performed before any further action is taken. What actually happens is that the result is processed and a service notification is sent out immediately.
Dec 11 23:05:34 icinga_server icinga: SERVICE NOTIFICATION: prio1;db_server;Disk_E;CRITICAL;notify_service_email_24x7;CRITICAL - Socket timeout after 10 seconds

# 4. Only a few seconds afterwards does Icinga show new results for the host state, but they are still SOFT.
Dec 11 23:05:37 icinga_server icinga: HOST ALERT: db_server;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.28.132)

# 5. The host is reachable again.
Dec 11 23:08:44 icinga_server icinga: HOST ALERT: db_server;UP;SOFT;2;PING OK - Packet loss = 0%, RTA = 45.00 ms

# 6. The service status goes back to WARNING.
Dec 11 23:20:24 icinga_server icinga: SERVICE ALERT: db_server;Disk_E;WARNING;HARD;3;e:\ - total: 180.00 Gb - used: 163.08 Gb (91%) - free 16.91 Gb (9%)

So I'm wondering: is sending a notification on the described change from WARNING to CRITICAL
a) the correct behavior, or
b) should Icinga not send this service notification, because the host is DOWN and the service state therefore cannot be determined?

(Option b would mean that Icinga waits until it has confirmed the host as down, sends no service notifications in the meantime, and sends host notifications only if the host goes to DOWN;HARD. In the case above, where the host comes back before that happens, no notifications would be sent at all.)

Cheers,
Gerd
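
P.S. To make the timing concrete, here is a minimal Python sketch that parses the log lines quoted above and measures how long after the service notification the first host result arrives. It only illustrates the ordering in my log and says nothing about how Icinga decides internally. The names LINE_RE and parse are my own, and the year 2000 is a placeholder, since syslog timestamps carry no year.

```python
import re
from datetime import datetime

# Log lines copied from the post; only the fields before the plugin output matter here.
LOG_LINES = [
    "Dec 11 23:05:34 icinga_server icinga: SERVICE ALERT: db_server;Disk_E;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds",
    "Dec 11 23:05:34 icinga_server icinga: SERVICE NOTIFICATION: prio1;db_server;Disk_E;CRITICAL;notify_service_email_24x7;CRITICAL - Socket timeout after 10 seconds",
    "Dec 11 23:05:37 icinga_server icinga: HOST ALERT: db_server;DOWN;SOFT;1;CRITICAL - Host Unreachable (172.16.28.132)",
    "Dec 11 23:08:44 icinga_server icinga: HOST ALERT: db_server;UP;SOFT;2;PING OK - Packet loss = 0%, RTA = 45.00 ms",
]

# Syslog timestamp, syslog host, the "icinga:" tag, the event kind, then ';'-separated fields.
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ icinga: ([A-Z ]+): (.*)$")


def parse(line):
    """Return (timestamp, event kind, list of fields) for one log line."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    ts, kind, rest = m.groups()
    # Placeholder year (assumption): syslog omits it, and only differences are used.
    when = datetime.strptime(f"2000 {ts}", "%Y %b %d %H:%M:%S")
    return when, kind, rest.split(";")


events = [parse(line) for line in LOG_LINES]

# First service notification, and the first host result logged at or after it.
notification = next(e for e in events if e[1] == "SERVICE NOTIFICATION")
host_alert = next(
    e for e in events if e[1] == "HOST ALERT" and e[0] >= notification[0]
)

gap_seconds = (host_alert[0] - notification[0]).total_seconds()
host_state, host_state_type = host_alert[2][1], host_alert[2][2]

print(
    f"Notification sent {gap_seconds:.0f}s before the host result "
    f"{host_state};{host_state_type} was logged"
)
```

So the notification goes out before any host result is logged, and the host result that follows is only a SOFT state.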