Hi,

I have been experiencing a weird problem with Icinga. At seemingly random 
times, Icinga does not send me a recovery alert when a service recovers after 
being in a problem state.

This is what I see in the logs (filtered to SERVICE ALERT and SERVICE 
NOTIFICATION entries):

[1379043382] SERVICE ALERT: host;service;UNKNOWN;SOFT;1;(Timed Out)
[1379043482] SERVICE ALERT: host;service;UNKNOWN;SOFT;2;(Timed Out)
[1379043582] SERVICE ALERT: host;service;UNKNOWN;HARD;3;(Timed Out)
[1379043582] SERVICE NOTIFICATION: admin;host;service;UNKNOWN;notify-service-by-email;(Timed Out)
[1379046062] SERVICE ALERT: host;service;OK;HARD;1;OK service ok

(host & service masked)
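
To make the timing easier to read, here is a minimal Python sketch that filters such lines and prints each entry with a UTC timestamp and the seconds elapsed since the previous entry. The function name `parse_entries` and the regular expression are illustrative, and the sketch assumes the `[epoch] TYPE: field;field;...` layout shown above.

```python
import re
from datetime import datetime, timezone

# Assumed layout, taken from the excerpt above: "[epoch] TYPE: field;field;..."
ENTRY_RE = re.compile(r"^\[(\d+)\] (SERVICE ALERT|SERVICE NOTIFICATION): (.*)$")


def parse_entries(lines):
    """Keep only SERVICE ALERT / SERVICE NOTIFICATION lines, as (ts, kind, fields)."""
    entries = []
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match:
            fields = match.group(3).split(";")
            entries.append((int(match.group(1)), match.group(2), fields))
    return entries


# The first excerpt from this post, with the wrapped line joined.
SAMPLE = [
    "[1379043382] SERVICE ALERT: host;service;UNKNOWN;SOFT;1;(Timed Out)",
    "[1379043482] SERVICE ALERT: host;service;UNKNOWN;SOFT;2;(Timed Out)",
    "[1379043582] SERVICE ALERT: host;service;UNKNOWN;HARD;3;(Timed Out)",
    "[1379043582] SERVICE NOTIFICATION: admin;host;service;UNKNOWN;"
    "notify-service-by-email;(Timed Out)",
    "[1379046062] SERVICE ALERT: host;service;OK;HARD;1;OK service ok",
]

entries = parse_entries(SAMPLE)
previous_ts = None
for ts, kind, fields in entries:
    when = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    delta = 0 if previous_ts is None else ts - previous_ts
    print(f"{when}  +{delta:>5}s  {kind}: {';'.join(fields)}")
    previous_ts = ts
```

In this excerpt the OK entry is logged 2480 seconds after the UNKNOWN;HARD;3 entry, with nothing logged for this service in between.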

I don't understand how it went directly from UNKNOWN;HARD;3 to OK;HARD;1. I 
have max_check_attempts set to 3. Shouldn't it be OK;SOFT;1 -> OK;SOFT;2 -> 
OK;HARD;3?

Also, there is only one notification, after UNKNOWN;HARD;3, and none after it 
recovered at OK;HARD;1. I have w,u,c,r,f in notification_options of both the 
contact and the service check, and 24x7 timeperiods. My SMTP server is also 
working fine, as I am getting tons of alerts every day. The problem is that 
Icinga did not call the notify-service-by-email command at all, so SMTP itself 
is not in question.

This is also happening for various other checks. Here is another example:

[1379043572] SERVICE ALERT: host;service2;UNKNOWN;SOFT;1;(Timed Out)
[1379043662] SERVICE ALERT: host;service2;UNKNOWN;SOFT;2;(Timed Out)
[1379043762] SERVICE ALERT: host;service2;UNKNOWN;HARD;3;(Timed Out)
[1379043762] SERVICE NOTIFICATION: admin;host;service2;UNKNOWN;notify-service-by-email;(Service Check Timed Out)
[1379046062] SERVICE ALERT: host;service2;CRITICAL;SOFT;1;Return code of 255 is out of bounds
[1379046162] SERVICE ALERT: host;service2;OK;SOFT;2;OK: service ok

Again, there is no OK;HARD;3 state after OK;SOFT;2, and no notification after 
OK;SOFT;2 either.

I have been struggling with missing recovery alerts for some time now. After 
this, I'm not even sure that I'm getting alerts for every problem either. If 
Icinga is missing recovery alerts, it could just as well miss a problem alert.

Any ideas? Help much appreciated.

Thanks,
Viranch
