Hi guys,

I run an Icinga 2 cluster with four nodes (two masters, two checkers), and
the scheduling behavior is quite strange.
See my config below. The test-fail service state jumped straight from
1/5 SOFT to 1/5 HARD, whereas I expected
1/5 SOFT -> 2/5 SOFT -> ... -> 5/5 SOFT -> 5/5 HARD.
The notification for test-fail-10 is also late. The HARD alert was logged at
1456232652, but the notification was not sent until 1456234216, which is the
same time as the second test-fail notification.
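To put a number on "late", the delay can be computed directly from the two epoch timestamps quoted above. This is only a small sketch: the two values are taken from the log, while the variable names are illustrative.

```python
from datetime import datetime, timezone

# Epoch timestamps copied from the compat log for carl2;test-fail-10.
hard_alert = 1456232652     # SERVICE ALERT ... HARD
notification = 1456234216   # SERVICE NOTIFICATION

# Gap between the HARD alert and the notification.
delay = notification - hard_alert
minutes, seconds = divmod(delay, 60)
print(f"delay: {delay} s ({minutes} min {seconds} s)")  # 1564 s = 26 min 4 s

# The same instants in human-readable form (UTC).
for label, ts in (("hard alert", hard_alert), ("notification", notification)):
    print(label, datetime.fromtimestamp(ts, tz=timezone.utc).isoformat())
```

So the test-fail-10 notification arrived roughly 26 minutes after its HARD alert.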

 # service.conf

 apply Service "test-fail" {
   max_check_attempts = 5
   check_interval = 1m
   retry_interval = 30s

   check_command = "always-fail"

   assign where host.name == "carl2"
 }

 apply Service "test-fail-10" {
   max_check_attempts = 3
   check_interval = 10m
   retry_interval = 30s

   check_command = "always-fail"

   assign where host.name == "carl2"
 }
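For comparison with the logs further down, here is a rough sketch of the progression I would expect from these two services. It is a simplified model and not Icinga 2's actual scheduler: it assumes the state stays SOFT until the attempt counter reaches max_check_attempts, and that rechecks after the first failure are spaced by retry_interval. The function names are made up for illustration.

```python
def expected_states(max_check_attempts):
    """Return the (attempt, state_type) sequence for a check that always fails.

    Assumption: SOFT until the attempt counter reaches max_check_attempts,
    then HARD.
    """
    states = []
    for attempt in range(1, max_check_attempts + 1):
        state_type = "HARD" if attempt == max_check_attempts else "SOFT"
        states.append((attempt, state_type))
    return states


def seconds_until_hard(max_check_attempts, retry_interval):
    """Assumption: each recheck after the first failure waits retry_interval."""
    return (max_check_attempts - 1) * retry_interval


# Values from service.conf: retry_interval = 30s for both services.
for name, attempts in (("test-fail", 5), ("test-fail-10", 3)):
    sequence = " -> ".join(
        f"{attempt}/{attempts} {state_type}"
        for attempt, state_type in expected_states(attempts)
    )
    wait = seconds_until_hard(attempts, 30)
    print(f"{name}: {sequence} (about {wait} s after the first failure)")
```

Under these assumptions test-fail should need about two minutes to go HARD and test-fail-10 about one minute.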

 # zones.conf

 object Endpoint "sindar33a.intra.douban.com" {
   host = "sindar33a"
 }
 object Endpoint "sindar33b.intra.douban.com" {
   host = "sindar33b"
 }
 object Endpoint "sindar33c.intra.douban.com" {
   host = "sindar33c"
 }
 object Endpoint "sindar33d.intra.douban.com" {
   host = "sindar33d"
 }

 object Zone "master" {
     endpoints = [
         "sindar33a.intra.douban.com",
         "sindar33b.intra.douban.com",
     ]
 }

 object Zone "checker" {
     endpoints = [
         "sindar33c.intra.douban.com",
         "sindar33d.intra.douban.com",
     ],
     parent = "master"
 }

 admin@sindar33a ~ $ tail -F /var/log/icinga2/compat/icinga.log | grep 'carl2;test'
 [1456232407] CURRENT SERVICE STATE: carl2;test-fail;UNKNOWN;SOFT;1;
 [1456232407] CURRENT SERVICE STATE: carl2;test-fail-10;UNKNOWN;SOFT;1;
 [1456232413] SERVICE ALERT: carl2;test-fail;WARNING;HARD;1;Traceback (most recent call last):
 [1456232652] SERVICE ALERT: carl2;test-fail-10;WARNING;HARD;1;Traceback (most recent call last):
 [1456234216] SERVICE NOTIFICATION: lihan-test;carl2;test-fail;WARNING;mail-service-notification;Traceback (most recent call last):;
 [1456234216] SERVICE NOTIFICATION: lihan-test;carl2;test-fail-10;WARNING;mail-service-notification;Traceback (most recent call last):;

 admin@sindar33b ~ $ tail -F /var/log/icinga2/compat/icinga.log | grep 'carl2;test'
 [1456232410] CURRENT SERVICE STATE: carl2;test-fail;UNKNOWN;SOFT;1;
 [1456232410] CURRENT SERVICE STATE: carl2;test-fail-10;UNKNOWN;SOFT;1;
 [1456232413] SERVICE ALERT: carl2;test-fail;WARNING;HARD;1;Traceback (most recent call last):
 [1456232415] SERVICE NOTIFICATION: admin-test;carl2;test-fail;WARNING;mail-service-notification;Traceback (most recent call last):;
 [1456232652] SERVICE ALERT: carl2;test-fail-10;WARNING;HARD;1;Traceback (most recent call last):

 admin@sindar33c ~ $ tail -F /var/log/icinga2/compat/icinga.log | grep 'carl2;test'
 [1456232409] CURRENT SERVICE STATE: carl2;test-fail;UNKNOWN;SOFT;1;
 [1456232409] CURRENT SERVICE STATE: carl2;test-fail-10;UNKNOWN;SOFT;1;
 [1456232413] SERVICE ALERT: carl2;test-fail;WARNING;HARD;1;Traceback (most recent call last):
 [1456232652] SERVICE ALERT: carl2;test-fail-10;WARNING;HARD;1;Traceback (most recent call last):

 admin@sindar33d ~ $ tail -F /var/log/icinga2/compat/icinga.log | grep 'carl2;test'
 [1456232408] CURRENT SERVICE STATE: carl2;test-fail;UNKNOWN;SOFT;1;
 [1456232408] CURRENT SERVICE STATE: carl2;test-fail-10;UNKNOWN;SOFT;1;
 [1456232413] SERVICE ALERT: carl2;test-fail;WARNING;HARD;1;Traceback (most recent call last):
 [1456232652] SERVICE ALERT: carl2;test-fail-10;WARNING;HARD;1;Traceback (most recent call last):
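The log lines above can also be compared mechanically. Below is a minimal sketch that parses the state and alert lines from sindar33a (with the mail-wrapped lines rejoined) and reports how long each service took to reach HARD. The field layout is inferred from these lines only, and the regular expression and function names are illustrative, not part of Icinga 2.

```python
import re

# Field layout inferred from the log excerpt:
# [timestamp] KIND: host;service;state;state_type;attempt;output
LINE_RE = re.compile(
    r"\[(?P<ts>\d+)\] (?P<kind>CURRENT SERVICE STATE|SERVICE ALERT): "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)"
)


def parse_line(line):
    """Return a dict of fields, or None if the line has another format."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = int(record["ts"])
    record["attempt"] = int(record["attempt"])
    return record


# State and alert lines copied from the sindar33a excerpt.
lines = [
    "[1456232407] CURRENT SERVICE STATE: carl2;test-fail;UNKNOWN;SOFT;1;",
    "[1456232407] CURRENT SERVICE STATE: carl2;test-fail-10;UNKNOWN;SOFT;1;",
    "[1456232413] SERVICE ALERT: carl2;test-fail;WARNING;HARD;1;Traceback (most recent call last):",
    "[1456232652] SERVICE ALERT: carl2;test-fail-10;WARNING;HARD;1;Traceback (most recent call last):",
]

first_seen = {}     # service -> timestamp of its first logged line
time_to_hard = {}   # service -> seconds from first line to first HARD line
for record in filter(None, map(parse_line, lines)):
    service = record["service"]
    first_seen.setdefault(service, record["ts"])
    if record["state_type"] == "HARD" and service not in time_to_hard:
        time_to_hard[service] = record["ts"] - first_seen[service]

for service, seconds in time_to_hard.items():
    print(f"{service}: HARD after {seconds} s")
```

On this excerpt test-fail is logged as HARD 6 s after its first line and test-fail-10 after 245 s, both with the attempt field still at 1.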

Thanks in advance for your help!

Regards,
-- 
Harry Lee  | SA Dept. | Douban Inc.