My solution would be to make a custom host variable for the timeout, and then make your commands use that variable. http://docs.icinga.org/latest/en/customobjectvars.html
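As a rough sketch of what I mean in Icinga 2 syntax (not a tested config): the custom variable name "plugin_timeout", the host name and address, and the "-t" timeout flag are all assumptions for illustration, so check what your plugin actually accepts. The host-level value overrides the default set on the command.

```
object CheckCommand "my_check" {
  command = [ "/path/to/command/command" ]

  // Assumption: the plugin takes its own timeout in seconds via "-t".
  arguments = {
    "-t" = "$plugin_timeout$"
  }

  // Default used by hosts that do not set vars.plugin_timeout themselves.
  vars.plugin_timeout = 45

  // Icinga's own limit on the command still applies, so keep it
  // at or above the largest per-host value.
  timeout = 180
}

// Hypothetical slow host; only the custom variable matters here.
object Host "slow-host" {
  address = "192.0.2.10"
  check_command = "hostalive"
  vars.plugin_timeout = 150
}
```

That way there is a single command definition and a single apply rule, and only the hosts tagged as slow carry a different value.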
I've used this successfully for very similar needs.

--
Jamila Ruya Khan
(888) 481-3655 x103
[email protected]

On Jul 12, 2015, at 9:36 PM, Jay Newman <[email protected]> wrote:

> Hello, all.
>
> Hoping someone has encountered something similar and already has a solution.
> I am trying to find the most straightforward method of allowing a higher
> timeout value for commands issued against a limited set of hosts or devices
> that are known to have slow connections. If I just set the "timeout =" value
> high enough for the slowest connections in the command definitions, that
> really is 2 or 3 times as long as I would need or want for most of the hosts
> I monitor.
>
> Without this, what we live with is a set of hosts that generate "white noise"
> by frequently showing as down or with failed service checks – only to reset
> themselves – or else increase the timeouts and/or number of failed checks to
> the point where I am ignoring legitimate problems too long.
>
> What occurs to me is to have two entirely separate command definitions, named
> for example "my_check" and "my_check_slow" – and define the service
> checks for the high latency hosts to use the "slow" commands. This seems very
> primitive and prone to errors – i.e. admins adding or changing a command and it
> being overlooked for the "slow" hosts. It would look something like:
>
> object CheckCommand "my_check" {
>   command = ["/path/to/command/command"]
>   timeout = 45
> }
>
> object CheckCommand "my_check_slow" {
>   command = ["/path/to/command/command"]
>   timeout = 180
> }
>
> apply Service "my_service" {
>   assign where match (conditions_for_normal_hosts)
>   check_command = "my_check"
> }
>
> apply Service "my_service_slow" {
>   assign where match (conditions_for_slow_hosts)
>   check_command = "my_check_slow"
> }
>
> Not pretty.
>
> What would seem to make more sense would be to be able to define a timeout
> value for the service check itself, so that when I write my apply statement
> to perform a given check on the servers I tagged as "slow", it sets a
> timeout value. However, so far this does not seem to work; timeout does not
> seem to be a valid attribute for a service.
>
> object CheckCommand "my_check" {
>   command = ["/path/to/command/command"]
> }
>
> apply Service "my_service" {
>   assign where match (conditions_for_normal_hosts)
>   check_command = "my_check"
>   timeout = 45
> }
>
> apply Service "my_service_slow" {
>   assign where match (conditions_for_slow_hosts)
>   check_command = "my_check"
>   timeout = 180
> }
>
> Suggestions / experience?
>
> Cheers
> Jay
>
> _______________________________________________
> icinga-users mailing list
> [email protected]
> https://lists.icinga.org/mailman/listinfo/icinga-users
