My solution would be to make a custom host variable for the timeout, and then 
make your commands use that variable.
http://docs.icinga.org/latest/en/customobjectvars.html

I've used this successfully for very similar needs.
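
As a rough sketch of that idea in Icinga 2 syntax (not a tested config):
it assumes the plugin accepts its timeout in seconds via a -t option, which
many monitoring plugins do, but check yours. The variable name
my_check_timeout, the host name "slow-host-01" and the match() condition
are placeholders made up for illustration.

```
object CheckCommand "my_check" {
  command = [ "/path/to/command/command" ]

  // Assumption: the plugin takes its own timeout in seconds via -t.
  arguments = {
    "-t" = "$my_check_timeout$"
  }

  // Default, used when neither the service nor the host sets the variable.
  vars.my_check_timeout = 45

  // Limit enforced by Icinga itself; keep it above the largest
  // per-host value so the plugin's own timeout fires first.
  timeout = 200
}

object Host "slow-host-01" {
  // address, check_command and other host attributes omitted here
  vars.my_check_timeout = 180
}

apply Service "my_service" {
  check_command = "my_check"
  // Placeholder condition covering both normal and slow hosts.
  assign where match(conditions_for_all_hosts)
}
```

The macro is looked up on the service first, then the host, then the
command, so there is only one command and one apply rule to maintain, and
slow hosts just need the variable set.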


--

Jamila Ruya Khan
(888) 481-3655 x103
[email protected]




On Jul 12, 2015, at 9:36 PM, Jay Newman <[email protected]> wrote:

> Hello, all.
>  
> Hoping someone has encountered something similar and already has a solution. 
> I am trying to find the most straightforward way to allow a higher
> timeout value for commands issued against a limited set of hosts or devices
> that are known to have slow connections. If I just set the "timeout =" value
> in the command definitions high enough for the slowest connections, it is
> 2 or 3 times as long as I would need or want for most of the hosts
> I monitor.
> Without this, we either live with a set of hosts that generate "white noise"
> by frequently showing as down or with failed service checks, only to reset
> themselves, or else increase the timeouts and/or number of failed checks to
> the point where I am ignoring legitimate problems for too long.
>  
> What occurs to me is to have two entirely separate command definitions, named
> for example "my_check" and "my_check_slow", and define the service
> checks for the high-latency hosts to use the "slow" commands. This seems very
> primitive and prone to errors, i.e. admins adding or changing a command and it
> being overlooked for the "slow" hosts. It would look something like:
> object CheckCommand "my_check" {
>   command = [ "/path/to/command/command" ]
>   timeout = 45
> }
>
> object CheckCommand "my_check_slow" {
>   command = [ "/path/to/command/command" ]
>   timeout = 180
> }
>
> apply Service "my_service" {
>   assign where match(conditions_for_normal_hosts)
>   check_command = "my_check"
> }
>
> apply Service "my_service_slow" {
>   assign where match(conditions_for_slow_hosts)
>   check_command = "my_check_slow"
> }
>  
> Not pretty.
> What would seem to make more sense would be to define a timeout
> value for the service check itself, so that when I write my apply statement
> to perform a given check on the servers I tagged as "slow", it sets a
> timeout value. However, so far this does not seem to work; timeout does not
> seem to be a valid attribute for a service.
>  
> object CheckCommand "my_check" {
>   command = [ "/path/to/command/command" ]
> }
>
> apply Service "my_service" {
>   assign where match(conditions_for_normal_hosts)
>   check_command = "my_check"
>   timeout = 45
> }
>
> apply Service "my_service_slow" {
>   assign where match(conditions_for_slow_hosts)
>   check_command = "my_check"
>   timeout = 180
> }
>  
> Suggestions / experience?
>  
> Cheers
> Jay
>  
> _______________________________________________
> icinga-users mailing list
> [email protected]
> https://lists.icinga.org/mailman/listinfo/icinga-users
