Re: [Nagios-users] different escalation intervals possible?

2011-07-18 Thread Paul M. Dubuc
Michael Barrett wrote:
> Hi - I am trying to get alerts for my services to work like this:
>
> - all alerts (warning, critical, unknown, and recovery) go to the
>   'email-ops' contact every 2 hours
> - on some services (the ones deemed critical) in addition I want them to
>   send an email to the 'primary-pager' contact every 15 minutes
>
> I thought I had the configuration set up appropriately for this, but now I'm
> not sure it's possible without a better understanding of how escalations
> work (and it may not even be possible then) since I read this about
> overlapping escalations:
>
> "Since it is possible to have overlapping escalation definitions for a
> particular hostgroup or service, and the fact that a host can be a member
> of multiple hostgroups, Nagios has to make a decision on what to do as far
> as the notification interval is concerned when escalation definitions
> overlap. In any case where there are multiple valid escalation definitions
> for a particular notification, Nagios will choose the smallest notification
> interval."
>
> Anyway, is there any way to make that work?  The way it's working now is that
> it seems to email the email-ops list every 15 minutes on critical services,
> and for email we'd like to get fewer alerts.
>
> Thanks in advance!
>

I don't think this can be done with escalations.  If Assaf has a way to do it,
I'd be very interested.  Like the documentation says, any notification that
matches multiple escalations can only have one notification interval, and
Nagios chooses the smallest.  The way we get around the problem is to put a
wrapper around the e-mail notification command so that it only sends the first
notification after each state change.  This is what it looks like:

define command{
        command_name    notify-host-by-email
        command_line    \
          if [ $HOSTNOTIFICATIONNUMBER$ -le 1 -o $HOSTSTATEID$ -ne $LASTHOSTSTATEID$ ]$USER9$ then \
            /usr/bin/printf "%b\n\n--" \
              "* Nagios *\n\n\
Notification Type: $NOTIFICATIONTYPE$  Number: $HOSTNOTIFICATIONNUMBER$\n\n\
host=$HOSTNAME$\n\n\
Host: $HOSTNAME$\n\
Address: $HOSTADDRESS$\n\
State: $HOSTSTATE$\n\
Last State: $LASTHOSTSTATE$\n\
Info: $HOSTOUTPUT$\n\n\
$LONGHOSTOUTPUT$\n\n\
Date/Time: $LONGDATETIME$\n\n\
Comment: $NOTIFICATIONCOMMENT$" \
            | /usr/bin/mail -r nagios -s \
              "** Nagios $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$ \
          $USER9$\
          fi
        }

The $USER9$ macro is defined as a semicolon (';').  A literal semicolon in
the command definition would be interpreted by Nagios as the start of a
comment, so the macro stands in for it.  The command for service e-mail
notifications looks similar.  So, for e-mail notifications, it doesn't matter
what the interval is: only one e-mail will actually be sent per state change.
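
For reference, a service-level counterpart might look roughly like the sketch
below.  It is not the command actually used in the setup described above: it
simply mirrors the host version with the standard Nagios service macros
swapped in.  The command name notify-service-by-email and the exact resource
file line for $USER9$ are assumptions made for illustration.

```
# Sketch only.  Assumes the resource file (e.g. resource.cfg) contains
# a line such as:   $USER9$=;
# so that $USER9$ expands to the shell's ';' without Nagios treating it
# as the start of a comment.  The command name is hypothetical.
define command{
        command_name    notify-service-by-email
        command_line    \
          if [ $SERVICENOTIFICATIONNUMBER$ -le 1 -o $SERVICESTATEID$ -ne $LASTSERVICESTATEID$ ]$USER9$ then \
            /usr/bin/printf "%b\n\n--" \
              "* Nagios *\n\n\
Notification Type: $NOTIFICATIONTYPE$  Number: $SERVICENOTIFICATIONNUMBER$\n\n\
Service: $SERVICEDESC$\n\
Host: $HOSTNAME$\n\
Address: $HOSTADDRESS$\n\
State: $SERVICESTATE$\n\
Last State: $LASTSERVICESTATE$\n\
Info: $SERVICEOUTPUT$\n\n\
$LONGSERVICEOUTPUT$\n\n\
Date/Time: $LONGDATETIME$\n\n\
Comment: $NOTIFICATIONCOMMENT$" \
            | /usr/bin/mail -r nagios -s \
              "** Nagios $NOTIFICATIONTYPE$ Service Alert: $HOSTNAME$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$ \
          $USER9$\
          fi
        }
```

The test passes for the first notification of a problem or whenever the
current state ID differs from the previous one, so repeat notifications for an
unchanged state fall through the "if" without sending mail.  A pager command
would be left unwrapped so that it still fires at every interval.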

--
AppSumo Presents a FREE Video for the SourceForge Community by Eric 
Ries, the creator of the Lean Startup Methodology on "Lean Startup 
Secrets Revealed." This video shows you how to validate your ideas, 
optimize your ideas and identify your business strategy.
http://p.sf.net/sfu/appsumosfdev2dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] different escalation intervals possible?

2011-07-18 Thread Michael Barrett
Thanks Assaf.  It's a little complicated, but here's my config for a 'critical 
service':

### First my escalation definitions:

define serviceescalation {
name                    page-primary
first_notification      1
last_notification       0
notification_interval   15
contact_groups          primarypager
servicegroup_name       critical-services
}

define serviceescalation {
name                    page-secondary
first_notification      3
last_notification       0
notification_interval   15
contact_groups          secondarypager
servicegroup_name       critical-services
}

define serviceescalation {
name                    email-ops
first_notification      1
last_notification       0
notification_interval   120
contact_groups          ops-group
servicegroup_name       critical-services
}

### Now my contact groups:

define contactgroup {
contactgroup_name   ops-group
alias   Operations People
}

define contactgroup {
contactgroup_name   primarypager
alias   Primary Pager
}

define contactgroup {
contactgroup_name   secondarypager
alias   Secondary Pager
}

### Now a pair of example contacts (and templates) that tie into those groups:
define contact {
name                            generic-contact
host_notification_period        24x7
service_notification_period     24x7
host_notifications_enabled      0
host_notification_options       d,u,r
service_notification_options    u,c,w,r
host_notification_commands      host-notify-by-email
service_notification_commands   service-notify-by-email
register                        0
can_submit_commands             1
}

# pager template
define contact {
name                            pager-template
use                             generic-contact
host_notification_options       d,u
service_notification_options    u,c
host_notification_commands      host-notify-by-epager
service_notification_commands   service-notify-by-epager
register                        0
}

define contact {
name                            ops-contact
contactgroups                   ops-group
use                             generic-contact
register                        0
}

define contact {
contact_name                    mb
alias                           Michael Barrett
use                             ops-contact
email                           lok...@gmail.com
}

define contact {
contact_name                    mb-phone
alias                           Mike Phone
contactgroups                   primarypager
use                             pager-template
email                           @mms.att.net
}

### Now the servicegroup for critical-services:
define servicegroup {
servicegroup_name   critical-services
alias   Services that should page
}

### And finally a service that is in critical-services (and its template):
define service {
name                            generic-service
is_volatile                     0
check_period                    24x7
max_check_attempts              3
normal_check_interval           5
retry_check_interval            1
active_checks_enabled           1
passive_checks_enabled          1
parallelize_check               0
obsess_over_service             1
check_freshness                 0
notifications_enabled           1
notification_interval           120
notification_period             24x7
notification_options            u,c,w,r
event_handler_enabled           1
flap_detection_enabled          1
process_perf_data               1
retain_status_information       1
retain_nonstatus_information    1
contact_groups                  ops-group
register                        0
}


define service {
use generic-service
service_description puppet last run
check_command   check_puppet
hostgroup_name  all_hosts,!virtual-hosts
servicegroups   critical-services
}


##

Let me know if you have any questions, and thanks again!


Re: [Nagios-users] different escalation intervals possible?

2011-07-18 Thread Assaf Flatto
Michael Barrett wrote:
> Hi - I am trying to get alerts for my services to work like this:
>
> - all alerts (warning, critical, unknown, and recovery) go to the 'email-ops' 
> contact every 2 hours 
> - on some services (the ones deemed critical) in addition I want them to send 
> an email to the 'primary-pager' contact every 15 minutes
>
> I thought I had the configuration set up appropriately for this, but now I'm 
> not sure it's possible without a better understanding of how escalations work 
> (and it may not even be possible then) since I read this about overlapping 
> escalations:
>
> "Since it is possible to have overlapping escalation definitions for a 
> particular hostgroup or service, and the fact that a host can be a member of 
> multiple hostgroups, Nagios has to make a decision on what to do as far as 
> the notification interval is concerned when escalation definitions overlap. 
> In any case where there are multiple valid escalation definitions for a 
> particular notification, Nagios will choose the smallest notification 
> interval."
>
> Anyway, is there any way to make that work?  The way it's working now is that
> it seems to email the email-ops list every 15 minutes on critical services,
> and for email we'd like to get fewer alerts.
>
> Thanks in advance!
>
> --
> Michael Barrett
> lok...@gmail.com
>   
The short answer is: yes.

If you share your configuration, we might also be able to give tips and
pointers on what needs to be modified.



[Nagios-users] different escalation intervals possible?

2011-07-15 Thread Michael Barrett
Hi - I am trying to get alerts for my services to work like this:

- all alerts (warning, critical, unknown, and recovery) go to the 'email-ops' 
contact every 2 hours 
- on some services (the ones deemed critical) in addition I want them to send 
an email to the 'primary-pager' contact every 15 minutes

I thought I had the configuration set up appropriately for this, but now I'm not 
sure it's possible without a better understanding of how escalations work (and 
it may not even be possible then) since I read this about overlapping 
escalations:

"Since it is possible to have overlapping escalation definitions for a 
particular hostgroup or service, and the fact that a host can be a member of 
multiple hostgroups, Nagios has to make a decision on what to do as far as the 
notification interval is concerned when escalation definitions overlap. In any 
case where there are multiple valid escalation definitions for a particular 
notification, Nagios will choose the smallest notification interval."

Anyway, is there any way to make that work?  The way it's working now is that it
seems to email the email-ops list every 15 minutes on critical services, and
for email we'd like to get fewer alerts.

Thanks in advance!

--
Michael Barrett
lok...@gmail.com




