Re: [Nagios-users] Is it possible to receive a single global notification for all checks?

2013-09-02 Thread Paul M Dubuc
Alex Flex wrote:
 Hello. Thank you for this; it definitely looks like the solution for
 me, although mk-livestatus is quite unfamiliar to me.

 Alex

Maybe Nagios BPI will also work for you.
http://exchange.nagios.org/directory/Addons/Components/Nagios-Business-Process-Intelligence-%28BPI%29/details


___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Running a plugin at specific times

2012-08-26 Thread Paul M Dubuc

 On 24 August 2012 22:10, Tech Support supp...@voipbusiness.us wrote:

 Hello;

  I am fairly new to Nagios, and this is my first project using
 it. What I would like to do is run a plugin at specific times of the
 day. This particular plugin is pretty intrusive, so I would like to
 run it only at 7:00am and 7:00pm daily. Is there an easy way of
 doing this? I'm thinking that I can run the script from cron, then
 passively send the data to Nagios via its command pipe, but I'm not
 sure if that's the best way to go.

 Stu Watts wrote:
  Nagios does time periods itself, so no need for cron:
 
  http://nagios.sourceforge.net/docs/nagioscore/3/en/timeperiods.html
 
  The Nagios documentation is pretty good - have a check through.  Chances
  are it can do what you want. ;-)
 

I don't think setting time periods will ensure that a check is run at 
specific times.  The best a time period can do is restrict when the check is 
allowed to run; it does not schedule the check for a particular moment.

Using cron may be the way to go.
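
A rough sketch of what the cron approach could look like on the Nagios side. The host name, service description, check command and script path are placeholders, not anything from the original poster's setup, and the generic-service template is assumed to come from the sample configuration. The service is passive-only, so Nagios never schedules the plugin itself and simply accepts whatever the cron job submits.

```cfg
# Passive-only service: Nagios never runs the plugin on its own schedule.
define service{
        use                     generic-service     ; assumed sample template
        host_name               pbx1                ; placeholder
        service_description     Intrusive Check     ; placeholder
        check_command           check_intrusive     ; placeholder; must be defined, but is never scheduled
        active_checks_enabled   0
        passive_checks_enabled  1
        }

# The cron side (crontab syntax, shown here only as a comment):
#   0 7,19 * * *  /usr/local/bin/run_intrusive_check.sh
# The hypothetical script would run the plugin and append one line to the
# external command file (the command_file setting in nagios.cfg):
#   [<unix_time>] PROCESS_SERVICE_CHECK_RESULT;pbx1;Intrusive Check;<return_code>;<plugin_output>
```

Since results arrive only twice a day, any freshness threshold enabled for this service would need to be longer than twelve hours.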



Re: [Nagios-users] Service and host notifications: best practise

2012-01-27 Thread Paul M. Dubuc
Keith Edmunds wrote:
 Escalations are your friend.

 Thanks for the quick and helpful response. Unless I've misunderstood, we
 would need to configure a service escalation for each service and each of
 the host groups - is that right?

 What we really need is notification rather than escalation, although I
 realise we can use escalations in a similar way to notifications. I'd like
 to be able to say, "If any service fails on any host in hostgroup A,
 notify these people."

 Thanks,
 Keith

The only way I can think of to do this is to use a template for services that 
belong to a particular hostgroup:

define service{
        name                    hostgroup_A_service
        register                0

        contacts                managerA
        ...
}

Have all the services on hosts in hostgroup A use that template.  Note that 
using the 'contacts' directive won't override any contacts that you specify 
with the contact_groups directive in other templates.  I use 'contacts' to 
specify contacts in addition to the default ones that I specify with 
'contact_groups'.  So you can specify the IT_team as a contact_group for all 
services and hosts and use 'contacts' to specify the manager for particular 
services.
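
To make the layering concrete, here is a minimal sketch of how the templates might stack. All object names (base_service, IT_team, managerA, hostgroup_A) are illustrative, generic-service is assumed to be the sample template that supplies the required scheduling directives, and the check_ping arguments are copied from the sample configuration rather than from anyone's real setup.

```cfg
# Base template: default notification group for every service.
define service{
        name                    base_service
        use                     generic-service     ; assumed sample template
        register                0
        contact_groups          IT_team
        }

# Template for services on hosts in hostgroup A: adds the manager
# through 'contacts', alongside the contact_groups from base_service.
define service{
        name                    hostgroup_A_service
        use                     base_service
        register                0
        contacts                managerA
        }

# A concrete service applied to every host in the hostgroup.
define service{
        use                     hostgroup_A_service
        hostgroup_name          hostgroup_A
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
```

With this arrangement only the middle template has to change when the manager for hostgroup A changes.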

Hope this helps,
Paul Dubuc






Re: [Nagios-users] [Nagios-devel] timeperiod definition for election day?

2011-12-06 Thread Paul M. Dubuc
Jochen Bern wrote:
 Am I missing something, or would this

 On 12/06/2011 12:44 AM, Andreas Ericsson wrote:
 On 12/05/2011 09:31 PM, Paul M. Dubuc wrote:
 For example, election day in the U.S. is on the 1st Tuesday after
 the 1st Monday of November.

 be equivalent to the Tuesday between 02-Nov and 08-Nov, which, in turn,

 I couldn't even imagine what the syntax would
 look like to support it

 should (!) be equivalent to

 define timeperiod {
   timeperiod_name Election Day
   alias           Shouldnt you be out there voting for someone
   november 2 - 8  00:00-24:00
   exclude         AllButTuesdays
   }
 define timeperiod {
   timeperiod_name AllButTuesdays
   alias           Everyone can hate MONDAYS ...
   sunday          00:00-24:00
   monday          00:00-24:00
   wednesday       00:00-24:00
   thursday        00:00-24:00
   friday          00:00-24:00
   saturday        00:00-24:00
   }

 ?

 Kind regards,
   J. Bern

Amazing.  Thanks!  But until the problem with the 'exclude' directive is fixed 
(see the known issue under 3.2.0 - 08/12/2009 at 
http://www.nagios.org/projects/nagioscore/history/core-3x), we might want to 
do it this way:

  define timeperiod {
        timeperiod_name Election Day
        alias           Shouldnt you be out there voting for someone
        november 2 - 8  00:00-24:00
        use             AllButTuesdays
        }
  define timeperiod {
        name            AllButTuesdays  ; so 'use' will work above
        timeperiod_name AllButTuesdays
        alias           Everyone can hate MONDAYS ...
        sunday          00:00-00:00
        monday          00:00-00:00
        wednesday       00:00-00:00
        thursday        00:00-00:00
        friday          00:00-00:00
        saturday        00:00-00:00
        }

Do you think this will also work?




[Nagios-users] timeperiod definition for election day?

2011-12-05 Thread Paul M. Dubuc
I didn't see this in the documentation, but I wonder if there is a way to 
specify a timeperiod for the first weekday after another weekday.  For 
example, election day in the U.S. is on the 1st Tuesday after the 1st Monday 
of November.  We have a similar need to define a timeperiod for the 1st Sunday 
after the 1st Saturday of every month.

Must we do this by entering all the specific dates for these in the coming 
year(s), or is there a simpler, no-maintenance way of doing it?
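
For illustration, here is a sketch of the two obvious options and why neither is satisfying. The timeperiod names are made up. The first definition uses the built-in "Nth weekday of a month" syntax, which picks the wrong day whenever November 1 is itself a Tuesday; the second lists explicit calendar dates, which is correct but has to be extended by hand.

```cfg
# Looks simple, but matches the 1st Tuesday of November, which is a week
# too early in years where November 1 is a Tuesday (2016, for example).
define timeperiod{
        timeperiod_name first-tuesday-november
        alias           First Tuesday of November
        tuesday 1 november      00:00-24:00
        }

# Explicit calendar dates: correct, but needs a new line for each election.
define timeperiod{
        timeperiod_name election-day-dates
        alias           U.S. Election Day (explicit dates)
        2012-11-06      00:00-24:00
        2014-11-04      00:00-24:00
        2016-11-08      00:00-24:00
        }
```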

Thanks,
Paul Dubuc



Re: [Nagios-users] disabling e-mail notifications for nagiosadmin account

2011-12-05 Thread Paul M. Dubuc
Kaplan, Andrew H. wrote:
 Hi there --

 We are running Nagios 3.3.1, and have two contacts set up for the e-mail
 notifications. One of the contacts
 is the nagiosadmin user. This is the user account that was first setup during
 the initial installation of the application.

 When the account was set up it was configured with the e-mail address of one
 of our network administrators.

 A second account was set up that was based on the administrator's login
 account along with his e-mail address.
 When notifications are sent out, he gets two notifications for each event due
 to both contacts having the same
 e-mail address.

 We want to prevent the e-mail notifications being sent to the nagiosadmin
 account with the administrator getting
 only one notification per event as the intended result. One thought was to set
 up a dummy account on the Nagios
 server as a solution, and another idea was to set up a flag in the
 contacts.cfg file, but we are not sure what
 would be the correct syntax for the latter.

 What would be the best solution here?

 Thanks.

nagiosadmin doesn't need to be a notification contact.  You can remove it from 
any contact lists in your contacts.cfg.  If you want that user to still be 
able to see everything and run all commands from the Nagios display, you can 
put it in the authorized_for_* lists in your cgi.cfg if it's not already there.
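
A sketch of what that could look like. The contact name "jsmith" and the group name "admins" are stand-ins for the administrator's own login-based contact and whatever group your hosts and services actually notify; the authorized_for_* directives are the standard ones from cgi.cfg.

```cfg
# contacts.cfg: nagiosadmin is no longer a member, so it receives nothing.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 jsmith
        }

# cgi.cfg: nagiosadmin keeps full access to the web interface.
authorized_for_system_information=nagiosadmin,jsmith
authorized_for_configuration_information=nagiosadmin,jsmith
authorized_for_system_commands=nagiosadmin,jsmith
authorized_for_all_services=nagiosadmin,jsmith
authorized_for_all_hosts=nagiosadmin,jsmith
authorized_for_all_service_commands=nagiosadmin,jsmith
authorized_for_all_host_commands=nagiosadmin,jsmith
```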

Paul Dubuc



[Nagios-users] us-holidays timeperiod error

2011-11-02 Thread Paul M. Dubuc
I apologize if this has already been caught and fixed, but I just noticed that 
the timeperiods.cfg file that comes with Nagios 3.2.3 has an error in the 
us-holidays timeperiod:

 thursday -1 november    00:00-00:00     ; Thanksgiving (last Thursday in November)

should be

 thursday 4 november     00:00-00:00     ; Thanksgiving (4th Thursday in November)

November 2012 has 5 Thursdays, so the last Thursday (the 29th) falls a week 
after Thanksgiving (the 22nd).




Re: [Nagios-users] escalations question

2011-10-26 Thread Paul M. Dubuc
Michael Barrett wrote:
 I was wondering - if a contact is only set to receive critical alerts, and
 via escalations the service is only set to contact that contact with a
 first_notification set as 3, what could cause that contact to get notified
 at the first notification?

 If the service has been in a warning state for a while (more than 3
 notifications, but none of them going to the critical only receiving
 contact since they aren't configured to get warnings) do those
 notifications count towards the first_notification count?

Yes. Each time Nagios generates a notification for any state, the notification
count is incremented.  After the recovery (OK state) notification is sent, the
count is reset to 0.


 I thought we had a pretty cool setup going where our secondary pager would
 only be notified if the service went critical and only after its third
 critical notification - but this morning both the primary pager and secondary
 pager were notified at the same time for a disk space issue that had been
 in a warning state for a few hours and then went to critical.

All the warning notifications incremented the count, so the count was greater 
than 3 when the service went critical.  There is no way to specify the 3rd 
CRITICAL notification with escalations.  Notification counts do not take the 
state into account.


 Is there any way to get that sort of setup working btw?

You might re-think why you want to do this.  If there has been a problem at 
the warning level for 2 or more notification intervals without it being 
acknowledged (which stops notifications) or fixed, maybe your secondary 
contact should be notified anyway when the critical threshold is exceeded.

If you really want it to work the way you describe then the best solution I 
can think of is to have 2 separate services with different contacts.  One that 
issues only warnings and the other only critical problems.  But then you've 
doubled the number of checks you are doing for the same problem.
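
A sketch of the two-service idea, under several assumptions: the host name, service descriptions, check command and contact names are all placeholders, and generic-service is the sample template. The intent is that the second service only ever generates CRITICAL and recovery notifications, so its notification count is not inflated by warnings; that behaviour is worth verifying on a test instance before relying on it.

```cfg
# Service 1: warnings only, primary pager.
define service{
        use                     generic-service         ; assumed sample template
        host_name               fileserver1             ; placeholder
        service_description     Disk Space (warning)
        check_command           check_nrpe!check_disk   ; placeholder
        notification_options    w,r
        contacts                primary-pager
        }

# Service 2: same check, critical only.
define service{
        use                     generic-service
        host_name               fileserver1
        service_description     Disk Space (critical)
        check_command           check_nrpe!check_disk
        notification_options    c,r
        contacts                primary-pager
        }

# From the 3rd notification of service 2 onwards, add the secondary pager.
define serviceescalation{
        host_name               fileserver1
        service_description     Disk Space (critical)
        first_notification      3
        last_notification       0       ; 0 = keep escalating until recovery
        notification_interval   15      ; placeholder interval
        contacts                primary-pager,secondary-pager
        }
```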

Paul Dubuc



Re: [Nagios-users] howto ignore down hosts

2011-09-28 Thread Paul M. Dubuc
 On Sep 28, 2011, at 10:37 AM, Albrecht Dreß wrote:

 Hi,

 a dumb question - is it possible to ignore hosts which are down, i.e. no
 messages are sent if the machine is down, or goes up again, and no
 service checks are performed while the machine is down?  When the box
 comes up again, the service checks should be run soon if possible.

 This would be nice for boxes which are down regularly (but not according
 to a pre-defined schedule), but have some services which shall be
 monitored, without sending too many mails to the person in charge for
 it...

 I'm running Nagios 3.2.3, self-compiled on Ubuntu 8.04, if that matters.

 Thanks in advance, Albrecht.
 
 Michael Barrett wrote:
  In your host definition set the notification_options so that it doesn't
  notify you when hosts go down/recover:
 
  notification_options d,u,r
 
  (remove the d and r)
 

Service checks would still be run when the host is down though.  If you don't 
want them to run then I think you need to define a servicedependency for those 
services, making them dependent on a master service that monitors the host 
state and setting the execution_failure_criteria to the failure state of the 
master service.
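
A minimal sketch of such a dependency. The host and service names are placeholders, and "Host Alive" is a hypothetical master service (for example a ping check) that stands in for the host state.

```cfg
define servicedependency{
        host_name                       laptop1         ; placeholder
        service_description             Host Alive      ; hypothetical master service
        dependent_host_name             laptop1
        dependent_service_description   HTTP,SSH        ; placeholders
        execution_failure_criteria      c,u     ; skip dependent checks while the master is CRITICAL or UNKNOWN
        notification_failure_criteria   c,u     ; and suppress their notifications too
        }
```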



[Nagios-users] Service Availability Report Question

2011-09-23 Thread Paul M. Dubuc
I have a question about the Service Availability Report in Nagios 3.2.3 to 
which I can't find an answer in any documentation: Under the Service State 
Breakdowns there is a colored horizontal bar with dotted lines and colored 
segments.  The colored segments represent the duration of the corresponding 
states over the specified time period.  What do the dotted lines 
represent?  Most of the time they appear at the beginning of a colored, non-OK 
state segment, but I see some that appear alone.

Thanks,
Paul Dubuc



Re: [Nagios-users] Hostgroup Members

2011-09-23 Thread Paul M. Dubuc
Brandon Phelps wrote:
 Thanks Dan.  I was aware of the hostgroups directive in the host {} block,
 but for some reason my brain never connected the dots.

 In that case, does anyone know when support was added for host { hostgroups
 = ... }, or simply whether or not it is available in version 1.4?  I have
 googled a bit but can't seem to find the online manual for 1.4.x.

 Thanks again, Brandon

I don't know when it was added, but could you try it and see if it works? If 
Nagios 1.4 supports the -v option, it would tell you if the hostgroups 
directive isn't recognized.



Re: [Nagios-users] servicedependency not working properly

2011-09-02 Thread Paul M Dubuc
Steve Glasser wrote:
 Hi list,

 We often have nagios checks time out when servers are under heavy load.  One 
 check tests nrpe, if that fails or times out I want notifications for other 
 services on the same host to be suppressed.  To do this I am using 
 servicedependency.

 Looking at nagios logs I can see that all other checks, both nrpe and remote, 
 are running before test_nrpe.  That means, at least for the first cycle of 
 failed checks, that notifications for all services will be sent.

 Is it possible to control the order in which nagios checks run?  Or am I just 
 doing something wrong?  Please see sample config below:

 define servicedependency {
  host_name                       vm-foo2
  service_description             test_nrpe
  dependent_host_name             vm-foo2
  dependent_service_description   nrpe_check_load,nrpe_check_ntp_time,nrpe_check_root,nrpe_check_swap,nrpe_check_ro_mounts
  notification_failure_criteria   c,u
  execution_failure_criteria      n
 }

 Thanks,

I think the problem is that you have 'n' set for the 
execution_failure_criteria.  That means the dependent services will 
always be checked, whatever the state of test_nrpe.  Try setting this to 
'c,u' instead (same as notification_failure_criteria).
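
Applied to the definition quoted in the question, the suggested change would look like this (only the last directive differs):

```cfg
define servicedependency {
        host_name                       vm-foo2
        service_description             test_nrpe
        dependent_host_name             vm-foo2
        dependent_service_description   nrpe_check_load,nrpe_check_ntp_time,nrpe_check_root,nrpe_check_swap,nrpe_check_ro_mounts
        notification_failure_criteria   c,u
        execution_failure_criteria      c,u     ; was 'n': stop checking dependents while test_nrpe is CRITICAL or UNKNOWN
}
```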

 From the documentation: 
http://nagios.sourceforge.net/docs/nagioscore/3/en/objectdefinitions.html#servicedependency

 execution_failure_criteria:   This directive is used to specify the
 criteria that determine when the dependent service should not be
 actively checked. If the master service is in one of the failure
 states we specify, the dependent service will not be actively
 checked. Valid options are a combination of one or more of the
 following (multiple options are separated with commas): o = fail on
 an OK state, w = fail on a WARNING state, u = fail on an UNKNOWN
 state, c = fail on a CRITICAL state, and p = fail on a pending state
 (e.g. the service has not yet been checked).  If you specify n (none)
 as an option, the execution dependency will never fail and checks of
 the dependent service will always be actively checked (if other
 conditions allow for it to be). Example: If you specify o,c,u in this
 field, the dependent service will not be actively checked if the
 master service is in either an OK, a CRITICAL, or an UNKNOWN state.




[Nagios-users] $NOTIFICATIONRECIPIENTS$ macro contents inaccurate

2011-08-22 Thread Paul M. Dubuc
Nagios 3.2.3 documentation says the notification macro:

$NOTIFICATIONRECIPIENTS$ is "A comma-separated list of the short names of all 
contacts that are being notified about the host or service."

Instead this macro contains all contacts for the host or service regardless of 
whether the particular notification is actually being sent to them.  I have 
one contact for all services that only gets CRITICAL (c) notifications 
according to its service_notification_options setting, but the 
$NOTIFICATIONRECIPIENTS$ macro includes this contact along with others that 
get WARNING notifications when the WARNING notification is sent.  This would 
imply that the critical-only contact also got the notification, but this isn't 
true.

Paul Dubuc



[Nagios-users] Bug?: Custom notifications ignore contact notification options

2011-07-27 Thread Paul M. Dubuc
I just discovered what looks like a possible bug in the "Send custom 
host/service notification" command in Nagios 3.2.3.  When I use this command 
to send an OK status notification, it goes to all contacts, even ones that are 
only supposed to receive CRITICAL notifications.  The command description says 
that

Custom notifications normally follow the regular notification logic in 
Nagios. Selecting the Forced option will force the notification to be sent 
out, regardless of the time restrictions, whether or not notifications are 
enabled, etc. Selecting the Broadcast option causes the notification to be 
sent out to all normal (non-escalated) and escalated contacts. These options 
allow you to override the normal notification logic if you need to get an 
important message out.

I didn't use either the Forced or Broadcast option and the OK status 
notification goes to contacts that have only 'c' (critical) or 'd' (down) for 
their service/host_notification_options and escalation_options.  Is this a 
bug?  Seems like this should only happen if the Broadcast option is checked.



Re: [Nagios-users] different escalation intervals possible?

2011-07-18 Thread Paul M. Dubuc
Michael Barrett wrote:
 Hi - I am trying to get alerts for my services to work like this:

 - all alerts (warning, critical, unknown, and recovery) go to the
   'email-ops' contact every 2 hours
 - on some services (the ones deemed critical) in addition I want them
   to send an email to the 'primary-pager' contact every 15 minutes

 I thought I had the configuration setup appropriately for this, but now I'm
 not sure it's possible without a better understanding of how escalations
 work (and it may not even be possible then) since I read this about
 overlapping escalations:

 Since it is possible to have overlapping escalation definitions for a
 particular hostgroup or service, and the fact that a host can be a member
 of multiple hostgroups, Nagios has to make a decision on what to do as far
 as the notification interval is concerned when escalation definitions
 overlap. In any case where there are multiple valid escalation definitions
 for a particular notification, Nagios will choose the smallest notification
 interval.

 Anyway, is there any way to make that work?  The way it's working now is that
 it seems to email the email-ops list every 15 minutes on critical services,
 and for email we'd like to get fewer alerts.

 Thanks in advance!


I don't think this can be done with escalations.  If Assaf has a way to do it, 
I'd be very interested.  Like the documentation says, any notification that 
matches multiple escalations can only have one notification interval and it 
chooses the smallest.  The way we get around the problem is to put a wrapper 
around the email notification command so that it only sends the first 
notification of a state change.  This is what it looks like:

define command{
        command_name    notify-host-by-email
        command_line    \
 if [ $HOSTNOTIFICATIONNUMBER$ -le 1 -o $HOSTSTATEID$ -ne $LASTHOSTSTATEID$ ]$USER9$ then \
 /usr/bin/printf "%b\n\n-- \
 * Nagios *\n\n\
Notification Type: $NOTIFICATIONTYPE$  Number: $HOSTNOTIFICATIONNUMBER$\n\n\
host=$HOSTNAME$\n\n\
Host: $HOSTNAME$\n\
Address: $HOSTADDRESS$\n\
State: $HOSTSTATE$\n\
Last State: $LASTHOSTSTATE$\n\
Info: $HOSTOUTPUT$\n\n\
$LONGHOSTOUTPUT$\n\n\
Date/Time: $LONGDATETIME$\n\n\
Comment: $NOTIFICATIONCOMMENT$" \
 | /usr/bin/mail -r nagios -s \
 "** Nagios $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$ \
 $USER9$\
 fi
        }

The $USER9$ macro is defined as a semicolon ';' to keep it from being 
interpreted as the start of a comment. The command for service e-mail 
notifications looks similar.  So, for e-mail notifications, it doesn't matter 
what the interval is.  Only one e-mail will actually be sent per state change.
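
For completeness, a sketch of the two pieces not shown above: the $USER9$ definition, which goes in the resource file like any other $USERn$ macro, and a service counterpart of the command. The service version is a shortened illustration rather than the exact command in use; it swaps in the service macros $SERVICENOTIFICATIONNUMBER$, $SERVICESTATEID$ and $LASTSERVICESTATEID$ and follows the quoting of the stock sample notification commands.

```cfg
# resource.cfg: a literal semicolon, kept out of the object config files
$USER9$=;

# commands.cfg: service counterpart (message body shortened)
define command{
        command_name    notify-service-by-email
        command_line    \
 if [ $SERVICENOTIFICATIONNUMBER$ -le 1 -o $SERVICESTATEID$ -ne $LASTSERVICESTATEID$ ]$USER9$ then \
 /usr/bin/printf "%b" "Notification Type: $NOTIFICATIONTYPE$\nService: $SERVICEDESC$\nHost: $HOSTNAME$\nState: $SERVICESTATE$\nInfo: $SERVICEOUTPUT$\n" \
 | /usr/bin/mail -r nagios -s \
 "** Nagios $NOTIFICATIONTYPE$ Service Alert: $HOSTNAME$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$ \
 $USER9$ fi
        }
```

The test passes for the first notification of a problem and for any notification where the state differs from the previous one, so repeat notifications for an unchanged state run the shell test and send nothing.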



Re: [Nagios-users] check_interval or normal_check_interval ?

2011-07-11 Thread Paul M. Dubuc
Malcolm Cowe wrote:
 Hello All,

 I have a quick question arising from a discrepancy between the Nagios 3
 documentation and the service templates supplied with the distribution.
 When defining services, should one use check_interval or
 normal_check_interval? I'm currently using Nagios 3.1.0 but will
 likely be upgrading to the latest release in the near future.

They are equivalent.  According to Barth, check_interval was introduced in 3.0 
as an alternative to normal_check_interval.  They mean the same thing.



Re: [Nagios-users] Wildcards in service escalations query

2011-07-06 Thread Paul M. Dubuc
Mohit Chawla wrote:
 Just tried this: added all hosts to the host_name field, except the
 ones which don't have any services associated, and it works. So yeah,
 using the * wildcard with !hostx doesn't work. But clearly, this is
 not ideal, since I have had to add around 350 hosts in the host_name
 directive.

I agree.  It would be nice if the serviceescalation definition would 
automatically exclude hosts which don't have services specified by its 
service_description.  Instead of adding all those host names there, you could 
use a host group as I described here:

http://sourceforge.net/mailarchive/message.php?msg_id=27615125

It's a little more work initially, but it's easier to maintain, I think.  You 
won't have to remember to change the escalation every time you add a host. 
It's easier to include a host in the hostgroup you use for the escalation when 
you define the host.
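
A sketch of the hostgroup approach, not a copy of the linked example. The hostgroup, host and contact group names are hypothetical, generic-host is assumed to be the sample template, and the escalation numbers are placeholders.

```cfg
# Hostgroup containing only hosts that have services assigned.
define hostgroup{
        hostgroup_name          escalated-hosts
        alias                   Hosts covered by the service escalation
        }

# Membership is declared where the host is defined.
define host{
        use                     generic-host    ; assumed sample template
        host_name               web1
        address                 192.0.2.10
        hostgroups              escalated-hosts
        }

# The escalation names the hostgroup instead of 350 individual hosts.
define serviceescalation{
        hostgroup_name          escalated-hosts
        service_description     *
        first_notification      2
        last_notification       0
        notification_interval   60
        contact_groups          oncall
        }
```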



Re: [Nagios-users] Wildcards in service escalations query

2011-07-05 Thread Paul M. Dubuc
Mohit Chawla wrote:
 Hi,

 If we have:

 define serviceescalation {
  host_name               *
  service_description     *
  ...
 }

 , then, if there is no service associated with a host, this definition
 will be regarded invalid. But what about if a particular service is
 not associated with any host ? Will it fail in that case as well ? I
 was able to find hosts which don't have any services defined, and I
 used:
 define serviceescalation {
  host_name               *, !foo.com, !bar.com
  service_description     *
  
 }
 , where foo and bar are the hosts with no services defined. But I
 still get 'could not expand services ' error on this escalation
 definition.

 Any clues ?

As long as any hosts that match the host_name directive have no services 
defined, you will get this error.  The escalation apparently wants to have 
host/service pairs.  It's a service escalation and all services must be 
assigned to a host.  It doesn't automatically discard hosts that have no 
services.  To get around this you can use a hostgroup that contains only hosts 
with services assigned.  I've given an example here: 
http://sourceforge.net/mailarchive/message.php?msg_id=27615125



Re: [Nagios-users] Wildcards in service escalations query

2011-07-05 Thread Paul M. Dubuc
Mohit Chawla wrote:
 Hi,

 On Tue, Jul 5, 2011 at 11:55 PM, Paul M. Dubuc w...@paul.dubuc.org wrote:
 As long as any hosts that match the host_name directive have no services
 defined, you will get this error.  The escalation apparently wants to have
 host/service pairs.  It's a service escalation and all services must be
 assigned to a host.  It doesn't automatically discard hosts that have no
 services.

 But as you can see in the above config I posted, I am explicitly
 excluding those hosts which do not have any services associated with
 them ( foo.com and bar.com ). Hence, the config should be valid.
 Unless, of course, host_name *, !host1, !host2 is not the right way to
 include all hosts except host1 and host2, or there is some other bad logic.

It could be that the exclusion (!) doesn't work when combined with the * 
wildcard in that way.  It's equivalent to host1, host2, ... hostN, !host1, 
!host2.  Try putting the wildcard at the end of the list and see if that 
works.  Also, make sure that the hosts you exclude are really the only ones 
that have no services.  Nagios will put warnings in the log file about hosts 
with no services assigned after it is restarted.  You can look there for any 
you might have missed.
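
A minimal sketch of the reordering suggested above, reusing the foo.com and bar.com names from the question. Whether Nagios honours the exclusions when the wildcard comes last is an assumption to verify with the pre-flight check (nagios -v on your main config file) before restarting. The notification numbers, interval and contact group are hypothetical placeholders.

```cfg
define serviceescalation {
    host_name              !foo.com, !bar.com, *   ; exclusions first, wildcard last
    service_description    *
    first_notification     1                       ; hypothetical values from here down
    last_notification      0
    notification_interval  60
    contact_groups         admins                  ; hypothetical contact group
}
```

If the pre-flight check still reports that services could not be expanded, some other host without services is probably still being matched.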

Paul Dubuc



Re: [Nagios-users] Getting re-notified while in a HARD state

2011-06-29 Thread Paul M. Dubuc
Frank Bulk wrote:
 I have a few existing and self-developed plugins that output details of the
 HARD state:

   CRITICAL: critical 1, warning 1
   Detail 1
   Detail 2

 What I'd like to do is to be able to be re-notified if, while in the HARD
 state, the number and/or details change.  For example, if the above would go
 to:

   CRITICAL: critical 2, warning 1
   Detail 1
   Detail 2
   Detail 3

 Anyone have an approach that works?  The documentation doesn't indicate it's
 possible, but I'm sure others have encountered this before and perhaps
 they've worked through a solution.

 Kind regards,

 Frank

I don't think there's a simple way to do this without having your notification 
command store the value of the $SERVICEOUTPUT$ macro for the host + service 
for comparison on the next try. Then you would have to set is_volatile on the 
service and have the notification command suppress the notification if the 
$SERVICEOUTPUT$ doesn't change.

Another thing you can do is tell Nagios to log the hard state status when only 
the $SERVICEOUTPUT$ changes by setting the stalking_options in the service. 
Then, if you have something that is watching the log file, you can trigger 
notifications with that.  If only this state stalking feature had an 
option to send notifications in addition to logging, you would be set.
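
Both approaches could be sketched roughly as follows. The host, service, contact and command names are hypothetical, other required directives are omitted, and the wrapper script is only assumed to exist: it would have to save the last $SERVICEOUTPUT$ per host and service and stay silent when the output has not changed.

```cfg
# Approach 1: volatile service plus a notification command that filters repeats
define command {
    command_name  notify-service-if-output-changed
    # hypothetical wrapper script, not part of Nagios
    command_line  /usr/local/bin/notify_if_changed "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATE$" "$SERVICEOUTPUT$" "$CONTACTEMAIL$"
}

define contact {
    contact_name                   example-admin
    service_notification_commands  notify-service-if-output-changed
}

define service {
    host_name            HOST1
    service_description  Example Check A
    check_command        check_example
    contacts             example-admin
    is_volatile          1           ; treat every non-OK result as a new hard problem
}

# Approach 2: only log output changes while the state stays the same
define service {
    host_name            HOST1
    service_description  Example Check B
    check_command        check_example
    stalking_options     w,c         ; stalk WARNING and CRITICAL states
}
```

With the second approach, something outside Nagios still has to watch the log file and send the notification.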

Hope this helps,
Paul Dubuc



Re: [Nagios-users] Expand service group error in 43 line test config (why?)

2011-06-28 Thread Paul M. Dubuc
First, your service has no service_description specified.  This is required.

Second, your serviceescalation must include the host_name that the service is 
assigned to.  Add the line:

host_name   admin.qa

and it will work.  You can also use a hostgroup_name instead of a host_name, 
but every host you specify must have a service with a service_description that 
matches that specified in the escalation.

See the documentation for details:
http://nagios.sourceforge.net/docs/nagioscore/3/en/objectdefinitions.html
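
Applied to the config quoted below, the two changes might look like this. The service_description value is a hypothetical placeholder, and the other directives a real service needs are left out just as in the original test config.

```cfg
define service {
    host_name            admin.qa
    service_description  Foo Check      ; hypothetical name; this directive was missing
    servicegroups        group-1
    check_command        check_foo
}

define serviceescalation {
    host_name              admin.qa     ; added: the host the service is assigned to
    servicegroup_name      group-1
    first_notification     1
    last_notification      6
    notification_interval  5
    contacts               primary-oncall
}
```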


Eric B. wrote:
 This has me stumped. I whittled my ugly config down to 35 lines, and was
 still able to re-create the error. Any ideas what is wrong? I'm running
 Nagios Core v. 3.2.3. Much thanks in advance!

 -Eric

 Error is:

 Error: Could not expand servicegroups specified in service escalation
 (config file '/home/opsmon/etc/nagios/objects/qbo/foo.cfg', starting on
 line 13)
 Error processing object config files!

 Here's the config:

 define servicegroup {
  servicegroup_name   group-1
  alias               All Services
  register            0
 }

 define contact {
  contact_name        primary-oncall
  alias               Primary Oncall
  email               f...@bar.com
 }

 define serviceescalation {
  servicegroup_name       group-1
  first_notification      1
  last_notification       6
  notification_interval   5
  contacts                primary-oncall
 }

 define service {
  servicegroups       group-1
  host_name           admin.qa
  check_command       check_foo
 }

 define host {
  host_name           admin.qa
  address             127.0.0.1
 }

 define command {
  command_name        check_foo
  command_line        /bin/true
 }







Re: [Nagios-users] Expand service group error in 43 line test config (why?)

2011-06-28 Thread Paul M. Dubuc
Eric B. wrote:
 My real problem I think is that I 'whittled' it down in the wrong way
 (thanks for the help, everyone). Below is what I was hoping to do, but
 realize that b/c I HAVE to define a host w/ the escalation, I have to
 retool how my monster config is done (which will really suck). Here's
 what I was hoping to accomplish:

 1) Create a generic service template that all service checks inherit
 that adds them to the 'all-services' group.
 2) Create escalation rules that apply to the 'all-services' group.

 This worked (basically a more complicated example of  the config I gave)
 until I added a 'all-services-foo' group (same method mentioned in #1
 and #2) with different escalations.

  From a design perspective, I know Nagios does a great job w/
 templating, and object inheritance, but it really sucks that I have to
 specify a host; that just increased the amount of objects easily by an
 order or so of magnitude.


I don't see why.  All services have to be assigned to hosts anyway.  You can 
specify a comma-separated list of hosts in your escalation or use hostgroups.
I think you only need 2 additional objects to do what you want:  A hostgroup 
that consists of all hosts with services assigned and a host template to 
assign hosts to that group.  There's an example that might help here:

http://sourceforge.net/mailarchive/message.php?msg_id=27615125




Re: [Nagios-users] [Nagios-devel] Nagios retries checks too soon.

2011-06-10 Thread Paul M Dubuc
Jochen Bern wrote:
 On 06/09/2011 08:14 PM, Paul M. Dubuc wrote:
 Andreas Ericsson wrote:
 I'm not sure. I'm also not sure which behaviour is intended. Arguably, either
 is correct and Nagios is doing one of two right things.
 I'm not sure.  If a test times out and Nagios tries again 10 seconds later
 instead of the 60 seconds specified, that could cause problems; load related
 problems when you have many of these tests running and timing out and problems
 for the system under test not having sufficient time to recover before the
 next check is done.

 True, but *if* someone has the latter kind of problem, I'd expect him to
 keep it in mind while writing the configuration, too. IIRC, the actual
 code adds check_interval/retry_interval to the variable that holds the
 (previous) scheduled check time - i.e., the time when the previous check
 supposedly was *started* (assuming negligible check latency).
 Configuring a retry_interval of one minute for a service whose sustained
 request rate may be *less* than one per minute sounds dubitable to me.

 (And I'm a firm nonbeliever in Unix-ish load figures, as opposed to
 actual CPU usage etc., but that's a different rant.)

 Kind regards,
   J. Bern

Thanks for this explanation.  It helps quite a bit.  The checks we run 
normally take 5 - 15 seconds to complete, but we allow a much longer 
value for timeout.  I was under the impression that the retry interval 
was only counted from the time the previous check completes and the 
status (which is needed to determine if a retry is necessary) is known.  
Why is the retry time determined before it's known that one is needed?  
It looks like checks that have longer timeouts need to have longer 
retry intervals to compensate for the worst case.  That's not intuitive 
to me, but I can live with it.
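
To make the timing concrete, here is a sketch based on the log excerpt from my original post. It assumes the retry is scheduled relative to the start of the previous check, as described above, that the alert timestamp is the moment the timeout was hit, and that interval_length is the default 60 seconds. The check_command name and the 4-minute retry value are hypothetical.

```cfg
# Timeline for the observed failure (130 s plugin timeout, retry_interval 1):
#   09:14:04  check starts (09:16:14 minus the 130 s timeout)
#   09:15:04  retry falls due (start + 1 minute), but no result is known yet
#   09:16:14  timeout reported, SOFT attempt 1; the retry is already overdue
#   09:16:24  retry runs almost immediately (10 s later, not 60 s)
#
# To leave about a minute between a timed-out check and its retry, the retry
# interval has to cover the timeout as well: 130 s + 60 s = 190 s, which
# rounds up to 4 minutes.
define service {
    host_name            APS-P55
    service_description  LoginPage
    check_command        logintest     ; hypothetical command name
    check_interval       7.5
    retry_interval       4             ; was 1
    max_check_attempts   3
}
```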

Paul Dubuc



[Nagios-users] Nagios retries checks too soon.

2011-06-09 Thread Paul M. Dubuc
Running Nagios 3.2.3, here is an example from the log that shows Nagios 
retrying a failed check after only 10 seconds.  The normal check interval is 
7.5 minutes, retry interval is 1 minute, max. check attempts is 3.

Note that this test has a timeout of 130 seconds, so it's been running for 
over 2 minutes when it times out.  Does Nagios do retries sooner when the 
timeout for a check is longer than the retry interval?  Is the retry interval 
measured from the time the previous check starts, or from the time it ends?

[06-09-2011 09:16:14] SERVICE ALERT: APS-P55;LoginPage;CRITICAL;SOFT;1;logintest CRITICAL - Timeout (130 sec.) reached

[06-09-2011 09:16:24] SERVICE ALERT: APS-P55;LoginPage;OK;SOFT;2;logintest OK





Re: [Nagios-users] [Nagios-devel] Nagios retries checks too soon.

2011-06-09 Thread Paul M. Dubuc
Andreas Ericsson wrote:
 On 06/09/2011 03:43 PM, Paul M. Dubuc wrote:
 Running Nagios 3.2.3, here is an example from the log that shows Nagios
 retrying a failed check after only 10 seconds.  The normal check interval is
 7.5 minutes, retry interval is 1 minute, max. check attempts is 3.

 Note that this test has a timeout of 130 seconds, so it's been running for
 over 2 minutes when it times out.  Does Nagios do retries sooner when the
 timeout for a check is longer than the retry interval?  Is the retry interval
 measured from the time the previous check starts, or from the time it ends?


 I'm not sure. I'm also not sure which behaviour is intended. Arguably, either
 is correct and Nagios is doing one of two right things.


I'm not sure.  If a test times out and Nagios tries again 10 seconds later 
instead of the 60 seconds specified, that could cause two kinds of problems: 
load-related problems when many of these tests are running and timing out, 
and problems for the system under test, which may not have sufficient time to 
recover before the next check is done.

Paul Dubuc



Re: [Nagios-users] service escalation on all services of all hosts

2011-06-07 Thread Paul M. Dubuc

Michael Barrett wrote:
 Hi, I'm having a problem with an example given in the Tips & Tricks 
 documentation page. Currently I'm running: Nagios Core 3.2.0

 Anyway, the tip I'm trying is from here 
 http://nagios.sourceforge.net/docs/nagioscore/3/en/objecttricks.html#serviceescalation

 The particular tip reads:

 All Services On Same Host:
 If you want to create service escalations for all services assigned to a 
 particular host, you can use a wildcard in the service_description directive. 
 The definition below would create a service escalation for all services on 
 host HOST1. All the instances of the service escalation would be identical 
 (i.e. have the same contact groups, notification interval, etc.).

 If you feel like being particularly adventurous, you can specify a wildcard 
 in both the host_name and service_description directives. Doing so would 
 create a service escalation for all services that you've defined in your 
 configuration files.

 ##

 So I tried the following:

 define serviceescalation {
  name                    email-all
  first_notification      1
  last_notification       0
  notification_interval   120
  contact_groups          ops-group

  register                0
 }

 define serviceescalation {
  use                     email-all
  host_name               *
  service_description     *
 }

 And when I go to restart nagios I get the following:

 Error: Could not expand hostgroups and/or hosts specified in service (config 
 file '/etc/nagios3/conf.d/services.cfg', starting on line 34)
 Error processing object config files!


 Anyone know why this is a problem?  Am I missing something in the 
 documentation, or is it just incorrect?


You probably have some hosts that have no services assigned.  Using the 
wildcard for both host_name and service_description will not work in that 
case, unfortunately.  All hosts specified MUST have a service that matches the 
given service_description or you will get this error.

Paul Dubuc



Re: [Nagios-users] service escalation on all services of all hosts

2011-06-07 Thread Paul M. Dubuc
Michael Barrett wrote:
 Ahh, ok, that would explain it.  That's a bummer.   Thanks.

One way around this is to create a host group where you put hosts that have 
services assigned:

define hostgroup{
 hostgroup_name  ServiceHosts
 alias           Hosts with Services Assigned
 register        0   ; hide this hostgroup unless you want it displayed.
}

Then use a template that assigns hosts to this group:

define host{
 name            service-host
 register        0   ; this is a template

 hostgroups      +ServiceHosts   ; add host to this hostgroup
}

Make sure every host definition that has services assigned has the
use service-host
directive in it (or uses a template that does).  Alternatively you can just 
assign the services to the ServiceHosts group in the service definitions 
instead of using this host template.

Then you can define your escalation this way:

define serviceescalation {
 use                  email-all
 hostgroup_name       ServiceHosts
 service_description  *
}

Hope this helps.

Paul Dubuc




Re: [Nagios-users] Scheduled downtime and host checks

2011-06-01 Thread Paul M. Dubuc
Jeffrey Watts wrote:
 On Wed, Jun 1, 2011 at 1:27 AM, Kumar, Ashish xml.de...@gmail.com wrote:


 No, scheduled downtime only affects notifications, and the stats you
 see in the availability cgi.  Service and host checks run as normal
 during scheduled downtime.


 Thanks Jim for the explanation but I do not see any rational reason
 to execute host and service checks while the monitored host is
 scheduled for fixed downtime.


 There are plenty of rational reasons.  Just because you disagree with
 the default behavior doesn't mean it's irrational.  Many, many, many
 times I put systems into scheduled, fixed downtime and still want checks
 to be executed.  For example, if I know the netadmins are going to be
 reconfiguring networking at one of our datacenters I will schedule fixed
 downtime for the period of their maintenance for the
 servers/switches/routers affected.

 However, I do want to see what's up and down during that time so I can
 tell when they start and finish their work, and what they're affecting.
   That's a perfectly rational reason to do checks during maintenance.

 This is useful because it allows you to
 check the stats of those hosts and services are ok before the
 scheduled downtime period ends.


 But if the host/services are offline after the scheduled fixed
 downtime period ends it will send the notifications anyway (or would
 it not?)

 I wish there was a way to disable active checks while a host has
 scheduled downtime set.


 If the hosts and services are down after the downtime ends yes it will
 send notifications, as clearly either:

 1) The maintenance window wasn't long enough.
 2) Someone broke something, or something died for another reason during
 maintenance

 Sounds like proper behavior.

 As far as your question goes, you can disable active checks manually, or
 you can write a script that sets downtime and disables active checks at
 the same time.  You could then run it (manually or via 'at' or something
 else) to re-enable active checks.  Or hack the Nagios source code and
 add that option yourself.  I believe in the last week or so someone
 posted a sample script for setting downtime via a script, so you might
 search the archives.

 Jeffrey.

You give some very good reasons for Nagios' current behavior during a downtime.  
But I agree with the original request that there be an option to disable 
checks during a downtime, because there are equally rational reasons to do so.

There are some cases where we really should not be running service checks 
during downtimes because of the extra load they put on our system when they 
fail.  Many of our checks fail in this case by timing out, and they use 
relatively scarce (shared) and resource-intensive processes (web browser 
sessions run under SeleniumRC).  Timeouts tend to be long for these checks, so 
there is more contention for these processes when all the checks using them 
start failing, and they're run more often until they all go into a 'hard' 
failure state, etc.  Maybe we can live with this, but it would be easier on 
the system to just inhibit checks we know are going to fail during certain 
regularly scheduled downtimes.  There may be plenty of other examples where 
running lots of failing tests during a downtime ends up using significant 
system resources.

We implement our regular downtimes by defining the uptime with a 
timeperiod and using that for the check_period and notification_period of our 
services.  The problem with that is that all the services get scheduled to run 
at the exact second that our downtime ends.  So we have to define a 
concurrency limit and rely on Nagios nudging checks out when the limit is 
reached in order to spread the schedule out again.

It would be very nice to be able to define regular downtimes with timeperiods 
and have the option of inhibiting checks as well as notifications during those 
downtimes without bunching up the scheduling queue when the downtime ends.
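
For reference, the timeperiod approach described above looks roughly like this. The timeperiod name, the nightly 02:00-04:00 downtime window, and the host, service and command names are hypothetical, and other required service directives are omitted. max_concurrent_checks is the main configuration file option I mean by a concurrency limit; the value shown is only an example.

```cfg
# Uptime window: checks and notifications only happen inside these hours
define timeperiod {
    timeperiod_name  uptime-hours
    alias            All hours except the nightly downtime
    sunday           00:00-02:00,04:00-24:00
    monday           00:00-02:00,04:00-24:00
    tuesday          00:00-02:00,04:00-24:00
    wednesday        00:00-02:00,04:00-24:00
    thursday         00:00-02:00,04:00-24:00
    friday           00:00-02:00,04:00-24:00
    saturday         00:00-02:00,04:00-24:00
}

define service {
    host_name            HOST1
    service_description  Login Test
    check_command        check_login
    check_period         uptime-hours
    notification_period  uptime-hours
}

# In nagios.cfg (main config file, not an object file), cap simultaneous
# checks so the burst at 04:00 gets spread out:
# max_concurrent_checks=20
```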



Re: [Nagios-users] [Nagios-devel] Q: Service Escalation Recovery Notifications.

2011-05-26 Thread Paul M. Dubuc
I should have mentioned that whether this works depends on who the default, 
non-escalated, contacts are for the host or service.  In your case, since you 
have last_notification set to 3, those contacts in your escalation will not 
get a recovery notification that is numbered 5 or greater unless they also 
happen to be the default contact for the host or service, which will get the 
problem notification number 4 and all non-escalated notifications.  If you 
escalate a notification to a contact that is not assigned as a regular contact 
for the host or service, those contacts don't get the recovery notification 
(unless they also got the previous problem notification) even if you set up a 
separate escalation for the recovery notification that specifies all previous 
contacts.
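
One configuration that sidesteps this, as far as I can tell from the documentation, is an open-ended escalation: with last_notification set to 0 the escalation stays in effect for every later notification, so its contacts are still on the list when the recovery goes out. The trade-off is that they also receive every repeat problem notification. The interval is a hypothetical value, and the oncall contact name is borrowed from Patrik's config.

```cfg
define serviceescalation {
    host_name              *
    service_description    *
    first_notification     2
    last_notification      0     ; 0 = keep using this escalation until the problem ends
    notification_interval  30    ; hypothetical; minutes between repeat notifications
    contacts               oncall
}
```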

Patrik Båt wrote:
 Are you sure about that?

 The documentation says:
 If, after three problem notifications, a recovery notification is sent
 out for the service, who gets notified? The recovery is actually the
 fourth notification that gets sent out. However, the escalation code is
 smart enough to realize that only those people who were notified about
 the problem on the third notification should be notified about the
 recovery. In this case, the nt-admins and managers contact groups would
 be notified of the recovery.

 On Wed, 2011-05-25 at 13:56 -0400, Paul M. Dubuc wrote:
 This works as long as the problem doesn't last longer than 3 notification
 intervals.  Recovery notifications that are numbered higher than 4 won't be 
 sent.





Re: [Nagios-users] [Nagios-devel] Q: Service Escalation Recovery Notifications.

2011-05-26 Thread Paul M. Dubuc
Because they HAVE been informed of the problem by earlier notifications, but 
not the one notification prior to the recovery.  It leaves those contacts 
wondering if the problem was ever fixed.

Patrik Båt wrote:
 Why just send a recovery to someone who hasn't been informed of the
 problem? :P

 On Thu, 2011-05-26 at 09:43 -0400, Paul M. Dubuc wrote:
 I should have mentioned that whether this works depends on who the default,
 non-escalated, contacts are for the host or service.  In your case, since you
 have last_notification set to 3, those contacts in your escalation will not
 get a recovery notification that is numbered 5 or greater unless they also
 happen to be the default contact for the host or service, which will get the
 problem notification number 4 and all non-escalated notifications.  If you
 escalate a notification to a contact that is not assigned as a regular contact
 for the host or service, those contacts don't get the recovery notification
 (unless they also got the previous problem notification) even if you set up a
 separate escalation for the recovery notification that specifies all previous
 contacts.



Re: [Nagios-users] [Nagios-devel] Q: Service Escalation Recovery Notifications.

2011-05-25 Thread Paul M. Dubuc
This works as long as the problem doesn't last longer than 3 notification 
intervals.  Recovery notifications that are numbered higher than 4 won't be 
sent.

Patrik Båt wrote:
 # SMS
 define serviceescalation {
  host_name *
  service_description *
  first_notification 2
  last_notification 3
  notification_interval 0
  contacts oncall
   }

 define hostescalation {
  host_name *
  first_notification 2
  last_notification 3
  notification_interval 0
  contacts oncall
   }

 # MAIL

 define serviceescalation {
  host_name *
  service_description *
  first_notification 1
  last_notification 1
  notification_interval 10
  contacts sysadmin.reports
   }

 define hostescalation {
  host_name *
  first_notification 1
  last_notification 1
  notification_interval 10
  contacts sysadmin.reports
   }

 # Recovery

 define serviceescalation {
  host_name *
  service_description *
  first_notification 2
  last_notification 3
  notification_interval 0
  contacts sysadmin.reports
  escalation_options r
 }

 define hostescalation {
  host_name *
  first_notification 2
  last_notification 3
  notification_interval 0
  contacts sysadmin.reports
  escalation_options r
 }


 This is working for me, to notify both via sms and email. eg 2 contacts.



 On Fri, 2011-05-20 at 22:22 +0200, Andreas Ericsson wrote:
 On 05/20/2011 06:05 PM, Max Schubert wrote:
 Hi,

 On Thu, May 19, 2011 at 10:10 AM, Andreas Ericsson <a...@op5.se> wrote:
 On 05/19/2011 03:32 PM, Paul M. Dubuc wrote:
 OK, but wouldn't it be nice if all contacts who got an error notification
 were able to get the recovery message instead of just the one last notified?
 Is there any way to do that?  Setting up an explicit serviceescalation for
 recovery notifications doesn't seem to work.


 Max Schubert is working on a patch that does something similar to that.
 If he doesn't complete it, I might take a look at adding it myself.

 I will send out my partial patch to the list sometime today along with
 an explanation of my thinking / approach for it - feel free to use it
 or discard it as you see fit :)!


 Rest assured, I will ;)

 Our customers have raised voices about simplifying the notification
 logic though. This discussion actually spawned that voice-raising,
 which is nice. Either way, it might be that I end up either taking
 your patch or implementing the "everyone who gets problem notifications
 also gets recovery notifications" behaviour.








Re: [Nagios-users] Q: Service Escalation Recovery Notifications.

2011-05-19 Thread Paul M. Dubuc
OK, but wouldn't it be nice if all contacts who got an error notification were 
able to get the recovery message instead of just the one last notified?  Is 
there any way to do that?  Setting up an explicit serviceescalation for 
recovery notifications doesn't seem to work.

Yueh-Hung Liu wrote:
 by the examples from nagios documentation, only on-call-support will
 get the 6th and above notifications.


 On Thu, May 19, 2011 at 4:33 AM, Paul M. Dubuc <w...@paul.dubuc.org> wrote:
 Here is an example from the Nagios 3.2.3 documentation on service escalations.

 Recovery Notifications

 Recovery notifications are slightly different than problem notifications
 when it comes to escalations. Take the following example:

 define serviceescalation{
     host_name               webserver
     service_description     HTTP
     first_notification      3
     last_notification       5
     notification_interval   20
     contact_groups          nt-admins,managers
     }

 define serviceescalation{
     host_name               webserver
     service_description     HTTP
     first_notification      4
     last_notification       0
     notification_interval   30
     contact_groups          on-call-support
     }


 If, after three problem notifications, a recovery notification is sent out
 for the service, who gets notified? The recovery is actually the fourth
 notification that gets sent out. However, the escalation code is smart
 enough to realize that only those people who were notified about the
 problem on the third notification should be notified about the recovery. In
 this case, the nt-admins and managers contact groups would be notified of
 the recovery.

 My question is who gets the recovery notification after 6 problem
 notifications?  Only on-call-support (the last one notified), or all three
 contact groups (since all received notifications of the problem)?  If only
 on-call-support (which seems to be the case), how can I ensure that the others
 get it too?

 I tried adding a service escalation for the recovery notification, like so in
 keeping with the above example:

 define serviceescalation{
     host_name               webserver
     service_description     HTTP
     first_notification      2
     last_notification       0
     escalation_options      r
     contact_groups          on-call-support,nt-admins,managers
     }

 but that doesn't seem to work. I had thought this fixed the problem but the
 recovery notification only seems to go to the last contact(s) that were
 notified of the problem.






[Nagios-users] Q: Service Escalation Recovery Notifications.

2011-05-18 Thread Paul M. Dubuc
Here is an example from the Nagios 3.2.3 documentation on service escalations.

 Recovery Notifications

 Recovery notifications are slightly different than problem notifications
 when it comes to escalations. Take the following example:

 define serviceescalation{
     host_name               webserver
     service_description     HTTP
     first_notification      3
     last_notification       5
     notification_interval   20
     contact_groups          nt-admins,managers
     }

 define serviceescalation{
     host_name               webserver
     service_description     HTTP
     first_notification      4
     last_notification       0
     notification_interval   30
     contact_groups          on-call-support
     }


 If, after three problem notifications, a recovery notification is sent out
 for the service, who gets notified? The recovery is actually the fourth
 notification that gets sent out. However, the escalation code is smart
 enough to realize that only those people who were notified about the
 problem on the third notification should be notified about the recovery. In
 this case, the nt-admins and managers contact groups would be notified of
 the recovery.

My question is who gets the recovery notification after 6 problem 
notifications?  Only on-call-support (the last one notified), or all three 
contact groups (since all received notifications of the problem)?  If only 
on-call-support (which seems to be the case), how can I ensure that the others 
get it too?

I tried adding a service escalation for the recovery notification, like so in 
keeping with the above example:

define serviceescalation{
    host_name               webserver
    service_description     HTTP
    first_notification      2
    last_notification       0
    escalation_options      r
    contact_groups          on-call-support,nt-admins,managers
    }

but that doesn't seem to work. I had thought this fixed the problem but the 
recovery notification only seems to go to the last contact(s) that were 
notified of the problem.
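
For illustration, the rule in the documentation excerpt above can be written out as a small Python model. This is only a sketch of the behaviour as the documentation describes it and as reported in this message, not Nagios's actual escalation code. The function name and data layout are hypothetical, and the model ignores escalation_options and the default (non-escalated) contacts.

```python
def recovery_recipients(escalations, problem_notifications_sent):
    """Contact groups whose escalation range covers the last problem notification.

    escalations: list of (first_notification, last_notification, contact_groups);
    last_notification == 0 means "no upper limit".
    """
    last = problem_notifications_sent
    groups = []
    for first, final, contact_groups in escalations:
        # An escalation matches if the last problem notification falls in its range.
        if first <= last and (final == 0 or last <= final):
            for group in contact_groups:
                if group not in groups:
                    groups.append(group)
    return groups


# The two escalations from the documentation example (hypothetical tuple layout).
escalations = [
    (3, 5, ["nt-admins", "managers"]),
    (4, 0, ["on-call-support"]),
]

print(recovery_recipients(escalations, 3))  # recovery after 3 problem notifications
print(recovery_recipients(escalations, 6))  # recovery after 6 problem notifications
```

Under this model, a recovery after three problem notifications goes to nt-admins and managers, while a recovery after six goes only to on-call-support, which matches what is described above.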




Re: [Nagios-users] Segmentation fault Nagios 3.2.3

2011-05-17 Thread Paul M. Dubuc



 Recompiled without embedded-perl option, now it's working fine .
 Still I am not able to understand why it was happened.


 /\
 dE

These messages look suspicious:

 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
 open("/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/tls/x86_64/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
 stat("/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/tls/x86_64", 0x7fff00917240) = -1 ENOENT (No such file or directory)
 open("/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/tls/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
 stat("/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/tls", 0x7fff00917240) = -1 ENOENT (No such file or directory)
 open("/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/x86_64/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory)
 stat("/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/x86_64", 0x7fff00917240) = -1 ENOENT (No such file or directory)
 open("/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/libperl.so",

Looks like you're running Nagios on a system that doesn't have (64-bit) perl 
libs installed.




[Nagios-users] Flap Detection: Why do only HARD state changes count?

2011-05-16 Thread Paul M. Dubuc
This isn't explicitly stated in the documentation, but it seems that flap 
detection only counts HARD state changes.  So it's possible for a service 
check to toggle back and forth indefinitely between OK and not OK (unless 
max_check_attempts is set to 1) without flapping ever being detected.  I 
tested this with a service that does this and verified the behavior.  The Last 
State Change time gets updated with each SOFT state change, but the % state 
change for flap detection remains at 0% until I set max_check_attempts to 1 
and let it toggle between hard state changes.

Is this a bug or is it by design?  Is there a way to include SOFT state 
transitions in flap detection?

I'm using Nagios Core 3.2.3.
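
The observation can be illustrated with a small Python model. It is a simplified sketch, not Nagios's implementation: both function names are hypothetical, the percent state change here is unweighted (Nagios's own calculation also weights recent changes more heavily), and the hard-state tracking ignores details such as soft recoveries.

```python
def percent_state_change(history):
    """Unweighted share of adjacent entries in the history that differ."""
    if len(history) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return 100.0 * changes / (len(history) - 1)


def hard_state_history(results, max_check_attempts):
    """Hard state in effect after each check result (simplified model)."""
    hard = "OK"
    failures = 0
    history = []
    for result in results:
        if result == "OK":
            failures = 0
            hard = "OK"
        else:
            failures += 1
            # A problem only becomes HARD after max_check_attempts failures in a row.
            if failures >= max_check_attempts:
                hard = result
        history.append(hard)
    return history


# A service that alternates between OK and CRITICAL on every check (21 results).
results = ["OK", "CRITICAL"] * 10 + ["OK"]

print(percent_state_change(results))                         # every raw result counted
print(percent_state_change(hard_state_history(results, 3)))  # hard states, 3 attempts
print(percent_state_change(hard_state_history(results, 1)))  # hard states, 1 attempt
```

With max_check_attempts set to 3 the alternating service never reaches a hard problem state, so a history built from hard states alone shows no change at all, while the same results with max_check_attempts set to 1 change on every check.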



Re: [Nagios-users] notification_interval normal_check_interval

2011-04-18 Thread Paul M. Dubuc

Mike Chesnut wrote:
 I have a check that I only want to occur once a day, so I do this in the
 service definition:

   normal_check_interval   1440

 However, when it fails, I want it to retry every 10 minutes, so I do this:

   retry_check_interval    10

 My default notification_interval is set to 15.  When I run a pre-flight
 check, I get this:

 Warning: Service 'service' on host 'host' has a notification
 interval less than its check interval!  Notifications are only re-sent
 after checks are made, so the effective notification interval will be
 that of the check interval.

 Is that warning telling me that notifications are only sent when a
 normal check occurs?  What I want is for in the event of a failure,
 notifications to continue to be sent (every 15 minutes) until the
 service recovers.  Will that be the case?

 Thanks,
 Mike


What is the value of max_check_attempts?  It's at the end of that number of 
checks that the service enters a hard state and a notification is sent.  If 
the value is 1, then the warning makes perfect sense because no retry checks 
will be done.

Paul Dubuc



Re: [Nagios-users] notification_interval normal_check_interval

2011-04-18 Thread Paul M Dubuc
Mike Chesnut wrote:
 On 04/18/2011 12:08 PM, Paul M. Dubuc wrote:

 Mike Chesnut wrote:
 I have a check that I only want to occur once a day, so I do this in the
 service definition:

 normal_check_interval   1440

 However, when it fails, I want it to retry every 10 minutes, so I do this:

 retry_check_interval    10

 My default notification_interval is set to 15.  When I run a pre-flight
 check, I get this:

 Warning: Service 'service' on host 'host' has a notification
 interval less than its check interval!  Notifications are only re-sent
 after checks are made, so the effective notification interval will be
 that of the check interval.

 Is that warning telling me that notifications are only sent when a
 normal check occurs?  What I want is for in the event of a failure,
 notifications to continue to be sent (every 15 minutes) until the
 service recovers.  Will that be the case?

 Thanks,
 Mike


 What is the value of max_check_attempts?  It's at the end of that number of
 checks that the service enters a hard state and a notification is sent.  If
 the value is 1, then the warning makes perfect sense because no retry checks
 will be done.

 max_check_attempts is 2.  Is that a sensible number here?

 Thanks,
 Mike


OK, I think it will work this way:  You will get a notification if there 
is still a problem after the retry check, since that is when the service 
enters a hard state.  After that, the check interval reverts to the normal 
interval, so if the problem persists you will not get another 
notification until after the next normal interval check.  If the problem 
clears up, you will not get a recovery notification until then either, 
unless you rerun the check manually.  This doesn't sound like what you 
want.  I don't think you can do what you want without shortening the 
normal check interval.
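
To make the timing concrete, here is a rough Python sketch of the schedule described above. It assumes the service stays down the whole time, that checks run exactly on schedule, and that a notification can only go out when a check runs. The function name and parameters are hypothetical and all times are in minutes from the first failed check.

```python
def notification_times(normal_interval, retry_interval, max_check_attempts,
                       notification_interval, horizon):
    """Minutes after the first failed check at which problem notifications go out."""
    # Soft-state retries: the problem turns HARD on attempt max_check_attempts.
    t = retry_interval * (max_check_attempts - 1)
    times = [t]
    last_sent = t
    while True:
        t += normal_interval  # hard state: checks return to the normal interval
        if t > horizon:
            break
        # Re-notify only if the notification interval has elapsed since the last one.
        if t - last_sent >= notification_interval:
            times.append(t)
            last_sent = t
    return times


# Settings from this thread: daily check, 10-minute retry, 2 attempts, 15-minute notifications.
print(notification_times(1440, 10, 2, 15, horizon=3000))
# The same settings with the normal check interval shortened to 15 minutes.
print(notification_times(15, 10, 2, 15, horizon=60))
```

With the daily check, the first notification arrives 10 minutes after the first failure and the next one a full day later. Shortening the normal check interval to 15 minutes yields a notification every 15 minutes in this model.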

Paul Dubuc



Re: [Nagios-users] Escalating notifications

2011-04-01 Thread Paul M. Dubuc
Patrik Båt wrote:
 Hello mailinglist!

 im trying to get a notification like this:

 in first hardstate, email staff. (notification 1)

 at the other notification (notification 2) im sending a SMS to the
 oncall.

 But the problem is, that on recovery im only getting a SMS due to the
 sms escalation is in use.

 Anyone have any good way to get this to work?

 1. MAIL Problem
 2. SMS Problem

 On recovery:

 1. Mail Recovery
 2. SMS Recovery

 with 2 escalations, i get like this:

 1. Mail problem
 2. Mail problem, SMS problem

 recovery:

 1. SMS recovery.

 Config:

 # SMS

 define serviceescalation {
  host_name *
  service_description *
  first_notification 2
  last_notification 3
  notification_interval 0
  contacts oncall
   }

 define hostescalation {
  host_name *
  first_notification 2
  last_notification 3
  notification_interval 0
  contacts oncall
   }

 # MAIL

 define serviceescalation {
  host_name *
  service_description *
  first_notification 1
  last_notification 1
  notification_interval 10
  contacts sysadmin.reports
   }

 define hostescalation {
  host_name *
  first_notification 1
  last_notification 1
  notification_interval 10
  contacts sysadmin.reports
   }

 i have tried with different last_notifications and so on, but with no
 luck.

 Regards Patrik Båt.


Try using a separate escalation for the recovery events.  The recovery event 
is the last numbered event so it's hard to catch without a specific 
escalation.  Example:

define serviceescalation {
    host_name               *
    service_description     *
    first_notification      1
    last_notification       0
    notification_interval   0
    contacts                sysadmin.reports,oncall
    escalation_options      r
}

define hostescalation {
    host_name               *
    first_notification      1
    last_notification       0
    notification_interval   0
    contacts                sysadmin.reports,oncall
    escalation_options      r
}




Re: [Nagios-users] Escalating notifications

2011-04-01 Thread Paul M. Dubuc
Same place you put the others.  The thing that makes them only apply to 
recovery events is the

escalation_options r

directive.

Edwin Zoeller wrote:
 This is also what I am looking for. Where would you put the separate 
 escalation?

 -Original Message-
 From: Paul M. Dubuc [mailto:w...@paul.dubuc.org]
 Sent: Friday, April 01, 2011 8:49 AM
 To: Nagios Users List
 Subject: Re: [Nagios-users] Escalating notifications

 Patrik Båt wrote:
 Hello mailinglist!

 im trying to get a notification like this:

 in first hardstate, email staff. (notification 1)

 at the other notification (notification 2) im sending a SMS to the
 oncall.

 But the problem is, that on recovery im only getting a SMS due to the
 sms escalation is in use.

 Anyone have any good way to get this to work?

 1. MAIL Problem
 2. SMS Problem

 On recovery:

 1. Mail Recovery
 2. SMS Recovery

 with 2 escalations, i get like this:

 1. Mail problem
 2. Mail problem, SMS problem

 recovery:

 1. SMS recovery.

 Config:

 # SMS

 define serviceescalation {
   host_name *
   service_description *
   first_notification 2
   last_notification 3
   notification_interval 0
   contacts oncall
  }

 define hostescalation {
   host_name *
   first_notification 2
   last_notification 3
   notification_interval 0
   contacts oncall
  }

 # MAIL

 define serviceescalation {
   host_name *
   service_description *
   first_notification 1
   last_notification 1
   notification_interval 10
   contacts sysadmin.reports
  }

 define hostescalation {
   host_name *
   first_notification 1
   last_notification 1
   notification_interval 10
   contacts sysadmin.reports
  }

 i have tried with different last_notifications and so on, but with no
 luck.

 Regards Patrik Båt.


 Try using a separate escalation for the recovery events.  The recovery event 
 is the last numbered event so it's hard to catch without a specific 
 escalation.  Example:

 define serviceescalation {
   host_name *
   service_description *
   first_notification 1
   last_notification 0
   notification_interval 0
   contacts sysadmin.reports,oncall
   escalation_options r
 }

 define hostescalation {
   host_name *
   first_notification 1
   last_notification 0
   notification_interval 0
   contacts sysadmin.reports,oncall
   escalation_options r
 }






Re: [Nagios-users] Problem: Nagios service check retry interval shorted than configured.

2011-03-11 Thread Paul M. Dubuc
Jelle Smet wrote:
 Why is there only 10 seconds between these pairs of checks?  Sometimes I see
 a 20 or 30 second difference, sometimes 60 seconds.  Most of them are less than
 30 seconds.  It's very inconsistent.  Any idea what could be causing this?

 Hi Paul,

 I have been looking into this myself the last couple of days.

 Nagios does on-demand host checks; the reason for this is explained here:
 http://nagios.sourceforge.net/docs/3_0/hostchecks.html
 It basically means Nagios executes the host check when it thinks it needs to
 do so.

 You could alter the cached host check horizon
 (http://nagios.sourceforge.net/docs/3_0/cachedchecks.html) so Nagios does
 on-demand checks less frequently and uses older host results instead.

 What I'm personally wondering is whether on demand checks should count as
 retries?  Because this is the case at the moment and it makes the parameter
 'retry_interval' virtually useless.

 Hope this helps,

 Jelle Smet
 http://www.smetj.net

Thanks, I think I understand how this works.  But I'm having this problem with 
service checks, not host checks.  I do have the concurrent service check limit 
set to 30 and I wonder if that is affecting the scheduling of service check 
retries but, if so, I would think it would make the retry interval longer, not 
shorter than specified.  Does anyone know if service check retries are subject 
to the concurrency limit?

Paul Dubuc



Re: [Nagios-users] service notification logged but not done

2011-03-10 Thread Paul M. Dubuc
Could it be that your scripts are stored on an NFS-mounted filesystem or other 
networked storage? (What is $USER1$ defined to be?)  If so, maybe you're 
having intermittent problems with access.  Using local storage for the scripts 
should solve this problem.  You might find some evidence of the problem by 
turning on debugging in Nagios and looking at its debug output (see the 
debug_file, debug_level and debug_verbosity parameters in nagios.cfg).

Hope this helps,
Paul Dubuc

MAYER Hans wrote:
 Dear Chad

 > Have you recently upgraded Nagios?

 Yes, I am running Core 3.2.3 since Feb 24th

 > When did you start noticing that it was missing execution runs?

 I noticed the problem month ago. even with version 3.2.1 - therefore I
 made an upgrade to the latest version, to see, if this would fix the
 problem.

 > Do you have enough disk space free?

 As I said: 91 % free, only 9 % used

 > What are the permissions of the script set to?

 -rwxr-xr-x   1 nagios   nagios  1035 Feb 18 10:17 rshsendsms

 I said, it happens only sometimes. Wrong permissions would result in a
 never working situation.

 > Were they recently changed?

 No.

 > Have you done any type of software changes with any type of supporting
 packages (i.e. Perl) that could have brought up this issue?

 No, this server is running since Jun 2010 unchanged.

 What happens within Nagios between writing the log-file and executing
 the script ? Something permits to execute the script, but only sometimes.


 Kind regards

 Hans

 *From:* Chad Rhyner [mailto:crhy...@box.net]
 *Sent:* Wednesday, March 09, 2011 6:32 PM
 *To:* Nagios Users List
 *Cc:* MAYER Hans
 *Subject:* Re: [Nagios-users] service notification logged but not done

 Have you recently upgraded Nagios? When did you start noticing that it
 was missing execution runs? Do you have enough disk space free? What are
 the permissions of the script set to? Were they recently changed? Have
 you done any type of software changes with any type of supporting
 packages (i.e. Perl) that could have brought up this issue?

 Here are some thoughts on where I would start looking. Anything that you
 can dig up we can look at more closely to identify a potential cause for
 this issue.

 ~Chad

 On Wed, Mar 9, 2011 at 1:29 AM, MAYER Hans <ma...@iiasa.ac.at> wrote:

 Dear all

 Using Nagios since a lot of years, I was starting with one of the first
 versions of “netsaint”, and more than 25 years of experience with UNIX,
 I have now a strange problem I never had before.

 I am running Nagios Core 3.2.3 on Solaris 10 OS. Hardware is M3000 with
 SPARC V9 architecture.

 My problem is, I see sometimes – not always – a service notification in
 the log, but it is not really done.

 Here an example, the entry in the log

 [03-09-2011 09:13:25] SERVICE NOTIFICATION:
 sms_mayer;amazon;DISK/p14amazon;OK;notify-service-by-sms;DISK OK - free
 space: /p14amazon 4531 MB (6% inode=99%):

 Here is the definition for notify-service-by-sms

 # 'notify-service-by-sms' command definition

 define command{

 command_name notify-service-by-sms

 command_line $USER1$/rshsendsms $CONTACTPAGER$ \Info:
 $HOSTALIAS$/$SERVICEDESC$ $SERVICEOUTPUT$ \

 }

 As you see I execute a command named “rshsendsms”. And this are the
 first lines of the shell script:

 :

 # Wed Jan 19 10:12:15 MET 2011 - mayer initial

 # Wed Feb 16 10:11:54 MET 2011 - mayer logging the UID

 # usage:

 # rshsendsms 0043664xxx 'hello world - how are you '

 # info: both types of apostrophes are important

 export PATH LOG NUMBER TEXT ID UID NOTSENT RUNLOG

 PATH=/usr/bin:$PATH

 LOG=/var/adm/rshsendsms.log

 RUNLOG=/var/adm/rshsendsms_run.log

 date '+%y%m%d %H:%M' >> $RUNLOG

 The first action I do, I write a log-entry. (91% of the disk is free)
 But in this case I cannot find the entry. The last one is dated with
 110309 06:39, where I received a SMS really. I also switched on the
 process accounting weeks ago. But there is no entry to be found, that
 the shell script was executed.

 I also switched on the debug facility of “syslog”. I can find an
 equivalent entry like in the Nagios log. But there are no other
 messages, that something could be wrong.

 But on other hand I was informed at 06:39 and nothing was changed in the
 meantime. This is not the first time this problem happens. Most of the
 time notification works fine, but sometimes not. This is of course a
 pain as notification is one central functionality of Nagios.

 Any idea where I can start searching for the error ?

 Kind regards

 Hans



[Nagios-users] Problem: Nagios service check retry interval shorted than configured.

2011-03-09 Thread Paul M. Dubuc
I have Nagios Core 3.2.3 built on SuSE 11.1 and I've been noticing an apparent
problem with service check retries.  The normal check interval is set to 7.5
minutes and the retry interval is set to 1 minute.  I'm seeing entries like this
in the log:

[03-02-2011 16:44:39] SERVICE ALERT: aps11;Extra_01.20;OK;SOFT;2;SELRC OK

[03-02-2011 16:44:29] SERVICE ALERT: aps11;Extra_01.20;UNKNOWN;SOFT;1;SELRC UNKNOWN - Timeout (130 sec.) reached


[03-02-2011 13:28:19] SERVICE ALERT: aps14;Extra_04.15;OK;SOFT;2;SELRC OK

[03-02-2011 13:28:09] SERVICE ALERT: aps14;Extra_04.15;CRITICAL;SOFT;1;SELRC CRITICAL

Why is there only 10 seconds between these pairs of checks?  Sometimes I see a 
20 or 30 second difference sometimes 60 seconds.  Most of them are less than 
30 seconds.  It's very inconsistent.  Any idea what could be causing this?

Thanks,
Paul Dubuc



[Nagios-users] Q: are service check retries subject to concurrency limit?

2011-03-04 Thread Paul M. Dubuc
Nagios 3.2.3:  I'm wondering if Nagios subjects retries on a check failure to 
the limit set by the max_concurrent_checks parameter in nagios.cfg.  My sense 
is that max_concurrent_checks only applies to checks done during the normal 
check interval.  Does anyone know for sure if that is true?

Thanks,
Paul Dubuc



Re: [Nagios-users] Trying to develop a new perl plugin

2011-01-24 Thread Paul M. Dubuc
If you're going to be writing many of your own plugins, it might be worth the 
effort to use the Nagios::Plugin modules 
(http://search.cpan.org/~tonvoon/Nagios-Plugin-0.35/lib/Nagios/Plugin.pm). 
They're probably installed under the perl/lib subdirectory of your Nagios 
installation.  Among other things, they provide a convenient wrapper for 
Getopt::Long and Params::Validate so you can do position-independent options 
and argument validation for your Perl plugins.  Using it could save you quite 
a bit of time and code maintenance headaches in the long run.

Paul Dubuc

Nibin VM wrote:
 Thanks for your reply folks.. :)

 Finally I have concluded that the portion which reads the argument has
 issues.

 $host=$ARGV[0];

 It isn't taken correctly when its executed from nagios. Please somebody
 tell me what code should I put if I need to specify the host name like
 ./test.pl -H hostname?

 On Sun, Jan 23, 2011 at 10:44 PM, Boyer, Timothy A.
 timothy.bo...@opm.gov mailto:timothy.bo...@opm.gov wrote:

 Permissions problem?  You're running the command line as root; try
 running the command line as your Nagios username.
 
 From: Nibin VM [nibin...@piserve.com mailto:nibin...@piserve.com]
 Sent: Sunday, January 23, 2011 10:46 AM
 To: nagios-users@lists.sourceforge.net
 mailto:nagios-users@lists.sourceforge.net
 Subject: [Nagios-users] Trying to develop a new perl plugin

 Hello guys,

 I am trying to write some nagios perl plugin to monitor some
 services  I'm responsible for. Initially I tried to write  custom
 plugin to monitor mail queue using the following script.

 ===
 #!/usr/bin/perl -w

 use strict;
 use Net::SNMP;
 use Getopt::Long;

 use lib "/usr/lib64/nagios/libexec";
 my %ERRORS=('OK'=>0,'WARNING'=>1,'CRITICAL'=>2,'UNKNOWN'=>3);
 my $host = undef;
 my $result = undef;
 my @array = undef;

 $host=$ARGV[0];
 $result=`/usr/lib64/nagios/plugins/check_snmp -H $host -C community -o extOutput.1`;

 @array = split(/\ /, $result);
 chomp($array[3]);

 if ( $array[3] le 1 )
 {
   print "OK: current emails queue is $array[3]\n";
   exit $ERRORS{OK};
 }

 elsif ( $array[3] ge 2 && $array[3] le 2 )
 {
   print "Warning: current emails queue is $array[3]\n";
   exit $ERRORS{WARNING};
 }

 elsif ( $array[3] ge 3 )
 {
   print "Critical: current emails queue is $array[3]\n";
   exit $ERRORS{CRITICAL};
 }

 else
 {
   print "Unknown";
   exit $ERRORS{UNKNOWN};
 }
 

 As you can see, I use snmp to pull mail queue from the remote
 server. When I try the command from command line it work fine.

 ]# ./test.pl server name
 OK: current emails queue is 264

 But from the nagios from end it shows as critical and it shows
 Critical: current emails queue is Unknown error :(

 Please somebody help me to sort this out. Obviously its the first
 perl script that  I ever wrote and I really interested to write more
 plugins in perl(I am in love with perl now :) ).

 Thanks in advance!

 --
 Regards,
 Nibin.




 




 --
 Regards,
 Nibin.








Re: [Nagios-users] Nagios kept from restarting after reboot by lockfile

2010-12-21 Thread Paul M. Dubuc
eric.b...@barclayscapital.com wrote:


 It's weird... when I run nagios and kill it with -9, it leaves the pid
 file intact, but when I restart it, it zeroes out the pid file and starts
 just fine. When I just kill it with the default kill signal, it removes the
 pid file.

This isn't weird.  That's how it should work.  kill -9 sends an uncatchable 
kill signal (SIGKILL) to the process, giving it no chance to clean up 
before exiting.  The default kill signal is SIGTERM, which can be caught and 
handled (or ignored) by the process.  Restarting Nagios from the web 
interface doesn't terminate and restart the process (the PID doesn't change); 
it only re-initializes it.



Re: [Nagios-users] different contact groups depending on time of day

2010-12-16 Thread Paul M. Dubuc
Mario Garcia Ortiz wrote:
 Hello list,
 is it possible to send notification to a certain contact group depending
 on the time,

 what i mean, send notification (sms) to certain people between working
 hours and to other people outside working hours and weekends.

 thank you


Yes.  Define contact objects with different host_notification_period and 
service_notification_period specifications.

http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#contact
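
A rough sketch of that approach follows. Every name in it (the timeperiods, contacts, contact group, pager numbers and the two SMS notification commands) is a placeholder invented for illustration, and the hours are only an example; adjust them to the real working hours.

```
# Sketch only: all names, numbers and notification commands are placeholders.
define timeperiod{
    timeperiod_name     workhours
    alias               Working hours
    monday              09:00-17:00
    tuesday             09:00-17:00
    wednesday           09:00-17:00
    thursday            09:00-17:00
    friday              09:00-17:00
}

define timeperiod{
    timeperiod_name     offhours
    alias               Nights and weekends
    sunday              00:00-24:00
    monday              00:00-09:00,17:00-24:00
    tuesday             00:00-09:00,17:00-24:00
    wednesday           00:00-09:00,17:00-24:00
    thursday            00:00-09:00,17:00-24:00
    friday              00:00-09:00,17:00-24:00
    saturday            00:00-24:00
}

define contact{
    contact_name                    day-staff
    alias                           Daytime staff
    host_notifications_enabled      1
    service_notifications_enabled   1
    host_notification_period        workhours
    service_notification_period     workhours
    host_notification_options       d,u,r
    service_notification_options    w,u,c,r
    host_notification_commands      notify-host-by-sms      ; hypothetical command
    service_notification_commands   notify-service-by-sms   ; hypothetical command
    pager                           0000000000              ; placeholder number
}

define contact{
    contact_name                    night-staff
    alias                           Out-of-hours staff
    host_notifications_enabled      1
    service_notifications_enabled   1
    host_notification_period        offhours
    service_notification_period     offhours
    host_notification_options       d,u,r
    service_notification_options    w,u,c,r
    host_notification_commands      notify-host-by-sms      ; hypothetical command
    service_notification_commands   notify-service-by-sms   ; hypothetical command
    pager                           0000000001              ; placeholder number
}

define contactgroup{
    contactgroup_name   sms-admins
    alias               SMS recipients
    members             day-staff,night-staff
}
```

Hosts and services would then reference the single contact group; each contact's own notification periods decide who actually receives the message at a given time.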



Re: [Nagios-users] JVM Monitoring

2010-12-16 Thread Paul M. Dubuc
Marc-André Doll wrote:
 Hi list,

 I have to monitor some JVM and I don't find plugins that fit exactly
 with what I want/imagine.

 I could use the check_jmx but I don't really want to install a JRE on my
 Nagios server.

 Currently, I'm monitoring Tomcat servers with check_jmx4perl and I'm
 quite happy with it. Is it possible to configure/tweak the JVM or the
 J4P war to use it on a non-JEE server? Or am I doomed to install java on
 my monitoring server?


 Thanks for your help.

I was just looking at the web page for check_jmx4perl at

http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/check_jmx4perl/details

It says: "No Java installation required on the Nagios host."

Is this not true?

Paul Dubuc



Re: [Nagios-users] check scheduling when checks are inhibited.

2010-11-30 Thread Paul M. Dubuc
Andreas,

Thanks for your reply to my earlier message.  I've done some testing and some 
more thinking on this since then:

On 11/23/2010 03:50 AM, Andreas Ericsson wrote:
 On 11/22/2010 10:41 PM, Paul M. Dubuc wrote:
 We're using Nagios 3.2.3 for simulation of monitoring load in a load test
 environment as well as for monitoring production services.  I've noticed some
 interesting behavior in the way Nagios schedules checks when checks are
 inhibited either through the CGI Process Commands or by setting a check_period
 timeperiod that inhibits checks during regularly scheduled down times.

 Normally Nagios seems to spread out host and service checks evenly over time
 but when checks are stopped with the Process Command, Nagios seems to
 reschedule checks so that they are bunched up much closer together.  This
 creates alternating periods of densely scheduled and more sparsely scheduled
 checks that seem to persist when checks are turned on again.  It has a
 noticeable effect in our load testing.  The only way--or the quickest way--to
 get Nagios to smooth out the schedule again is to stop the process completely
 until all the scheduled check times have passed.

 In testing Nagios monitoring of our production services, if I use the
 check_period to inhibit checks during our down times, I notice that as the
 downtime approaches, ALL checks are rescheduled for the exact time that the
 downtime ends (according to the check_period).  This creates a big spike in
 monitoring activity after the downtime.  One way to avoid this, I think, is to
 let checks run during the down times but inhibit notifications instead by
 using the timeperiod to define a notification_period.  But I wonder if this
 bunching up of the schedule when using check_periods is ever a desirable
 behavior.


 I have some plans to make Nagios spread the checks with a randomized
 interleave factor so that a check scheduled to run once every 5 minutes can
 be run anywhere between 4m 30s and 5m 0s after it last ran. The 30 second
 random-spread would be the default and it would otherwise be configurable.

 Another thing worth looking into is to make services to the same host not run
 simultaneously, in case the checked server is expected to be loaded heavily
 it may not play nicely with 30-40 checks fired at it at once.

Here's another suggestion:  An option that would tell Nagios to stagger the 
scheduling of service checks when the check_period resumes.  Instead of 
scheduling all the checks for the exact time that the next check_period 
begins, add an amount of time equal to the time past the check_period ending 
that the service would have run if the check_period hadn't disabled checks.

For example, say I have a check_period that runs from 9:00 to 17:00 every day.  A 
service checked every 5 minutes that last ran at 16:57:14 would normally run again 
at 17:02:14 if the check_period did not end at 17:00.  Under this option the check 
would be scheduled to run at 9:02:14 the next day instead of 9:00:00.  This should 
keep all checks staggered by the same amount of time in the schedule once the 
check_period resumes.

I think this would be an ideal solution to the problem.  Using the 
auto_rescheduling options (discussed below) seems to help a little bit but not 
as much as I'd hoped.
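
For reference, the auto-rescheduling options mentioned here live in nagios.cfg and look roughly like the sketch below. The values shown are believed to be the ones shipped in the sample configuration, not the settings used in the tests described in this message.

```
# nagios.cfg (sketch; values assumed to match the sample config)

# Set to 1 to let Nagios try to even out the check schedule.
auto_reschedule_checks=0

# How often, in seconds, Nagios attempts to reschedule checks.
auto_rescheduling_interval=30

# How far ahead, in seconds, Nagios looks when rescheduling checks.
auto_rescheduling_window=180
```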


 You really should be using scheduled downtime for regular downtime though.
 There are pre-hacked solutions to automagically reschedule re-occurring
 downtime. Ninja supports it out of the box as of the latest version (or
 possibly latest git).

There are some cases where we really should not be running the checks during 
down times because of the extra load they put on our system when they fail. 
(Checks are still run during down times, if I'm not mistaken, only 
notifications are inhibited.)  Many of our checks fail in this case by timing 
out and they use relatively scarce (shared) and resource intensive processes 
(web browser sessions run under SeleniumRC).  Timeouts tend to be long for 
these checks so there is more contention for these processes when all the 
checks using them start failing, and they're run more often until they all go 
into a 'hard' failure state, etc.  Maybe we can live with this, but it would 
be easier on the system to just inhibit checks we know are going to fail 
during certain regularly scheduled down times.


 These aren't critical issues for us since we can work around them
 procedurally.

 That's good to hear.

   But I wonder if there is a way to prevent the scheduled checks
 from getting bunched together like this if/when you need to inhibit checks
 for a time while keeping Nagios running. Maybe the auto_rescheduling options
 in the nagios.cfg are meant to address this, but they have a potentially
 negative effect on performance according to the comments around them in the
 file.


 The below text is what I'd call educated speculation after having thrown
 a quick glance at the code. I might be completely wrong, but I don't think
 so

[Nagios-users] check scheduling when checks are inhibited.

2010-11-22 Thread Paul M. Dubuc
We're using Nagios 3.2.3 for simulation of monitoring load in a load test 
environment as well as for monitoring production services.  I've noticed some 
interesting behavior in the way Nagios schedules checks when checks are 
inhibited either through the CGI Process Commands or by setting a check_period 
timeperiod that inhibits checks during regularly scheduled down times.

Normally Nagios seems to spread out host and service checks evenly over time 
but when checks are stopped with the Process Command, Nagios seems to 
reschedule checks so that they are bunched up much closer together.  This 
creates alternating periods of densely scheduled and more sparsely scheduled 
checks that seem to persist when checks are turned on again.  It has a 
noticeable effect in our load testing.  The only way--or the quickest way--to 
get Nagios to smooth out the schedule again is to stop the process completely 
until all the scheduled check times have passed.

In testing Nagios monitoring of our production services, if I use the 
check_period to inhibit checks during our down times, I notice that as the 
downtime approaches, ALL checks are rescheduled for the exact time that the 
downtime ends (according to the check_period).  This creates a big spike in 
monitoring activity after the downtime.  One way to avoid this, I think, is to 
let checks run during the down times but inhibit notifications instead by 
using the timeperiod to define a notification_period.  But I wonder if this 
bunching up of the schedule when using check_periods is ever a desirable 
behavior.

These aren't critical issues for us since we can work around them 
procedurally.  But I wonder if there is a way to prevent the scheduled checks 
from getting bunched together like this if/when you need to inhibit checks for 
a time while keeping Nagios running. Maybe the auto_rescheduling options in 
the nagios.cfg are meant to address this, but they have a potentially negative 
effect on performance according to the comments around them in the file.
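
The two approaches compared above can be sketched as follows. The timeperiod, host, service and command names are placeholders invented for illustration, the hours are arbitrary, and the generic-service template and 24x7 timeperiod are assumed to exist as in the sample configuration.

```
# Sketch only: all object names and hours are placeholders.
define timeperiod{
    timeperiod_name     uptime-hours
    alias               Outside the regular down time
    sunday              06:00-22:00
    monday              06:00-22:00
    tuesday             06:00-22:00
    wednesday           06:00-22:00
    thursday            06:00-22:00
    friday              06:00-22:00
    saturday            06:00-22:00
}

# Option A: inhibit the checks themselves during the down time.
# This is the setup where all checks were seen to pile up at the
# moment the check_period resumes.
define service{
    use                     generic-service   ; assumed template
    host_name               example-host      ; placeholder
    service_description     Example check A
    check_command           check_example     ; hypothetical command
    check_period            uptime-hours
    notification_period     uptime-hours
}

# Option B: keep checking around the clock and only hold back
# notifications during the down time.
define service{
    use                     generic-service   ; assumed template
    host_name               example-host      ; placeholder
    service_description     Example check B
    check_command           check_example     ; hypothetical command
    check_period            24x7              ; assumed sample-config timeperiod
    notification_period     uptime-hours
}
```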



Re: [Nagios-users] Macros in notes?

2010-11-16 Thread Paul M. Dubuc
Mark A. Lappin wrote:

 What I would like to do, for my network printers, switches, routers, and
 some other devices, is add more information to the extended info page. I have
 been playing around with notes and to get decently readable output, I end up
 with a bunch of ugly looking HTML which I have been duplicating on every host
 definition. Trying to include printer make, model, print queue, location,
 primary users, toner part number etc; routers nearest service center, circuit
 identifier, etc. Works great, hard to maintain.

 So I was/have been trying (unsuccessfully) to use macros in my host
 definition and on the template put in the more complex HTML that would fill in
 from the macros

 The below configs show what I was attempting. I do not get any configuration
 warnings, but I don't get the value that I have set in the host either; I get
 the literal output: $_HOSTprnMake$. So I'm thinking (1) Nagios doesn't support
 what I'm trying to do and I can't use macros in notes or (2) I have a syntax
 error that I'm not seeing. I'm hoping somebody here can give me some insight
 into which case it might be - especially for #1 before I really start beating
 my head against the wall.

It's #1.  Nagios only supports macro expansion for command objects (maybe 
others; I don't know).  Macros will work in the arguments (if any) that you 
pass to the check_command because they're expanded for the command object.

Being able to do what you are trying to do here would be nice.  I would like 
to use macros for constructing host and service names.
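
As a minimal sketch of where the custom variables from the host definition below can be expanded, here is a notification command that uses them. The command name and the mail invocation are placeholders, and the sketch relies on the documented behavior that custom variable names are upper-cased when the macro name is built, so _prnMake is referenced as $_HOSTPRNMAKE$.

```
# Sketch only: the command name and mail invocation are placeholders.
define command{
    command_name    notify-printer-host-by-email
    command_line    /usr/bin/printf "%b" "Host: $HOSTNAME$\nState: $HOSTSTATE$\nMake: $_HOSTPRNMAKE$\nModel: $_HOSTPRNMODEL$\n" | /bin/mail -s "Printer $HOSTNAME$ is $HOSTSTATE$" $CONTACTEMAIL$
}
```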


 define host{
  use generic-printer
  host_name   11314-AR
  alias   11314-AR-4200N
  address 192.168.98.31
  action_url  http://192.168.98.31
  hostgroups  network-printers
  _prnMake    HP
  _prnModel   Laserjet 2300n
  _prnMainQueue   lmfj-print\\11314-AR
 }


 define host{
  name                    generic-printer  ; The name of this host template
  use                     generic-host     ; Inherit default values from the generic-host template
  check_period            24x7             ; By default, printers are monitored round the clock
  check_interval          5                ; Actively check the printer every 5 minutes
  retry_interval          1                ; Schedule host check retries at 1 minute intervals
  max_check_attempts      10               ; Check each printer 10 times (max)
  check_command           check-host-alive ; Default command to check if printers are alive
  notification_period     workhours        ; Printers are only used during the workday
  notification_interval   30               ; Resend notifications every 30 minutes
  notification_options    d,r              ; Only send notifications for specific host states
  contact_groups          admins           ; Notifications get sent to the admins by default
  register                0                ; DONT REGISTER THIS - ITS JUST A TEMPLATE
  notes                   <table border=1 width=100% cellpadding=3 cellspacing=0 bgcolor=#FF style=border-collapse: collapse bordercolor=#00>\
    <tr bgcolor=lightblue><td align=center>Make</td></tr>\
    <tr><td align=center>$_HOSTprnMake$</td></tr>\
   </table>
  }


 Any advice/input is very much appreciated.

 --Mark



 Mark A. Lappin, CCNA, MCITP: Enterprise Administrator | Lee Michaels Fine 
 Jewelry
 Director of Information Technology
 11314 Cloverland Ave  | Baton Rouge, LA 70809
 Ph: 225.291.9094 ext 245 | Fax: 225.368.3675  | Mobile:  225-362-2770
 www.lmfj.com




Re: [Nagios-users] No permission to web-interface

2010-11-15 Thread Paul M. Dubuc
Astakhov Peter wrote:
 Hello, colleagues!
 I installed nagios on RHEL6.
 But I get error on web-interface:

 It appears as though you do not have permission to view information for
 any of the hosts you requested...
 If you believe this is an error, check the HTTP server authentication
 requirements for accessing this CGI
 and check the authorization options in your CGI configuration file.

 I checked /etc/httpd/conf.d/nagios.conf

 ScriptAlias /nagios/cgi-bin/ /usr/lib/nagios/cgi-bin/
 ...

Which display are you trying to use when you get this error?  I have one 
instance of Nagios configured with no host groups and this error comes up if 
I try to view host groups.  It's a little confusing because it's not really a 
permission issue: I have permission to access all the hosts.  It's just 
that there is nothing to display for that particular query.  There is no 
default "all hosts" hostgroup.
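
If a catch-all group is wanted, one can be defined by hand. This is only a sketch: the group name is a placeholder, and it assumes the "*" wildcard is accepted in the members directive of a hostgroup, which is worth verifying with a configuration check (nagios -v) before relying on it.

```
# Sketch only: group name is a placeholder; "*" wildcard support is assumed.
define hostgroup{
    hostgroup_name  all-hosts
    alias           All Hosts
    members         *
}
```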

Paul Dubuc



Re: [Nagios-users] Nagios Historical Data Question

2010-11-15 Thread Paul M. Dubuc
Marc Powell wrote:

 On Nov 15, 2010, at 8:05 AM, Korrawit Yindeeyoungyeon wrote:

 Where can I find the standard database schema of Nagios ? or I need to
 find in source code of 3rd party front-end software?

 You'll need to look at the third party software to determine how they get
 data into a database.

 Nagios doesn't use a database so has no standard database schema. Each addon
 either has its own specific schema or utilizes one of the other common event
 broker - database addons (such as ndoutils).

 --
 Marc

Maybe he's looking for this:

http://nagios.sourceforge.net/docs/ndoutils/NDOUtils_DB_Model.pdf

the DB schema used by NDOUtils.



[Nagios-users] Suppress Max concurrent service checks messages.

2010-11-12 Thread Paul M. Dubuc
We're running Nagios 3.2.3 with concurrent service checks set to 40.  We can't 
go much higher than this due to resource constraints outside of Nagios but 
we're running 329 services at 5 minute intervals (this is a load test of 
sorts not production load ... yet).  Average execution time/latency is 36/11 
seconds so we're seeing quite a few messages like this in the Nagios log file:

(Informational Message) [11-11-2010 14:55:57] Max concurrent service checks 
(40) has been reached. Nudging host:service by 9 seconds...

Is there any way to suppress these messages from being logged?  I don't see an 
option for logging these in the config file documentation.

Thanks,
Paul Dubuc



Re: [Nagios-users] Suppress Max concurrent service checks messages.

2010-11-12 Thread Paul M. Dubuc
Ton Voon wrote:

 On 12 Nov 2010, at 15:30, Paul M. Dubuc wrote:

 We're running Nagios 3.2.3 with concurrent service checks set to 40.  We
 can't go much higher than this due to resource constraints outside of Nagios
 but we're running 329 services at 5 minute intervals (this is a load test of
 sorts not production load ... yet).  Average execution time/latency is 36/11
 seconds so we're seeing quite a few messages like this in the Nagios log
 file:

 (Informational Message) [11-11-2010 14:55:57] Max concurrent service checks
 (40) has been reached. Nudging host:service by 9 seconds...

 Is there any way to suppress these messages from being logged?  I don't see
 an option for logging these in the config file documentation.

 I put those messages in.

 Firstly, 40 doesn't necessarily mean there are 40 concurrent service
 checks running as they may have finished but not been reaped yet (to
 decrement the counter).

 Secondly, if you are getting these messages, then either (1) this
 limit is too low - increase and keep an eye of the load on your nagios
 server; (2) you've got too many checks running - reduce frequencies/
 numbers or setup a slave server.

 The trouble with the way the nudging works is that it hides the fact
 that you have latency issues (as the check is rescheduled to a future
 time). This means nagiostats will not include the additional latency
 time here.

 If someone has a better way of working this out, I'm all ears.

 Ton

Thanks, Ton.  This is helpful information and advice.  The services we're 
running require web browsers to run which are a cpu and memory intensive 
resource that, temporarily, we need to manage on the Nagios server.  In 
production we shouldn't have these limitations, but for now I just wanted to 
keep all these messages from flooding the log.

Andreas, I know it's doing things wrong, but there's not much I can do about 
it right now.  Since I already know what problem these messages are trying 
to tell me about, I'd just like to keep them from flooding the logs so I can 
see what else is happening more easily.  That's all.
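
For reference, the nagios.cfg settings that bear on this discussion look roughly like the sketch below. The limit of 40 is the value from this thread; the two reaper values are believed to be the sample-config defaults and are shown only because checks that have finished but not yet been reaped still count against the limit.

```
# nagios.cfg (sketch)

# Upper limit on service checks running in parallel (0 means no limit).
max_concurrent_checks=40

# How often, in seconds, results of finished checks are reaped.
check_result_reaper_frequency=10

# Maximum time, in seconds, that a single reaper run may take.
max_check_result_reaper_time=30
```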

Thanks,
Paul Dubuc



Re: [Nagios-users] Suppress Max concurrent service checks messages.

2010-11-12 Thread Paul M. Dubuc
Ton Voon wrote:
...

 The trouble with the way the nudging works is that it hides the fact
 that you have latency issues (as the check is rescheduled to a future
 time). This means nagiostats will not include the additional latency
 time here.

 If someone has a better way of working this out, I'm all ears.

Would it cause other problems if the total nudging time for a service were 
included in its latency time?



[Nagios-users] Time frame for Monitoring Performance?

2010-11-10 Thread Paul M. Dubuc
I'm using Nagios 3.2.3.  I'm wondering what time frame is used for the 
measurements shown in the Monitoring Performance box on the Tactical Overview 
display.  In particular, are the execution times (min. max. avg.) measured 
over the last hour, 10 minutes, or what?  I can't find any information on this 
in the documents.

Thanks,
Paul Dubuc



Re: [Nagios-users] any macro for viewing host parent?

2010-11-09 Thread Paul M. Dubuc
John Alberts wrote:
 I would like to have our notification emails for service alerts, include
 the host parent.  Is there any existing macro I can use to include
 this?  I couldn't find anything when googling.  If not, any suggestions
 how I might get it in an email?


The way we do this is to use a user-defined macro in the host definition like 
so:


define host{
 use aps-launcher
 host_name   APS-P52

 parents aps52
 __PARENT_HOST   aps52
}

Then you can expand it, $_PARENT_HOST$, in the notification. 
Unfortunately this means you need to define the parent in 2 places.  Would be 
nice if there was a built-in macro for this, but I don't think there is.



Re: [Nagios-users] any macro for viewing host parent?

2010-11-09 Thread Paul M. Dubuc
Paul M. Dubuc wrote:
 John Alberts wrote:
 I would like to have our notification emails for service alerts, include
 the host parent.  Is there any existing macro I can use to include
 this?  I couldn't find anything when googling.  If not, any suggestions
 how I might get it in an email?


 The way we do this is to use a user-defined macro in the host definition like 
 so:


 define host{
   use aps-launcher
   host_name   APS-P52

   parents aps52
   __PARENT_HOST   aps52
 }

 Then you can expand it, $_PARENT_HOST$, in the notification.

I mean that would be $_HOST_PARENT_HOST$

 Unfortunately this means you need to define the parent in 2 places.  Would be
 nice if there was a built-in macro for this, but I don't think there is.



Re: [Nagios-users] any macro for viewing host parent?

2010-11-09 Thread Paul M. Dubuc
diego.roc...@gmail.com wrote:
 Isn't it $_HOSTPARENT_HOST$ ?

Not if you put TWO underscores in front of the macro name.  Then you get 
$_HOST_PARENT_HOST$ which I think is much more readable (a nice suggestion I 
found in Barth's book.)
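
To make the naming rule concrete, here is a small Python sketch of how a custom variable name maps to its macro name. It is not Nagios code, and the function name custom_macro_name is made up for illustration. It assumes the rule as I understand it from the Nagios 3 documentation on custom object variables: the first underscore is dropped, the name is upper-cased, and the object type is inserted after "$_".

```python
def custom_macro_name(object_type: str, variable: str) -> str:
    """Return the macro name derived from a custom object variable.

    object_type is "host", "service" or "contact"; variable is the name as
    written in the object definition, including its leading underscore(s).
    """
    if not variable.startswith("_"):
        raise ValueError("custom variables must start with an underscore")
    # Only the first underscore is stripped; a second one survives and
    # acts as a visual separator in the macro name.
    return "$_" + object_type.upper() + variable[1:].upper() + "$"


print(custom_macro_name("host", "_PARENT_HOST"))   # $_HOSTPARENT_HOST$
print(custom_macro_name("host", "__PARENT_HOST"))  # $_HOST_PARENT_HOST$
```

So both spellings are valid; which macro you get depends on how many underscores the variable has in the host definition.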

 btw, in order to avoid the double declaration (and human errors) you
 could add in generic-host (or whatever template you define)

 define generic-host {
 ...
 parents $_HOSTPARENT_HOST$
 }

 and in the real host definition you will define only the custom macro.
 Haven't tried it, but it should work

I don't think this will work because the macro isn't expanded in that context. 
  I think macros only expand in the command object or (effectively) in arguments 
in the check_command definition (because they're expanded when passed to the 
command).

Even if this did work, it would only work if all your hosts had the same parent. 
All my hosts have different parents.







Re: [Nagios-users] Best Practice: Forgotten Acknowledgements

2010-11-01 Thread Paul M. Dubuc
Andre Timmermann wrote:
 Am Montag, den 01.11.2010, 12:50 -0400 schrieb Chris Beattie:
 Acknowledgements add comments to hosts and services, so you could just
 set yourself a reminder to occasionally check the comments link in the
 side bar and look for anything that's getting stale.

 Yes, but this would enforce a human not to forget things. I tend to
 believe something automatic is more reliable than a human ;)

You could write an event handler that fixes whatever the problem is. 
Otherwise you are relying on a human at every level not to forget the 
acknowledgments and reminders. ;-)



Re: [Nagios-users] host_port objects - Enhancement Request

2010-10-28 Thread Paul M. Dubuc
I think this would be a very nice enhancement.  Many of the services we run 
are associated with a host and a port.  We're using the service-based ports 
solution that you describe.  Since Nagios requires that the combination 
host_name and service_description be unique, we often have to embed a port 
name in the service_description.  Since the port is also passed as an argument 
to the check_command, it ends up being defined in two places and the 
service_description has to be changed when we change the port being used for 
the service.  Having to configure a separate service for each port on a given 
host also complicates configuration changes.




Re: [Nagios-users] Services on statusmap

2010-10-20 Thread Paul M. Dubuc
Laszlo Csepanyi-Furjes wrote:
 Hi,

 I'm using Nagios core 3.2.3. I have a couple of hosts in my system and there 
 are web services installed on every host that I would like to monitor. I 
 implemented my own plug-in for that purpose. So far the configuration is going 
 well. But now I'm bumping my head into the wall. In the statusmap I can see 
 only the defined hosts. How can I get the services visible there? Is it 
 possible at all with the core version or do I have to install something extra?

 At least I found this picture: http://a9k.info/images/nagios.png
 It contains Chat, Staff, etc those should be services, right?

 Please help!


I think the status map is only for hosts.  The Up and Down status you see 
on the diagram applies to hosts, not services.  (So Chat and Staff must be 
hosts.)  Service status is OK, Warning, Critical, etc.  You can select a 
host (click) and double-click on it to display its service status details.



Re: [Nagios-users] question about macros

2010-10-19 Thread Paul M. Dubuc

Joel Brooks wrote:
 hey gang,

 can macros be used in configuration objects?

 i.e. can i use $HOSTNAME$ in the display_name directive on a host object?


That would be nice, but I don't think you can.  You can use them in the 
command arguments in the check_command directive though.



Re: [Nagios-users] escalation question

2010-10-11 Thread Paul M. Dubuc
Terry wrote:
 On Mon, Oct 11, 2010 at 3:48 AM,
 michal.lacko...@cz.schneider-electric.com  wrote:
 Hi All,

 Is there any way how to create service escalation in the following way:

 hostgroup_name        Group1,Group2
 service_description   *
 contact_group         Managers

 Basically I would need to escalate all service problems on the hosts which
 are members of Group1 and Group2 to the managers.

 thanks in advance
 Michal

 --

 Yes, you're exactly right.  We took it a step further and put all
 hosts in a single group then globbed it as you did above:

 define serviceescalation{
  hostgroup_name          allhosts
  service_description     .*
  contacts                foo,foo2
  first_notification      1
  last_notification       1
  notification_interval   1
  escalation_options      w,u,c
  }
 define hostgroup {
  hostgroup_name          allhosts
  alias                   allhosts
  members                 .*
  }
 use_regexp_matching=1

 I think that's all you need to enable globbing.

Thanks for this example.

I'm trying to do something similar with an allhosts hostgroup definition and 
it doesn't seem to work unless all hosts in the allhosts group also have 
services defined for them.  In this case I get an error like

Error: Could not find a service matching host name 'AXSP51' and description 
'.*' (config file 
'/vol/omni/nagios-3.2.1/config/test/objects/contacts/Contacts.cfg', starting 
on line 74)
Error: Could not expand services specified in service escalation (config file 
'/vol/omni/nagios-3.2.1/config/test/objects/contacts/Contacts.cfg', starting 
on line 74)

AXSP51 has no services defined for it, but I monitor it as a parent for hosts 
that do.  Do I need to maintain a host group to use instead of allhosts just 
for the hosts that have services defined for them, or is there a more 
convenient (i.e., less error prone) way around this?

Thanks,
Paul Dubuc




[Nagios-users] Do plugins terminate gracefully on Nagios restart or shutdown?

2010-08-26 Thread Paul M. Dubuc
I'm wondering if there is any termination signal sent to a plugin that happens 
to be executing at the time Nagios is restarted or shut down?  Do plugins need 
a signal handler for this case if they have cleanup that needs doing?  Do 
plugins using the embedded Perl also get a signal?  Is the signal different 
for restart vs. shutdown?

Or perhaps Nagios waits for plugins that are executing to finish while not 
starting any before doing the restart or shutdown.

I haven't found the answer to this in the development guidelines or other 
documentation.

Can anyone tell me how this is handled?

Thanks,
Paul Dubuc
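
For reference, this is the kind of handler I have in mind, as a minimal Python sketch. Whether Nagios actually delivers SIGTERM (or any signal) to a running plugin is exactly the open question above, so the choice of SIGTERM is an assumption. The names make_handler, install_cleanup_handler and remove_temp_files are made up for illustration.

```python
import signal
import sys

UNKNOWN = 3  # plugin exit code conventionally used for the UNKNOWN state


def make_handler(cleanup):
    """Build a signal handler that runs cleanup() and exits with UNKNOWN."""

    def handler(signum, frame):
        cleanup()
        print("UNKNOWN - check interrupted by signal %d" % signum)
        sys.exit(UNKNOWN)

    return handler


def install_cleanup_handler(cleanup):
    """Register the handler for SIGTERM (assumed, not confirmed, signal)."""
    signal.signal(signal.SIGTERM, make_handler(cleanup))


def remove_temp_files():
    # Hypothetical cleanup; replace with whatever the plugin must undo.
    pass


# Typical use at the top of a plugin's main routine:
#     install_cleanup_handler(remove_temp_files)
```

If the signal turns out to differ between restart and shutdown, the same handler could be registered for more than one signal number.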



[Nagios-users] Plugin termination signal?

2010-08-04 Thread Paul M. Dubuc
I'm wondering if there is any termination signal sent to a plugin that happens 
to be executing at the time Nagios is restarted or shut down?  Do plugins need 
a signal handler for this case if they have cleanup that needs doing?  Do 
plugins using the embedded Perl also get a signal?  Is the signal different 
for restart vs. shutdown?

I haven't found the answer to this in the development guidelines.

Thanks,
Paul Dubuc



Re: [Nagios-users] Running a command when Nagios config changes

2010-07-15 Thread Paul M. Dubuc
Ryan C Ash wrote:



 Paul M. Dubuc wrote

 I would like to have some way of running a command only when Nagios is
 started, or is restarted from the Process Commands menu, or any time
 Nagios reloads its configuration files.  Is there a way to do this?  I
 thought about writing it as a localhost service plugin that simply does
 nothing if

 $LASTSERVICECHECK$ > $PROCESSSTARTTIME$

 but that doesn't seem optimal.  Is this the best solution?  It would be
 nice if I could write it as an event handler, but events are only for
 host or service state changes.  This is a Nagios process state change.

 We run on redhat linux and I use a common init script in
 /etc/rc.d/init.d/nagios.  That would be an easy place to add that
 additional script.  Currently it maintains our pnp4nagios, nsca
 listener, ndoutils, etc.

Thanks for your response.

What I really need is something that will run my script anytime Nagios reads 
its config files (possible configuration change) so this is only a partial 
solution.  Executing the "Restart the Nagios process" command from the 
Process Info screen doesn't create a new Nagios process (it has the same PID 
after the restart), but it does cause Nagios to reload the configuration and 
resets the $PROCESSSTARTTIME$ macro value.  The localhost plugin I describe 
above does the job, but I wouldn't be able to guarantee that it would run 
promptly after the restart event and there seems to be no way to have it just 
run once instead of at intervals.  This isn't a problem for my current use so 
I can keep doing it this way.

It would be nice for Nagios to have a process restart (or config changed) 
event for which one could write an event handler script.
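
For anyone wanting to try the localhost plugin approach, here is a minimal Python sketch of the logic. It assumes the plugin is passed $LASTSERVICECHECK$ and $PROCESSSTARTTIME$ as its two arguments, that both expand to UNIX timestamps, and that the intended test is "last check is later than process start". The script name and the post-reload step are placeholders.

```python
import sys


def ran_since_restart(last_service_check: int, process_start_time: int) -> bool:
    """True if this service was already checked after Nagios (re)started."""
    return last_service_check > process_start_time


def main(argv):
    # Assumed command_line:
    #   run_on_reload.py $LASTSERVICECHECK$ $PROCESSSTARTTIME$
    last_check, start_time = int(argv[1]), int(argv[2])
    if ran_since_restart(last_check, start_time):
        print("OK - nothing to do, already ran since the last (re)start")
        return 0
    # Hypothetical hook: the real post-reload work goes here.
    print("OK - configuration was (re)loaded, ran post-reload task")
    return 0


# A real plugin would end with: sys.exit(main(sys.argv))
```

The drawback described above remains: the task only runs when the service's next scheduled check comes around, not at the moment of the restart.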




[Nagios-users] How to find active check status for a service?

2010-07-15 Thread Paul M. Dubuc
Is there some programmatic way to find out whether active checks are 
enabled or disabled for a service in Nagios?  We have a requirement for an
audit to provide notifications for certain critical services that may have 
their active checks disabled so they aren't left that way any longer than 
necessary.

Thanks,
Paul Dubuc



Re: [Nagios-users] How to find active check status for a service?

2010-07-15 Thread Paul M. Dubuc
Holger Weiß wrote:
 * Paul M. Dubucw...@paul.dubuc.org  [2010-07-15 17:08]:
 Is there some programmatic way to find out whether or not active checks are
 enabled or disabled for a service in Nagios.   We have a requirement for an
 audit to provide notifications for certain critical services that may have
 their active checks disabled so they aren't left that way any longer than
 necessary.

 You could parse the status file, see

 ftp://ftp.in-berlin.de/pub/users/weiss/nagios/tools/disabled-notifications

 for an example.

Thanks!  Using MK Livestatus 
(http://mathias-kettner.de/checkmk_livestatus.html) is also a possibility.

Paul Dubuc
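
For the status file route, a minimal Python sketch of the parsing might look like the following. It assumes the Nagios 3 status file layout as I understand it: "servicestatus {" blocks containing key=value lines such as host_name, service_description and active_checks_enabled, closed by a "}" on its own line. The function name is made up, and the location of the status file depends on the status_file setting in your nagios.cfg.

```python
def services_with_active_checks_disabled(status_text):
    """Return (host_name, service_description) pairs with active checks off."""
    disabled = []
    fields = None  # None while we are outside a servicestatus block
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            fields = {}
        elif line == "}" and fields is not None:
            if fields.get("active_checks_enabled") == "0":
                disabled.append(
                    (fields.get("host_name"), fields.get("service_description"))
                )
            fields = None
        elif fields is not None and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return disabled


# Usage, with the path taken from your own configuration:
#     with open(path_to_status_file) as handle:
#         print(services_with_active_checks_disabled(handle.read()))
```

An audit script could run this from cron and send a notification for any critical service that shows up in the list.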



[Nagios-users] Running a command when Nagios config changes

2010-07-14 Thread Paul M. Dubuc
I would like to have some way of running a command only when Nagios is 
started, or is restarted from the Process Commands menu, or any time Nagios 
reloads its configuration files.  Is there a way to do this?  I thought about 
writing it as a localhost service plugin that simply does nothing if

$LASTSERVICECHECK$ > $PROCESSSTARTTIME$

but that doesn't seem optimal.  Is this the best solution?  It would be nice 
if I could write it as an event handler, but events are only for host or 
service state changes.  This is a Nagios process state change.



Re: [Nagios-users] using multiple templates

2010-07-10 Thread Paul M Dubuc
Litwin, Matthew wrote:
 Are there any consequences to using multiple templates other than that the 
 last one defined gets precedence? I would like to have service templates that 
 do things like define the notification interval separately from the escalation path, 
  time periods, etc.

 I was thinking of ending up with something like this:

 define{
   name  some_services
   use   an_escalation_template
   use   a_notification_template
   use   an_action_template
   
   }

 Assuming there is no collisions in namespace, this should work, right?

Have you tried it?  I don't know if separate 'use' directives work.  I 
use a comma separated list with one 'use' directive:

use an_escalation_template,a_notification_template,an_action_template

Remember that the order is important.  Anything defined in the first 
template takes precedence.  See Multiple Inheritance Sources here for 
more details:
http://nagios.sourceforge.net/docs/3_0/objectinheritance.html
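
To illustrate the precedence rule, here is a simplified Python model of how I understand the merge to work. It is not how Nagios is implemented and it ignores additive (+) values; the resolve function and the directive values in the usage note are invented for the example.

```python
def resolve(definition, templates):
    """Merge directives, letting earlier names in 'use' win over later ones."""
    resolved = {}
    # Walk the 'use' list in reverse so the first template is applied last
    # and therefore overrides anything the later templates set.
    for name in reversed(definition.get("use", [])):
        resolved.update(resolve(templates[name], templates))
    # Directives set directly on the object override every template.
    resolved.update({k: v for k, v in definition.items() if k != "use"})
    return resolved
```

With two templates that both set notification_interval (say 30 in an_escalation_template and 60 in a_notification_template), a service with "use an_escalation_template,a_notification_template" ends up with 30, while directives that only the second template sets are still inherited.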



Re: [Nagios-users] groups of hostgroups?

2010-06-25 Thread Paul M. Dubuc
Litwin, Matthew wrote:
 It doesn't appear that there is a way to include hostgroups
 in other hostgroups, but is there some other way to get this behavior?
 Since my environment has several dozen types of servers in our environment,
 it would be helpful to define a class of host somehow rather than having
 servers be listed explicitly in multiple hostgroups. Any ideas?

I use templates to add hosts and services to groups.  If the definition
inherits from more than one template the 'hostgroups' or 'servicegroups'
specifier will replace whatever was specified previously unless you prefix
the group name with a plus sign (+).  Then it adds the group to whatever other 
groups are specified:

define hostgroup{
 hostgroup_name   HG_ALPHA
...
}

define host{
 name             alpha-host
 register         0   ; this is a template
 hostgroups       +HG_ALPHA
...
}

define hostgroup{
 hostgroup_name   HG_BETA
...
}

define host{
 name             beta-host
 register         0   ; this is a template
 use              alpha-host
 hostgroups       +HG_BETA
}

Now any host that uses the beta-host template is put in both the HG_BETA 
hostgroup and the HG_ALPHA hostgroup.  This effectively puts the HG_BETA group
within the HG_ALPHA group.

Hope this helps.

The same thing can be done with servicegroups, of course.
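
Here is a rough Python model of the '+' behavior described above, limited to a single 'use' parent per definition. It is only a sketch of the semantics as I understand them, not Nagios code; effective_groups and the host name server1 in the usage note are made up.

```python
def effective_groups(name, templates):
    """Return the set of hostgroups a host or template ends up in."""
    definition = templates[name]
    value = definition.get("hostgroups")
    parent = definition.get("use")
    inherited = effective_groups(parent, templates) if parent else set()
    if value is None:
        return inherited  # nothing set locally: plain inheritance
    if value.startswith("+"):
        return inherited | set(value[1:].split(","))  # additive
    return set(value.split(","))  # plain value replaces inherited groups
```

Given alpha-host with "+HG_ALPHA" and beta-host using alpha-host with "+HG_BETA", a host server1 that uses beta-host resolves to both HG_ALPHA and HG_BETA. Drop the plus sign on beta-host and only HG_BETA is left.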



Re: [Nagios-users] nrpe configuration help

2010-06-07 Thread Paul M. Dubuc
Could you define one wrapper service that executes one of the others based on 
an argument passed to it?
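
As a rough idea of what such a wrapper could look like on the Nagios server side, here is a Python sketch. It assumes the three NRPE command names from the quoted message (check_something, check_something_x64, check_something_unix) and an architecture label passed in as an argument. The labels linux32, linux64 and solaris, and all function names, are invented for illustration.

```python
import subprocess

# Suffix appended to the base NRPE command name for each architecture label.
SUFFIX_BY_ARCH = {"linux32": "", "linux64": "_x64", "solaris": "_unix"}


def nrpe_command_name(base, arch):
    """Map a base command such as 'check_something' to its per-arch name."""
    return base + SUFFIX_BY_ARCH[arch]


def build_argv(check_nrpe, host, base, arch):
    """Build the check_nrpe call using the -H and -c options."""
    return [check_nrpe, "-H", host, "-c", nrpe_command_name(base, arch)]


def main(argv):
    # Assumed usage:
    #   wrapper.py <path to check_nrpe> <host address> <base command> <arch>
    result = subprocess.run(build_argv(*argv[1:5]))
    # Pass the plugin's exit code through so Nagios sees the real state.
    return result.returncode
```

The service definition can then keep one name for every host, with the architecture supplied per host, for example from a custom host variable.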

shadih rahman wrote:
 All,
 I need some suggestion for nrpe configuration.  I have 3 different
 kind of architecture in my setup.  I have 32 bit linux machine (plugins
 installed at /usr/lib/nagios/plugins directory) , 64 bit linux machine
 (plugins installed /usr/lib64/nagios/plugins directory), solaris machine
 (plugins installed at /opt/libexec directory)

 In my nrpe.conf file I would have three definitions like below

 [check_something]=/usr/lib/nagios/plugins/check_something
 [check_something_x64]=/usr/lib64/nagios/plugins/check_something
 [check_something_unix]=/opt/libexec/check_something


   in my service definition, I would name them differently and call the
 command file,  for example I would have a check disk, disk_x64,
 disk_unix.   In commands.cfg file I would call them like

 command_name check_remote
 command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$



 However, now new requirements came in, where disk, disk_x64, disk_unix
 must have same service name.  I need to find a clever way define service
 disk and call different nrpe command based on architecture.   Can
 someone please help me with this.  Thanks


 --
 Cordially,
 Shadhin Rahman







Re: [Nagios-users] servicegroups directive doesn't seem to work

2010-02-25 Thread Paul M. Dubuc
FYI, the reason this wasn't working was that there was a 'use' directive in the 
service template that was using a template that also has a servicegroups 
directive for another service group (that line got edited out of my example).
Putting a + sign in front of the ebusiness servicegroup name did the trick, 
adding the new service group instead of using it to replace the old one.






[Nagios-users] servicegroups directive doesn't seem to work

2010-02-24 Thread Paul M. Dubuc
Hello,

The documentation at 
http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html for 
Service Definition says that you can use a 'servicegroups' directive to 
assign a service to a servicegroup instead of using the 'members' 
directive in the service group:

 *servicegroups*: This directive is used to identify the /short 
 name(s)/ of the servicegroup(s) 
 http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#servicegroup 
 that the service belongs to. Multiple servicegroups should be 
 separated by commas. This directive may be used as an alternative to 
 using the /members/ directive in servicegroup 
 http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#servicegroup 
 definitions.

I would like to do this using a service template that service 
definitions can use to do the assignment like the configuration below.  
This would save me from having to add many host,service pairs to the 
members directive in the service group.  But it doesn't seem to work 
(I'm using Nagios 3.2.0).  I get the following configuration error:

Error: Servicegroup members must be specified in 
host_name,service_description pairs (config file ' ...

I get the same error when I delete the service template and move the 
servicegroups directive into the service definitions.
What am  I doing wrong?

Thanks,
Paul Dubuc


define servicegroup{
    servicegroup_name   ebusiness
    alias               Business Services
#   members   ; use servicegroups in service definitions below instead.
}

#
# Nagios service definition template used by services in this config file
#
define service{
    name                ebusiness-service
    register            0   ; this is a template

    servicegroups       ebusiness   ; add the service to this service group
}


define service{
    use                 ebusiness-service
    host_name           host1,host2
    service_description service1

    check_command       ...
}

#
# SciFinder Password Change test service
#
define service{
    use                 ebusiness-service
    host_name           host1,host2
    service_description service2

    check_command       ...
}




[Nagios-users] How to access user-defined service variables in a command object

2010-02-08 Thread Paul M. Dubuc
I'm trying to integrate the use of an internally developed alarm 
generation command into our Nagios configuration.  So I want to define 
a Nagios command object that calls this command with arguments specific 
to the service that is generating the status condition that generates 
the alarm.  One of the arguments is an alarm number.  I can set this 
number in the service definition as a user defined variable:

define service{
...
__ALARM_NUMBER   123
}

Is it possible to access this variable in the command definition using 
on-demand macros?  I tried to do this in the following way, but it 
doesn't seem to work:

define command{
    command_name    notify-service-by-alarm
    command_line    /usr/local/bin/sendalarm $HOSTALIAS$ $_SERVICE_ALARM_NUMBER:HOSTNAME:SERVICEDESC$ $SERVICESTATE$ $SERVICEDESC$ $SERVICEOUTPUT$
}

Is there an alternative?

Thanks,

Paul M. Dubuc



Re: [Nagios-users] How to access user-defined service variables in a command object

2010-02-08 Thread Paul M. Dubuc
I should have made more clear what I am trying to do below.  I know I can 
access the service __ALARM_NUMBER from the command definition by giving the 
literal host_name and service description like this (I've updated the service 
definition in my previous example to illustrate):

$_SERVICE_ALARM_NUMBER:localhost:DUMMY$

but I would like the command definition to be able to do this using the macro 
names $HOSTNAME$ and $SERVICEDESC$ so that one command definition works for 
all services that use it for notification.  Is there a way to do this?  I 
would not like to have to define a separate command and contact group for 
every alarm number.
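
For reference, a minimal sketch of the on-demand service macro form as I understand it from the Nagios 3 macro documentation: the macro name is followed by a literal host name and service description, separated by colons. The host localhost and service DUMMY are the example names from this thread; the command name echo-dummy-state and the use of /bin/echo are hypothetical, for illustration only.

```cfg
# On-demand service macro form: $MACRONAME:host_name:service_description$
# "echo-dummy-state" is a hypothetical command, not part of the original setup.
define command{
        command_name    echo-dummy-state
        command_line    /bin/echo "state=$SERVICESTATE:localhost:DUMMY$ alarm=$_SERVICE_ALARM_NUMBER:localhost:DUMMY$"
        }
```

Because the host and service are written out literally, a command like this only ever reports on that one service, which is the limitation described above.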

Also, I'm using Nagios 3.2.0.

Thanks,
Paul Dubuc

Paul M. Dubuc wrote:
 I'm trying to integrate the use of an internally developed alarm 
 generation command into our Nagios configuration.  So I want to define 
 a Nagios command object that calls this command with arguments specific 
 to the service whose status condition generates the alarm.  One of the 
 arguments is an alarm number.  I can set this number in the service 
 definition as a user-defined variable:
 
 define service{
  host_name localhost
  service_description DUMMY
 ...
 __ALARM_NUMBER   123
 }
 
 Is it possible to access this variable in the command definition using 
 on-demand macros?  I tried to do this in the following way, but it 
 doesn't seem to work:
 
 define command{
 command_name    notify-service-by-alarm
 command_line    /usr/local/bin/sendalarm $HOSTALIAS$ $_SERVICE_ALARM_NUMBER:HOSTNAME:SERVICEDESC$ $SERVICESTATE$ $SERVICEDESC$ $SERVICEOUTPUT$
 }
 
 Is there an alternative?
 
 Thanks,
 
 Paul M. Dubuc
 



Re: [Nagios-users] How to access user-defined service variables in a command object

2010-02-08 Thread Paul M. Dubuc
Sorry to have bothered the list.  I was making the problem too hard because I 
was confused by what I'd read about on-demand macros in Barth's book (p. 632).  
Using $_SERVICE_ALARM_NUMBER$ in the command definition works.  I don't know 
why I didn't try that first.  For some reason I thought you had to specify the 
host and service description to get the value of the variable.
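
Put together, the working setup looks roughly like the sketch below. It assumes the service and command definitions quoted in this thread; the commented "..." stands for the other service directives that were elided in the original, and the argument order simply follows the sendalarm call shown earlier. As I understand the custom variable rules, Nagios drops the first underscore of the variable name and prefixes _SERVICE, which is why __ALARM_NUMBER is read back as $_SERVICE_ALARM_NUMBER$.

```cfg
define service{
        host_name               localhost
        service_description     DUMMY
        # ... (remaining service directives elided, as in the original)
        __ALARM_NUMBER          123
        }

# The plain custom variable macro is resolved for whichever service
# triggered the notification, so no host or service name is needed.
define command{
        command_name    notify-service-by-alarm
        command_line    /usr/local/bin/sendalarm $HOSTALIAS$ $_SERVICE_ALARM_NUMBER$ $SERVICESTATE$ $SERVICEDESC$ $SERVICEOUTPUT$
        }
```

One command definition can then serve every service that sets its own __ALARM_NUMBER. Since $SERVICEOUTPUT$ may contain spaces, quoting the arguments may be worth considering, depending on how sendalarm parses its command line.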

Paul Dubuc


