Re: [Nagios-users] Alerting

2013-08-22 Thread Claudio Kuenzler
On Thu, Aug 22, 2013 at 1:26 AM, Charles Rice cr...@akassociates911.com wrote:
 you need to put in the config files of the nodes connected to the switch
 that the switch is a parent device. I do not have the syntax in front of me,
 but I think it is just
 parentdevice name

It's parents, just for the sake of completeness.
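
As a rough sketch of what that looks like in practice, the directive goes into the host definition of each device behind the switch. The host names, addresses and the generic-host template below are made up for illustration and do not come from this thread:

```
# Hypothetical host names and addresses, for illustration only.
define host{
        use             generic-host    ; assumed template
        host_name       core-switch
        alias           Core switch
        address         192.0.2.1
        }

define host{
        use             generic-host    ; assumed template
        host_name       fileserver01
        alias           File server behind the core switch
        address         192.0.2.10
        parents         core-switch     ; this host is reached via the core switch
        }
```

With this in place, when core-switch is DOWN, Nagios should treat fileserver01 as UNREACHABLE rather than DOWN. Whether UNREACHABLE notifications are sent then depends on the "u" flag in the notification options of the host and of the contact.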

--
Introducing Performance Central, a new site from SourceForge and 
AppDynamics. Performance Central is your source for news, insights, 
analysis and resources for efficient Application Performance Management. 
Visit us today!
http://pubads.g.doubleclick.net/gampad/clk?id=48897511iu=/4140/ostg.clktrk
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


[Nagios-users] Alerting

2013-08-21 Thread Jeremy Gibbs
Now, I know this can be done, but here is the question.

Say our core switch goes down. I obviously don't want to be alerted about
every single device that has subsequently gone down as well. I know that since
the core is down, everything is down. How do I set up these types of
relationships so that alerting is dependent on another object being up?

I would still get the alerts, because we are using an SMS gateway that would
be able to send SMS messages to our cellphones.


Thanks


--

Jeremy L. Gibbs
Systems Administrator / Network Engineer
Utica College IITS

Re: [Nagios-users] Alerting

2013-08-21 Thread Charles Rice
you need to put in the config files of the nodes connected to the switch that 
the switch is a parent device. I do not have the syntax in front of me, but I 
think it is just
parentdevice name

Yours,

Charles Rice
911 Specialist



Re: [Nagios-users] Alerting based on past-to-current trends?

2010-12-10 Thread Jim Avery
On 6 December 2010 19:02, Ian Ehrenwald iehrenw...@tripadvisor.com wrote:
 Hello
 I was wondering if there was a straight-forward way to alert based on an 
 average of past data plus a current perfdata entry.  I understand I'm not 
 explaining it very well that way, so here is the real-world example I am 
 working with -

 I am polling a set of machines via SNMP for CPU load every 1 minute (looking 
 at hrProcessorLoad).  If the return value is at or above 95%, send out a 
 WARNING.  If the return value is 98% or above, send out a CRITICAL.  The 
 problem here is that it's OK for a process to take up 100% CPU for multiple 
 seconds, and sometimes that high CPU usage coincides with the SNMP %CPU 
 query, so I get a lot of false alerts.

 Is there a way to use past perfdata in conjunction with the current returned 
 data to generate an average and send a WARNING or CRITICAL based on that new 
 number?  I only care to get alerted from Nagios if, for example, the %CPU has 
 been at 100% for 5 minutes.  Or am I just way over-thinking this and should 
 be monitoring 1m, 5m, 15m UNIX load averages (which doesn't seem that 
 accurate anyway)?  What are other people doing to monitor CPU usage and alert 
 on abnormal long periods of utilization?


Nagios will alert as soon as the plugin returns a non-OK status.  You
can of course configure max_check_attempts and/or
first_notification_delay so that Nagios won't send a notification
until after a given time, but this won't stop the problem from appearing
on the web page for problem services straight away.
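
A minimal sketch of how those two directives might be combined in a service definition is shown below. The host name, template and check command are assumptions, and the Nagios 3 directive names check_interval and retry_interval are used (Nagios 2.x calls them normal_check_interval and retry_check_interval):

```
# Sketch only: host name, template and check command are assumptions.
define service{
        use                       generic-service   ; assumed template
        host_name                 appserver01       ; hypothetical host
        service_description       CPU load via SNMP
        check_command             check_snmp_cpu    ; hypothetical command
        check_interval            1     ; check every minute (with interval_length=60)
        retry_interval            1     ; re-check every minute while non-OK
        max_check_attempts        5     ; HARD state only on the 5th consecutive non-OK result
        first_notification_delay  0     ; raise this to hold back the first notification further
        }
```

With these values a short CPU spike that clears within a few checks never reaches a hard state, so no notification goes out. A sustained problem is notified after roughly four to five minutes.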

It would be great if you could get Nagios to display only hard status
alerts - I don't think you can though, not with ordinary Nagios Core
anyway.  Some of the third-party Nagios front ends will do it, for
example you can configure the icons in NagVis only to display hard
alerts.

Cheers,

Jim



Re: [Nagios-users] Alerting based on past-to-current trends?

2010-12-10 Thread Jim Avery
On 10 December 2010 18:43, Rick Carter rick.car...@umich.edu wrote:
 Hi Jim,

 I'm wondering if load average would get you where you want to be, as in a lot 
 of cases, a CPU busy might not be a big deal unless the run queue is growing.

 My nagios-fu isn't good enough to tell you how to get that, but when I saw 
 your message, I thought right away of the linux/unix:

 $ uptime
 13:41  up 2 days, 18:11, 2 users, load averages: 0.31 0.25 0.24

 Where the 2nd load average is the 5-minute one.

 - Rick

Good point Rick,

there is a check_load plugin, and you could indeed set appropriate
thresholds to make it concentrate on the 15-minute value rather than
the 5-minute or 1-minute values.
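
One way to sketch that idea: check_load takes its warning and critical thresholds as 1-, 5- and 15-minute triples, so the first two values can be set so high that only the third one matters in practice. The command name, template and the numbers below are illustrative assumptions, and a remote host would normally need this wrapped in NRPE or similar:

```
# Assumes the standard check_load plugin lives under $USER1$.
define command{
        command_name    check_local_load
        command_line    $USER1$/check_load -w $ARG1$ -c $ARG2$
        }

# The 1- and 5-minute thresholds are set very high so that, in practice,
# only the 15-minute average (third value) can trip the check.
define service{
        use                     generic-service     ; assumed template
        host_name               localhost
        service_description     Load (15 min)
        check_command           check_local_load!100,100,8!100,100,10
        }
```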

As to what 'load' actually means I'm not 100% sure.  I've read
http://www.teamquest.com/resources/gunther/display/5/index.htm a few
times, and think it helps a bit!  I even bought Gunther's book
Guerrilla Capacity Planning but confess I haven't read anywhere near
all of it.

I seem to recall reading somewhere that as a general rule of thumb if
load is > 2 * the number of CPUs, it's probably affecting performance.
Certainly on my own Nagios server with 4 CPUs I find it's struggling
whenever load is consistently > 10.

Cheers,

Jim



[Nagios-users] Alerting behavior at beginning of timeperiod

2009-02-26 Thread Steven Kreuzer
I have several processes that are started each morning from cron,
run until the early evening, and are then killed.

For example, every weekday at 8:00am a daemon is started, and it runs
until 6:30pm. A timeperiod for this particular process has been created
so that between 08:00 and 18:30 Nagios uses NRPE to check that the
process is in the process list and, if it is not, sends out an alert.
For the most part it works exactly as expected, with the exception of
the alert that is thrown in the morning.

I have been getting an alert each day that is timestamped a couple of
seconds after 8:00am (today's was sent out at 8:00:06). My guess as to
what happens is that at exactly 8am the first check is done and the
process might not have been fully started, or cron started it a few
seconds after the check was done.

However, I have Nagios set up so that normal checks are scheduled to be
performed every 5 minutes. If a check fails, another check is scheduled
for 1 minute after the first failed check, and if that check also
fails, an alert is sent out.

Nagios appears to be ignoring that. My guess as to what happens is
that if the first check at the start of a timeperiod fails, it
immediately sends out an alert. The issue seems to have gone away after
I changed the timeperiod to begin at 8:01am, but I wanted to pick the
brain of the community to see if this is expected behavior or
something I need to look into more closely.
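
For reference, the setup described above might be sketched roughly as follows. All object names are hypothetical, the check_nrpe command is assumed to take the remote command name as its first argument, and the Nagios 3 directive names are used (Nagios 2.x uses normal_check_interval and retry_check_interval instead):

```
# Sketch of the workaround described above; all names are hypothetical.
define timeperiod{
        timeperiod_name daemon-hours
        alias           Weekdays, one minute after the cron start
        monday          08:01-18:30
        tuesday         08:01-18:30
        wednesday       08:01-18:30
        thursday        08:01-18:30
        friday          08:01-18:30
        }

define service{
        use                     generic-service     ; assumed template
        host_name               apphost01           ; hypothetical host
        service_description     Daemon process
        check_command           check_nrpe!check_daemon_proc  ; hypothetical NRPE command
        check_period            daemon-hours
        check_interval          5   ; normal checks every 5 minutes
        retry_interval          1   ; re-check 1 minute after a failure
        max_check_attempts      2   ; notify only if the re-check also fails
        }
```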

Many Thanks

--
Steven Kreuzer
http://www.exit2shell.com/~skreuzer




[Nagios-users] Alerting on 100% cpu for a period of time

2008-08-05 Thread Lars Jørgensen
Hi.

Is it possible to alert when a Windows host has been running at 100% CPU for, 
say, 20 minutes?


--
Lars




Re: [Nagios-users] Alerting on 100% cpu for a period of time

2008-08-05 Thread Richard Quintin
Anything is _possible_. :)

The smart folks on this list can probably suggest something better,
but one option would be to have a sar process logging cpu usage to a
file and then an NRPE check to look at the values in that file.

-- 
Richard Quintin, DBA
Database & Application Administration
Virginia Tech



Re: [Nagios-users] Alerting on 100% cpu for a period of time

2008-08-05 Thread Palle L Jensen
Lars,

I am not sure if this is what you are looking for.
We use NSClient++ to monitor Windows hosts, and this is our setup for
monitoring the CPU load on Windows hosts.

#Command definition for CPU load (where $USER7$ is a macro for the password
#used to access the Windows host):

define command{
        command_name    check_nt_cpu
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v CPULOAD -l $ARG1$ -s $USER7$
        }


#This service definition will generate a critical alert if the 10-minute CPU
#load is 90% or more, or a warning alert if the 10-minute load is 80% or
#greater. Just change the 10,80,90 as you please to fit your monitoring.

Service definition:
define service {
        use                     generic-service
        host_name               thehost001
        service_description     Cpu
        servicegroups           cpu-load
        check_command           check_nt_cpu!10,80,90
        }

Hope it helps.
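
For the 20-minute case in the original question, the same command could presumably be reused with a different argument. As far as the check_nt help text goes, CPULOAD takes its -l value as <minutes>,<warning>,<critical>, so a sketch might look like this. The service description and thresholds are illustrative assumptions, not values from the thread:

```
# Assumes the check_nt_cpu command defined above; thresholds are examples only.
define service{
        use                     generic-service
        host_name               thehost001
        service_description     Cpu 20min
        check_command           check_nt_cpu!20,90,98
        }
```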

Thanks,
Palle





Re: [Nagios-users] Alerting on 100% cpu for a period of time

2008-08-05 Thread Charles Breite
We alert on our bandwidth this way. We set the checks to re-check (retry 
interval) every minute for 10 minutes. If the pipe is still full, then we alert.
Hope this helps.



Re: [Nagios-users] Alerting on 100% cpu for a period of time

2008-08-05 Thread Paulus, Jake
Good morning!

We do this by setting the retry_interval to 2 minutes and max_check_attempts to 
10. This means that the service has to be in a non-OK state for roughly 20 
minutes straight before it enters a hard state and starts alerting.
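
As a sketch of that approach: in the object definitions I am aware of, the number of tries is set with max_check_attempts. The host, template and check command names below are assumptions:

```
# Sketch only; host, template and check command names are assumptions.
define service{
        use                     generic-service     ; assumed template
        host_name               winhost01           ; hypothetical host
        service_description     CPU
        check_command           check_cpu           ; hypothetical command
        check_interval          5    ; normal schedule while OK
        retry_interval          2    ; re-check every 2 minutes once non-OK
        max_check_attempts      10   ; 10th consecutive non-OK result makes the state HARD
        }
```

The first failing check counts as attempt 1, so nine retries at 2-minute intervals come to about 18 minutes before the hard state and the first notification.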

-Jake



Re: [Nagios-users] Alerting on 100% cpu for a period of time

2008-08-05 Thread Lars Jørgensen
 We alert on our bandwidth this way. We set the alerts to
 re-check(re-try interval) every min for 10 min. If the pipe
 is still full then we alert.
 Hope this helps.

It sure does, that is both simple and elegant. And I can still do it by SNMP, I 
think.


--
Lars



Re: [Nagios-users] Alerting on a Percentage of Threshold of a group being down

2007-06-20 Thread Marc Powell


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Nick
 Sent: Wednesday, June 20, 2007 4:37 AM
 To: nagios-users@lists.sourceforge.net
 Subject: [Nagios-users] Alerting on a Percentage of Threshold of a
 group being down
 
 Hi,
 
 I was wondering if there is a way with Nagios to report when a percentage
 of a group, or a number above a threshold, is unavailable.
 
 For example, if I have 100 web servers, I only want to know if more
 than 30% of them are unreachable, or if more than 30 of them are
 unreachable.

http://nagios.sourceforge.net/docs/2_0/clusters.html
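
A rough sketch of the cluster approach described on that page follows. It assumes a check_cluster plugin installed under $USER1$ (older setups may have it as check_cluster2 in contrib) and uses hypothetical host names. As I understand it, the thresholds are counts of hosts in a non-UP state rather than percentages, so for 100 web servers the list would name all of them and the critical threshold would be 30:

```
# Sketch only: assumes the check_cluster plugin is installed under $USER1$
# and uses hypothetical host names.
define command{
        command_name    check_host_cluster
        command_line    $USER1$/check_cluster --host -l $ARG1$ -w $ARG2$ -c $ARG3$ -d $ARG4$
        }

define service{
        use                     generic-service     ; assumed template
        host_name               nagios-server       ; hypothetical host to attach the check to
        service_description     Web farm
        check_command           check_host_cluster!"Web farm"!1!2!$HOSTSTATEID:web01$,$HOSTSTATEID:web02$,$HOSTSTATEID:web03$,$HOSTSTATEID:web04$
        }
```

The exact boundary behaviour of the thresholds is worth confirming against the plugin's own help output before relying on it.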

--
Marc



[Nagios-users] alerting flakey

2007-03-07 Thread Ezra Radoff


Hello. I've been using Nagios for a couple of months now pretty successfully, 
but I've noticed that the alerting function is a bit flakey. I've been over the 
configuration many times, but everything seems fine. The amount of alerting it 
does seems to change after I restart the service with /etc/init.d/nagios 
restart. It was sending warnings and criticals. Then, after a restart, it wasn't 
sending service critical alerts. Then I restarted it again. It wasn't sending 
anything. Then I restarted it again, and it was sending warnings. 

I'm using version 2.6 which I got from the CVS tree a couple of months ago.

Can anybody give me a little help on this one?

The alert just calls a script I wrote by hand which is referenced in the 
commands.cfg . I don't use the groups or anything.

No alert attempt is showing up in the event log either.

Thanks

Re: [Nagios-users] alerting flakey

2007-03-07 Thread Jim Avery
On 07/03/07, Ezra Radoff [EMAIL PROTECTED] wrote:
 hello. I've been using nagios for a couple of months now pretty
 successfully, but I've noticed that the alerting function is a bit flakey.
 I've been over the configuration many times, but everything seems fine. The
 amount of alerting it does seems to change after I restart the service with
 /etc/init.d/nagios restart. It was sending warning and criticals. Then,
 after a restart, it wasn't sending service critical alerts. Then I restarted
 it again. It wasn't sending anything. Then I restarted it again, and it was
 sending warnings.

It's not because the hosts or services are flapping, is it?



Re: [Nagios-users] alerting flakey

2007-03-07 Thread Ezra Radoff
No, but I'm thinking now that it's always sending warnings and never sending 
criticals.
It's not flapping. We had a server down for hours.
It wasn't sending the warnings because it only does so after four check 
attempts. I think that part has been consistent.
In the service def it looks like all four states are configured for sending 
alerts. I don't get it.



Re: [Nagios-users] alerting flakey

2007-03-07 Thread Jim Avery
On 07/03/07, Ezra Radoff [EMAIL PROTECTED] wrote:
 No, but I'm thinking now that it's always sending warnings and never sending
 criticals.
  It's not flapping. We had a server down for hours.
  It wasn't sending the warnings because it only does it after four. I think
 that part has been consistant.
  In the service def it looks like all four states are configured for sending
 alerts. I don't get it.

Whether a critical alert gets generated or not can depend on the
notification_options in the service definition, the host definition
and/or the contact definition.

Whether notifications are generated at all can depend on
notifications_enabled in the host or service definition, on the
timeperiod in the contact definition, and on the global setting in the
Nagios configuration; notifications can also be dynamically
enabled/disabled for hosts, services and for Nagios as a whole.
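
To make those places concrete, here is a sketch of where the relevant options live. The contact, host and command names are hypothetical, and the global switch is the enable_notifications setting in the main nagios.cfg file:

```
# Sketch only; the contact name and notification commands are hypothetical.
define contact{
        contact_name                    oncall
        alias                           On-call admin
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r   ; 'c' must be listed for CRITICAL alerts
        host_notification_options       d,u,r
        service_notification_commands   notify-by-script        ; hypothetical command
        host_notification_commands      host-notify-by-script   ; hypothetical command
        }

define service{
        use                     generic-service     ; assumed template
        host_name               router01            ; hypothetical host
        service_description     Load
        check_command           check_router_load   ; hypothetical command
        notifications_enabled   1
        notification_options    w,u,c,r             ; must also include 'c'
        }
```

If "c" is missing from either the service's or the contact's options, warnings can still go out while criticals are silently dropped.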

My guess is that it might be something quite simple in the
notification_options somewhere.  See
http://nagios.sourceforge.net/docs/2_0/notifications.html

Another option worth trying is check_for_orphaned_services in your
main nagios.cfg file.  See:
http://nagios.sourceforge.net/docs/2_0/configmain.html#check_for_orphaned_services

Cheers,

Jim



Re: [Nagios-users] alerting flakey

2007-03-07 Thread Ezra Radoff

OK. It's definitely none of those. Take a look.

define service{
        use                     local-service
        hostgroup_name          cisco_routers
        service_description     Cisco_load
        check_command           check_snmp_load_cisco!cisco!90,80,60!100,100,100
        }

##

define service{
        name                    local-service   ; The name of this service template
        use                     generic-service ; Inherit default values from the generic-service definition
        check_period            24x7            ; The service can be checked at any time of the day
        max_check_attempts      4               ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval   5               ; Check the service every 5 minutes under normal conditions
        retry_check_interval    1               ; Re-check the service every minute until a hard state can be determined
        contact_groups          admins          ; Notifications get sent out to everyone in the 'admins' group
        notification_options    w,u,c,r         ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval   60              ; Re-notify about service problems every hour
        notification_period     24x7            ; Notifications can be sent out at any time
        register                0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

##


Whether a critical alert gets generated or not can depend on the
notification_options in the service definition, the host definition
and/or the contact definition.

Whether notifications are generated at all can depend on
notifications_enabled in the host or service definition, on the
timeperiod in the contact definition, and on the global setting in the
main nagios configuration. Notifications can also be enabled or
disabled dynamically for hosts, services and for nagios as a whole.

My guess is that it might be something quite simple in the
notification_options somewhere.  See
http://nagios.sourceforge.net/docs/2_0/notifications.html
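
Since the notification settings above can live in several places (service, host, contact and the main config), one way to compare them is to list them all at once. The following is only a rough sketch: the function name list_notification_settings is made up for this example, and the default directory is assumed from the path shown elsewhere in this thread.

```shell
#!/bin/sh
# Print every line under a config directory that sets a notification-related
# directive, with file name and line number, so the values can be compared.
# list_notification_settings is a hypothetical helper, not part of Nagios.
list_notification_settings() {
    # Default directory is an assumption taken from the paths in this thread.
    conf_dir="${1:-/usr/local/nagios/etc}"
    grep -rnE '^[[:space:]]*(notification_options|notifications_enabled|notification_period|service_notification_options|host_notification_options|service_notification_period|host_notification_period|enable_notifications)([[:space:]]|=)' "$conf_dir"
}
```

Calling list_notification_settings with no argument searches the default directory; pass another directory if your object files live elsewhere. Settings inherited through templates (such as local-service above) appear only where the template defines them.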

Another option worth trying is check_for_orphaned_services in your
main nagios.cfg file.  See:
http://nagios.sourceforge.net/docs/2_0/configmain.html#check_for_orphaned_services

Cheers,

Jim


Re: [Nagios-users] alerting flakey

2007-03-07 Thread Ezra Radoff
Sounds like you've seen this before?
I did as you advised: stop, then restart. I don't know what other processes
to look for besides the one below. There weren't any others running.

isk-nagios:/usr/local/nagios/etc # ps -ef | grep nagios
nagios   25494     1  0 14:48 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root     25501 25032  0 14:50 pts/0    00:00:00 grep nagios



-Original Message-
From: Santhosh Kumar A [mailto:[EMAIL PROTECTED]
Sent: Wed 3/7/2007 6:43 AM
To: Ezra Radoff; nagios-users@lists.sourceforge.net
Subject: RE: [Nagios-users] alerting flakey

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ezra
Radoff
Sent: Wednesday, March 07, 2007 1:28 PM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] alerting flakey

Hello. I've been using nagios for a couple of months now pretty
successfully, but I've noticed that the alerting function is a bit
flakey. I've been over the configuration many times, but everything
seems fine. The amount of alerting it does seems to change after I
restart the service with /etc/init.d/nagios restart. It was sending
warnings and criticals. Then, after a restart, it wasn't sending service
critical alerts. Then I restarted it again, and it wasn't sending anything.
Then I restarted it again, and it was sending warnings.



Check whether multiple nagios daemons are running.

Stop nagios and ensure every nagios process is killed, then do a start
(don't use restart).

Santhosh
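
The check suggested above can be scripted roughly as follows. This is a sketch under assumptions: count_nagios_daemons is a name invented for the example, and the pattern assumes the daemon is launched as ".../bin/nagios -d ...", as in the ps output posted in this thread.

```shell
#!/bin/sh
# Count Nagios daemon processes in "ps -eo pid,args"-style text read from
# stdin. Expect 0 after a stop and exactly 1 after a start.
# count_nagios_daemons is a hypothetical helper, not a Nagios tool.
count_nagios_daemons() {
    # The [n] bracket keeps this awk process from matching its own
    # command line when the input comes straight from ps.
    awk '/bin\/[n]agios -d /{n++} END{print n+0}'
}

# Intended use (not executed here):
#   /etc/init.d/nagios stop
#   ps -eo pid,args | count_nagios_daemons   # should print 0 before starting
#   /etc/init.d/nagios start
#   ps -eo pid,args | count_nagios_daemons   # should print 1
```

If the count after a stop is not 0, the leftover processes would need to be killed by hand before starting again.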


I'm using version 2.6 which I got from the CVS tree a couple of months
ago.

Can anybody give me a little help on this one?

The alert just calls a script I wrote by hand, which is referenced in
commands.cfg. I don't use the groups or anything.

No alert attempt is showing up in the event log either.

Thanks 



Re: [Nagios-users] alerting flakey

2007-03-07 Thread Jim Avery
On 07/03/07, Ezra Radoff [EMAIL PROTECTED] wrote:

  define service{
  use                     local-service
  hostgroup_name          cisco_routers
  service_description     Cisco_load
  check_command           check_snmp_load_cisco!cisco!90,80,60!100,100,100
  }

I can't see anything obviously wrong there. I'm not familiar with the
check_snmp_load_cisco plugin though.  It might be wise to run that
manually while logged in as nagios.  Make sure it returns with the
right exit code and returns the expected output.
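
A manual run along those lines might be wrapped like this. It is only a sketch: nagios_state and run_plugin are names made up for the example, and the mapping follows the standard plugin convention of exit codes 0 to 3.

```shell
#!/bin/sh
# Translate a plugin exit code into the state Nagios assigns to it.
nagios_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo INVALID ;;   # anything outside the documented 0-3 range
    esac
}

# Run any plugin command line, then print its output followed by the
# exit code and the state that code corresponds to.
run_plugin() {
    if output=$("$@"); then code=0; else code=$?; fi
    printf '%s\nexit code %s -> %s\n' "$output" "$code" "$(nagios_state "$code")"
}
```

To use it, become the nagios user first, then call run_plugin followed by the full path to check_snmp_load_cisco and the same arguments Nagios would pass (neither is shown in this thread, so they are left out here).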
