Re: [Nagios-users] Weird Nagios Problem

2012-12-04 Thread Jeffrey Watts
Martin, I've always used NRPE to run check_load remotely.  If you use SNMP,
you can also write a custom plugin to gather the values that way.  There
might be a plugin that someone else has written, too.
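
If you do end up writing your own plugin to run under NRPE, the core of it is small. Below is a minimal sketch in Python, not the stock check_load; the function names and the default thresholds are made up for illustration.

```python
import os

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def evaluate_load(load1, warn, crit):
    """Map a 1-minute load average to an exit code and a status line."""
    if load1 >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif load1 >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Text after '|' is performance data: label=value;warn;crit
    return code, f"LOAD {label} - load1={load1:.2f}|load1={load1:.2f};{warn};{crit}"


def main(warn=5.0, crit=10.0):
    """Read the local load average and print a plugin-style result."""
    # Thresholds are illustrative defaults (an assumption), not recommendations.
    code, message = evaluate_load(os.getloadavg()[0], warn, crit)
    print(message)
    return code
```

A real script would end with `sys.exit(main())` so that the exit code reaches NRPE, and would be registered as a command in nrpe.cfg on the remote host.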

Jeffrey.


On Tue, Dec 4, 2012 at 8:33 AM, Martin Hugo  wrote:

> You are right, it was using check_local_load, is there a remote version of
> this command?
>
>
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] 2 Nagios boxes running together in different locations

2012-05-09 Thread Jeffrey Watts
This is exactly how I do things, except I have three sites.

Jeffrey

On Wednesday, May 9, 2012, C. Bensend wrote:
>
>
> I've dealt with this situation before, and I've ended up
> implementing two mostly standalone Nagios systems.  They each
> check their own site, so if their external network goes away they
> are still able to monitor and alert for the things they're
> responsible for (you have to use out-of-band notifications of
> course).  They also each check each other's *site*, ala the other
> site's firewall, so the Nagios server at site A can alert and let
> you know if site B goes away, but it *doesn't* try to alert you
> for all of the hosts and services at site B.
>

Re: [Nagios-users] check_http issue

2012-03-12 Thread Jeffrey Watts
I wonder if it's an epoch thing...   Are all of the certs that are failing
ones in which the expiry year is 2038 or greater?
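
To illustrate what I mean by "an epoch thing": a signed 32-bit time_t runs out in January 2038, so any date past that cannot be represented. The sketch below only demonstrates the limit; whether check_http or the SSL library on that box actually uses a 32-bit time_t is a guess on my part.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold


def fits_in_32bit_time_t(expiry):
    """Return True if the expiry timestamp fits in a signed 32-bit time_t."""
    return int(expiry.timestamp()) <= INT32_MAX


# Last second representable in a signed 32-bit time_t.
rollover = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

# The two notAfter dates from the openssl output quoted below.
bad_cert = datetime(2062, 2, 17, 22, 57, 24, tzinfo=timezone.utc)
good_cert = datetime(2013, 2, 10, 23, 59, 59, tzinfo=timezone.utc)

print("32-bit rollover:", rollover.isoformat())
print("2062 cert fits:", fits_in_32bit_time_t(bad_cert))
print("2013 cert fits:", fits_in_32bit_time_t(good_cert))
```

The 2062 date overflows and the 2013 date does not, which would match the BAD/GOOD split in your output.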

Jeffrey.

On Mon, Mar 12, 2012 at 4:04 AM, Sunny Jaisinghani <
sunny_jaisingh...@symantec.com> wrote:

> Hello,
>
> I am using the check_http plugin for checking the SSL cert expiry. Even if
> the cert is not due to expire very soon, the plugin reports as CRITICAL.
> I have few more certs for which the plugin reports correct status.
>
> What could be going wrong over here. ??
>
> BAD
>
> # /usr/lib/nagios/plugins/check_http --ssl -H XX.XX.XX.XX  -p 8001 -w 30
> -c 30 -C 30
> CRITICAL - Certificate expired on 02/17/2062 22:44.
>
> # openssl x509 -in abc.example.com.crt -noout -enddate
> notAfter=Feb 17 22:57:24 2062 GMT
>
> GOOD
>
> # /usr/lib/nagios/plugins/check_http --ssl -H XX.XX.XX.XX  -p 8000 -w 30
> -c 30 -C 30
> OK - Certificate will expire on 02/09/2013 23:59.
>
> # openssl x509 -in xyz.example.com.crt -noout -enddate
> notAfter=Feb 10 23:59:59 2013 GMT
>
>

Re: [Nagios-users] Service dependency help

2012-02-07 Thread Jeffrey Watts
Anyone have any advice on this problem?

Thanks again,
Jeffrey.

On Thu, Jan 26, 2012 at 11:56 AM, Jeffrey Watts
wrote:

> Hello, I'm having some trouble getting a service dependency working and I
> was hoping for some help.  I've read the section in Wolfgang Barth's book
> "Nagios 2nd Edition" and googled around a bit, but something's still not
> working right.  I'm using Nagios 3.0.6.
>
> Specifically, I want to set it up so that my OpenManage (thanks Trond!)
> and OMSA version checks both are dependent upon SNMP.  However, my setup is
> still sending notifications for OMSA version and OpenManage when I stop the
> SNMP daemon.  Here are the relevant snippets:
>
> define service {
>   use                     generic_service_t
>   service_description     SNMP
>   max_check_attempts      3
>   normal_check_interval   60
>   retry_check_interval    5
>   notification_interval   60
>   check_command           check_snmp_custom!-H $HOSTADDRESS$ -C tomgeco -P 2c -o sysDescr.0
>   event_handler           eventhandler_snmpd
>   servicegroups           snmp
>   hostgroup_name          dellhardware
>   contact_groups          techops
> }
>
> define servicedependency{
>   hostgroup_name                  dellhardware
>   service_description             SNMP
>   dependent_service_description   OMSA version,OpenManage
>   inherits_parent                 1
>   execution_failure_criteria      n
>   notification_failure_criteria   u,c
> }
>
> Any help as to what I'm doing wrong?
> Jeffrey.
>

Re: [Nagios-users] check_snmp OID integer /10

2012-02-03 Thread Jeffrey Watts
You can always write a script that acts as a wrapper around check_snmp if
you want prettier info displayed.
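
As a rough sketch of such a wrapper, assuming Python is available on the Nagios server: it parses the snmpget line from the quoted message rather than check_snmp's own output, and the function names and the "TEMP" label are invented for the example.

```python
import re
import subprocess

OK, UNKNOWN = 0, 3  # Nagios plugin exit codes used here


def scale_snmp_integer(line, divisor=10.0):
    """Extract the INTEGER value from an snmpget output line and scale it."""
    match = re.search(r"INTEGER:\s*(-?\d+)", line)
    if match is None:
        raise ValueError(f"no INTEGER value in: {line!r}")
    return int(match.group(1)) / divisor


def main(host="10.0.0.63", oid="1.3.6.1.4.1.17095.4.1.3.3.0"):
    """Query the device with snmpget and print the reading divided by 10."""
    # Same query as in the quoted message, with the options placed first.
    cmd = ["snmpget", "-v", "1", "-c", "public", host, oid]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
        value = scale_snmp_integer(result.stdout)
    except (OSError, subprocess.TimeoutExpired, ValueError) as exc:
        print(f"TEMP UNKNOWN - {exc}")
        return UNKNOWN
    print(f"TEMP OK - {value:.1f} C|temp={value:.1f}")
    return OK
```

With the sample value of 40 this reports 4.0. Warning and critical thresholds are left out; they would be compared against the scaled value before choosing the exit code.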

Jeffrey.

2012/2/1 Sánta József 

> Hi!
>
> I have a temperature SNMP device.
>
> snmpget 10.0.0.63 1.3.6.1.4.1.17095.4.1.3.3.0 -c public -v 1
>
> iso.3.6.1.4.1.17095.4.1.3.3.0 = INTEGER: 40
>
> But I would like the integer divided by 10...
>
> Something like this:
>
> [oid("1.3.6.1.4.1.17095.4.1.3.3.0")/10]°C
>
> How can I add this to the check_snmp command?
>
> Thanks!
>

[Nagios-users] Service dependency help

2012-01-26 Thread Jeffrey Watts
Hello, I'm having some trouble getting a service dependency working and I
was hoping for some help.  I've read the section in Wolfgang Barth's book
"Nagios 2nd Edition" and googled around a bit, but something's still not
working right.  I'm using Nagios 3.0.6.

Specifically, I want to set it up so that my OpenManage (thanks Trond!) and
OMSA version checks both are dependent upon SNMP.  However, my setup is
still sending notifications for OMSA version and OpenManage when I stop the
SNMP daemon.  Here are the relevant snippets:

define service {
  use                     generic_service_t
  service_description     SNMP
  max_check_attempts      3
  normal_check_interval   60
  retry_check_interval    5
  notification_interval   60
  check_command           check_snmp_custom!-H $HOSTADDRESS$ -C tomgeco -P 2c -o sysDescr.0
  event_handler           eventhandler_snmpd
  servicegroups           snmp
  hostgroup_name          dellhardware
  contact_groups          techops
}

define servicedependency{
  hostgroup_name                  dellhardware
  service_description             SNMP
  dependent_service_description   OMSA version,OpenManage
  inherits_parent                 1
  execution_failure_criteria      n
  notification_failure_criteria   u,c
}

Any help as to what I'm doing wrong?
Jeffrey.

Re: [Nagios-users] question about check_disk

2012-01-20 Thread Jeffrey Watts
You have it backwards.  'df' says 24% used, not remaining.  'check_disk'
shows how much is remaining (76% of space and 99% of inodes in this case).  I've
never liked how check_disk displays its results by default; most Unix tools
show how much of a resource is used as their primary metric, not how much is
remaining.  It always looks wrong.
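
The arithmetic is easy to check from the performance data in the check_disk output quoted below (95885 MB used, 410067 MB total). A small Python sketch, with a helper name of my own choosing:

```python
def used_and_free_percent(used_mb, total_mb):
    """Return (used %, free %) for a filesystem, as floats."""
    used_pct = 100.0 * used_mb / total_mb
    return used_pct, 100.0 - used_pct


# Values taken from the check_disk performance data quoted below:
# 95885 MB used out of 410067 MB total.
used_pct, free_pct = used_and_free_percent(95885, 410067)
print(f"used: {used_pct:.1f}%  free: {free_pct:.1f}%")
```

That comes out at about 23.4% used and 76.6% free, so df's 24% is the used figure and check_disk's 76% is the free figure; the two tools just round differently.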


Jeffrey.
On Fri, Jan 20, 2012 at 3:18 AM, Ensing, Harm  wrote:

>  Hi,
>
>
>
> Please help
>
> See this output:
>
>
>
> $ df -h /S00022_BACKUP
>
> Filesystem      size   used  avail capacity  Mounted on
> S00022_BACKUP   2.6T    94G   307G    24%    /S00022_BACKUP
>
>
>
> $ plugins/check_disk -E -w 20% -c 10% -W 10% -K 8% -u MB -p /S00022_BACKUP
>
> DISK OK - free space: /S00022_BACKUP 314182 MB (76% inode=99%);|
> /S00022_BACKUP=95885MB;328053;369060;0;410067
>
>
>
> $ zpool list S00022_BACKUP
>
> NAME            SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
> S00022_BACKUP  2.66T  2.32T   349G    87%  ONLINE  -
>
>
>
> df says FS is 24% free, so no problem.
>
> Check_disk says 76% full, so that matches, still no problem.
>
> Assuming inode=99% reports inode table 99% full (!) I would expect a
> warning/critical from ‘-W 10% -K 8%’ but it does not do that.
>
>
>
> ‘zpool list’ shows ‘87%’ which is totally different.
>
>
>
> -Am I interpreting the info wrong?
>
> -How can I dig up more detailed info on inode usage in a zpool?
>
> -What causes the difference between ‘df -h’ and ‘check_disk’ on one
> side and ‘zpool list’ on the other side?
>
>
>

Re: [Nagios-users] disk checks unreliable

2011-10-05 Thread Jeffrey Watts
The check is working correctly - /mnt/store is a valid path in both
circumstances.  Remember, in Unix mounted filesystems all sit on top of the
/ filesystem, so when you umount the filesystem on /mnt/store, that
mountpoint still exists (on /).

The way I've done it in the past is by using -r/-R to match against the
source path.  For example, to match "//server/FooBar$" I had a check_disk
check with "-r FooBar" in it.  I imagine that you might also be able to do
what you're looking to do by using -X to exclude whatever filesystem type
the / filesystem is (assuming that the mounted filesystem is a different
type, of course).  I'm sure others will have different ways of doing it too.
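
One of those other ways, sketched in Python as a tiny standalone plugin: test whether the path is actually a mount point before trusting the disk numbers. This is not what -r or -X do inside check_disk; it is a separate, hypothetical check, and the names are mine.

```python
import os

OK, CRITICAL = 0, 2  # Nagios plugin exit codes


def check_mounted(path):
    """Return (exit code, message) depending on whether path is a mount point."""
    # os.path.ismount() compares the device of path with that of its parent,
    # so a bare directory left behind on / after umount is reported as False.
    if os.path.ismount(path):
        return OK, f"MOUNT OK - {path} is a mount point"
    return CRITICAL, f"MOUNT CRITICAL - {path} is not mounted"
```

Calling `check_mounted("/mnt/store")` after the umount in your example would return CRITICAL, because the leftover directory lives on the same device as its parent.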

Good luck.
Jeffrey.

On Wed, Oct 5, 2011 at 9:52 PM, Tim Dunphy  wrote:

> hello list!
>
>  hello.. I am running a nagios disk check that reports OK even when the
> partition is not mounted or the machine is shut down .. how can I test the
> check and adjust it so that it reports accurately?
>
>
> ## Machine info
>
> CentOS release 5.6 (Final)
> i686
>
> ##Nagios Version
>
> Nagios Core 3.3.1
>
> ## Command definition
>
> define command{
>command_namecheck_store_disk
>command_line$USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
>}
>
>
> ## Service definition
>
> define service{
>        use                     local-service   ; Name of service template to use
>        #host_name              localhost
>        hostgroup_name          web-servers
>        service_description     Store Partition
>        check_command           check_store_disk!20%!10%!/
>        }
>
> The disk is mounted:
>
> [root@VIRTCENT11:~] #df -h
> nas2:/mnt/store
>  1.4T  370G  876G  30% /mnt/store
>
> [root@VIRTCENT11:~] #/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p
> /mnt/store
> DISK OK - free space: /mnt/store 896088 MB (70% inode=99%);|
> /mnt/store=378829MB;1108624;1247202;0;1385780
>
> In this case the check is accurate...the disk is mounted
>
> Now I unmount the partition:
>
> [root@VIRTCENT11:~] #umount /mnt/store
>
> I verify that the partition is not mounted with df and then run the check
> again:
>
> [root@VIRTCENT11:~] #/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p
> /mnt/store
> DISK OK - free space: / 5737 MB (68% inode=96%);| /=2581MB;7017;7894;0;8772
>
> But the check still thinks the disk is ok.
>
> How can I best address this problem?
>
> Thank you,
> Tim
>
>
>

Re: [Nagios-users] [RESOLVED] Return code of 9 is out of bounds when plugin is run in nagios, but return code is 0 when run from shell

2011-09-18 Thread Jeffrey Watts
I think there was something in the FAQ about not using the built-in Perl
interpreter.  I don't, I've had nothing but trouble with it.

Jeffrey.

On Sun, Sep 18, 2011 at 8:57 PM, Samuel Kidman wrote:

> Hi All
>
> I managed to resolve this by changing the command definition. Not sure
> why that was the cause of the problem, though.
>
> Before I was just running it with $USER1$/name-of-plugin.pl
>
> The way I got it to work was
>
> /usr/bin/perl $USER1$/name-of-plugin.pl
>
> Also I had left open some of the command macros, ie typing
>
> $HOSTADDRESS
>
> Instead of
>
> $HOSTADDRESS$
>
> Hope this helps some other people with this issue.
>
> Regards, Sam
>
> -Original Message-
> From: Andreas Ericsson [mailto:a...@op5.se]
> Sent: Wednesday, 14 September 2011 6:22 PM
> To: Nagios Users List
> Cc: Samuel Kidman
> Subject: Re: [Nagios-users] Return code of 9 is out of bounds when
> plugin is run in nagios, but return code is 0 when run from shell
>
> On 09/14/2011 08:16 AM, Samuel Kidman wrote:
> > The script is called from the nagios server itself, NRPE isn't
> involved.
> > It's really frustrating as I can't seem to find any source for the
> > error and I can't think of anymore troubleshooting steps or ways to
> > repeat the error outside of Nagios. Is there some way I can get more
> > detail on why this code is getting returned by using debugging options
> in nagios.cfg?
> >
>
> $ errid 9
>  9  EBADF  Bad file descriptor
>
> This means it's somehow related to filedescriptors. All plugins share
> the number of open filedescriptors with Nagios, so if many plugins run
> at the same time and the plugin is opening a lot of files or making a
> lot of socket connections, you might end up with it breaking from
> something like this.
>
> Does the plugin return proper output or do you get the Nagios-generated
> one?
>
> Does it always break or only sometimes?
>
> --
> Andreas Ericsson   andreas.erics...@op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225  Fax: +46 8-230231
>
> Considering the successes of the wars on alcohol, poverty, drugs and
> terror, I think we should give some serious thought to declaring war on
> peace.
>
>

Re: [Nagios-users] Scheduled downtime and host checks

2011-06-01 Thread Jeffrey Watts
On Wed, Jun 1, 2011 at 1:27 AM, Kumar, Ashish  wrote:

>
>> No, scheduled downtime only affects notifications, and the stats you
>> see in the availability cgi.  Service and host checks run as normal
>> during scheduled downtime.
>
>
> Thanks Jim for the explanation but I do not see any rational reason to
> execute host and service checks while the monitored host is scheduled for
> "fixed" downtime.
>

There are plenty of rational reasons.  Just because you disagree with the
default behavior doesn't mean it's irrational.  Many, many, many times I put
systems into scheduled, fixed downtime and still want checks to be executed.
 For example, if I know the netadmins are going to be reconfiguring
networking at one of our datacenters I will schedule fixed downtime for the
period of their maintenance for the servers/switches/routers affected.

However, I do want to see what's up and down during that time so I can tell
when they start and finish their work, and what they're affecting.  That's a
perfectly rational reason to do checks during maintenance.


> This is useful because it allows you to
>> check the stats of those hosts and services are ok before the
>> scheduled downtime period ends.
>>
>
> But if the host/services are offline after the scheduled "fixed" downtime
> period ends it will send the notifications anyway (or would it not?)
>
> I wish there was a way to disable active checks while a host has scheduled
> downtime set.
>

If the hosts and services are still down after the downtime ends then yes, it
will send notifications, as clearly either:

1) The maintenance window wasn't long enough.
2) Someone broke something, or something died for another reason during
maintenance.

Sounds like proper behavior.

As far as your question goes, you can disable active checks manually, or you
can write a script that sets downtime and disables active checks at the same
time.  You could then run it (manually or via 'at' or something else) to
re-enable active checks.  Or hack the Nagios source code and add that option
yourself.  I believe in the last week or so someone posted a sample script
for setting downtime via a script, so you might search the archives.
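
For what it's worth, such a script mostly comes down to writing a few lines to the Nagios external command file. Here is a minimal Python sketch of the idea, not the script from the archives; the function names are hypothetical and the command file path is whatever `command_file` is set to in your nagios.cfg.

```python
import time


def downtime_commands(host, duration_s, author, comment, now=None):
    """Build external-command lines: schedule fixed downtime, disable checks."""
    start = int(now if now is not None else time.time())
    end = start + duration_s
    return [
        # host;start;end;fixed(1);trigger_id(0);duration;author;comment
        f"[{start}] SCHEDULE_HOST_DOWNTIME;{host};{start};{end};1;0;"
        f"{duration_s};{author};{comment}",
        f"[{start}] DISABLE_HOST_CHECK;{host}",
        f"[{start}] DISABLE_HOST_SVC_CHECKS;{host}",
    ]


def reenable_commands(host, now=None):
    """Build the lines that turn active checks back on after maintenance."""
    stamp = int(now if now is not None else time.time())
    return [
        f"[{stamp}] ENABLE_HOST_CHECK;{host}",
        f"[{stamp}] ENABLE_HOST_SVC_CHECKS;{host}",
    ]


def submit(lines, command_file):
    """Write command lines to the Nagios external command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        for line in lines:
            pipe.write(line + "\n")
```

You would call `submit(downtime_commands(...), path)` when maintenance starts and schedule `submit(reenable_commands(...), path)` via 'at' for the end of the window.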

Jeffrey.

Re: [Nagios-users] Distributed Nagios Configuration with Passive Checks

2011-03-21 Thread Jeffrey Watts
If I understand you correctly, are you trying to have your master server
accept passive checks from other Nagios servers and check services and
servers itself?

If so, what I think would be easier (and is the method recommended by the
docs) would be to just set up a second Nagios server (or instance) to run
the checks that the master is currently doing and have the master only
accept passive checks.

The master would have enable_notifications on, execute_service_checks off.
The slaves would have enable_notifications off, execute_service_checks on.
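
On the slave side, each result then has to be handed to the master as a passive check. Assuming you forward results with send_nsca, as the guide linked below describes, each service result is one tab-separated line on its standard input. A small Python sketch of building that line (the helper name is made up):

```python
def nsca_service_line(host, service, return_code, output):
    """Format one passive service result the way send_nsca reads it on stdin."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1, 2 or 3 (UNKNOWN)")
    # Tabs separate the fields, so keep them (and newlines) out of the output text.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"
```

The host name and service description must match the definitions on the master exactly, otherwise the master will not be able to match the result to a service.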

There's more to it than just that, read the following guide for more
information:
http://nagios.sourceforge.net/docs/3_0/distributed.html

Also, if you simply want a unified interface, you might want to look at
Nagios Fusion.  I haven't used it, but it might be what you want.

Good luck,
Jeffrey.

On Sun, Mar 20, 2011 at 9:43 PM, Samuel Kidman wrote:

> Hello
>
>
>
> I am trying to set up a distributed nagios configuration that will monitor
> n mine-sites. There is a single master server that will accept passive
> checks from all of the mine sites providing a unified view of network status
> throughout the organisation.
>
>
>
> I have set up host groups that represent groups of hosts with similar
> function and have the same service checks ran against them such as switches
> and PLCs. Whenever we get a new device I can just add it to the right host
> group and it gets all of its service checks by being a member of that
> hostgroup.
>
>
>
> My question is how do I disable active checks for the service checks for
> the remote minesites without having to create a separate service check for
> each site? The only thing I’ve been able to think of is using an external
> command at nagios start up that checks all of the services on the master
> server and works out which ones belong to remote sites and then disables
> active checks on each one, but this seems like a messy way.
>
>
> If there was a way to have host group intersections in the hostgroups
> property in service checks I could specify two services and two hostgroups –
> a local and a remote one. The local check applys to its function group AND
> all devices at the head office while the remote check applies to its
> function group AND all remote hosts, however at present I think this
> functionality is unavailable.
>
>
>
> Just looking for some configuration suggestions on how to get this to work.
>
>
>
> *Sam Kidman*
>
> *IT Support Officer*
>
> --
>
> *T:*08 9225 0944
>
Re: [Nagios-users] Monitoring unmounted partition

2011-02-08 Thread Jeffrey Watts
IMHO /proc/mounts is a much better place to look, as NFS problems and so
forth can cause 'mount' to hang and hit the NRPE timeout.
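
A plugin along those lines only needs to read the file and look at the second field of each line. A minimal Python sketch, Linux only, with function names of my own:

```python
def mounted_points(mounts_text):
    """Return the set of mount points listed in /proc/mounts-style text."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # Field 1 is the device, field 2 the mount point.
            # /proc/mounts escapes spaces in paths as \040.
            points.add(fields[1].replace("\\040", " "))
    return points


def is_mounted(path, mounts_file="/proc/mounts"):
    """Check for a mount point by reading the file, without calling 'mount'."""
    with open(mounts_file) as handle:
        return path in mounted_points(handle.read())
```

Reading the file is a plain kernel lookup, so it avoids running 'mount' at all; turning the boolean into OK or CRITICAL exit codes is left out of the sketch.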

Jeffrey.

On Tue, Feb 8, 2011 at 6:24 AM, dave stern - e-mail.pluribus.unum <
dit.d...@gmail.com> wrote:

> Write a plugin.  It could search the output of the command, "mount"
>
>

Re: [Nagios-users] Monitoring temperatures on Cisco equipment

2011-01-27 Thread Jeffrey Watts
Thanks Gerald, that's exactly what I was looking for!

Jeffrey.

On Thu, Jan 27, 2011 at 3:51 AM, Ortner, Gerald wrote:

>  Hi,
>
>
>
> We use check_cisco_envmon to monitor our Cisco equipment. It uses the
> ciscoEnvMonState value only.
>
> I don’t know if the thresholds are available through snmp,  but you can
> view them by entering  “show environment alarm thresholds” on the device.
>
>
>
> gerald
>
>
>

Re: [Nagios-users] Monitoring temperatures on Cisco equipment

2011-01-27 Thread Jeffrey Watts
I think you misunderstand.  Those two plugins return WARNING or CRITICAL if
either of two things occurs:

1) The ciscoEnvMonTemperatureState is not "normal".
2) The ciscoEnvMonTemperatureStatusValue exceeds the passed -w or -c value.

What I'm asking is why #2 is _required_.  I can understand it as an optional
check if you want to override the device's defaults, but not as mandatory
behavior.  Cisco devices are smart and know when they're warm or hot.
 That's the purpose of the ciscoEnvMonTemperatureState.  I'm just trying to
find out why folks feel that overriding Cisco's defaults is necessary
behavior.
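
To make the behavior I'm asking for concrete, here is a Python sketch in which the device's own state decides the result and -w/-c are optional overrides. It is not taken from either plugin. The state numbers are the CiscoEnvMonState values from CISCO-ENVMON-MIB; how each one maps to a Nagios state is my own assumption.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes

# CiscoEnvMonState values mapped to Nagios states (the mapping is an assumption).
STATE_TO_NAGIOS = {
    1: OK,        # normal
    2: WARNING,   # warning
    3: CRITICAL,  # critical
    4: CRITICAL,  # shutdown
    5: UNKNOWN,   # notPresent
    6: CRITICAL,  # notFunctioning
}


def evaluate_temperature(state, value, warn=None, crit=None):
    """Trust the device's own state; apply -w/-c only if they are supplied."""
    device = STATE_TO_NAGIOS.get(state, UNKNOWN)
    custom = OK
    if crit is not None and value >= crit:
        custom = CRITICAL
    elif warn is not None and value >= warn:
        custom = WARNING
    if device == UNKNOWN:
        # No usable device state: report a threshold breach if there is one.
        return custom if custom != OK else UNKNOWN
    # Otherwise the worse of the two verdicts wins.
    return max(device, custom)
```

With no thresholds passed, `evaluate_temperature(2, 35)` simply follows the device and returns WARNING.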

Thanks,
Jeffrey.

On Thu, Jan 27, 2011 at 3:05 AM,  wrote:

>
> I maybe misunderstanding you here but isn’t the whole point of running
> Nagios checks to return Normal, Warning or Critical, so you can alert
> agents them?
> What would be the point in just returning the value and doing nothing with
> it?
>
> Regards,
> Rithcie
>
>

Re: [Nagios-users] NRPE- Log files in syslog

2011-01-26 Thread Jeffrey Watts
You really ought to read up on syslog and how it works.  It can do
everything you're asking.  The whole point of the syslog service is to
abstract logging from the application - so that it can be centralized,
filtered, separated, etc.  :)

Try (depends on which syslog you have running, ps can help you figure that
out):
man 5 syslog.conf
man 5 syslog-ng.conf

That will help you with the format of the configuration file.  If you're not
familiar with the concepts of syslog in general, I'd recommend googling.

Good luck!
Jeffrey.

On Wed, Jan 26, 2011 at 11:36 AM, Archana Ramamoorthy <
archu2i...@yahoo.co.in> wrote:

> Hi All:
>
> I use the NRPE plugin for Nagios and I know that setting the value of the
> variable "debug" to 1 in the nrpe.cfg file would make the debug messages to
> be logged into the syslog facility. Is there a way for me to make it log
> into some other folder of my choice instead of it even getting stored in
> syslog? I couldn't find the exact script/file where it actually writes stuff
> into syslog. It would be great if someone could tell me where to find this
> file or how to change the option to make it log into a folder of my choice.
>
> Also, i tried to find if i could find some plugin to log into some other
> folder. I could find many plugins for NSCA log but i couldn't find anything
> for NRPE. It would be great if anybody could help me with this.
>
> Thanks.
>
> Regards,
> Archana
>
>
>

Re: [Nagios-users] Notification criteria

2011-01-26 Thread Jeffrey Watts
Nagios notifies on HARD states (or flapping, which is another matter).  It's
generally assumed that people don't want to be notified every time there's a
failed check - they want to be notified after it's failed a certain number
of times.

A SOFT state is one in which non-OK checks have occurred fewer than
max_check_attempts times.  In your case, it appears that max_check_attempts
is 3.  Nagios will check the host or service (a service, in your example)
3 times before going into a HARD state and notifying.

In your example there appears to be a discrepancy between the
retry_check_interval and when the checks were actually performed.
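
The SOFT/HARD counting can be sketched in a few lines of Python. This is a simplified model for illustration only (soft recoveries and flapping are not modeled, and the function name is mine), not how Nagios itself is implemented.

```python
def classify_checks(results, max_check_attempts=3):
    """Label consecutive check results as SOFT or HARD, with the attempt number."""
    labelled = []
    attempt = 0
    for state in results:
        if state == "OK":
            # Simplification: every OK result resets the counter.
            attempt = 0
            labelled.append((state, "HARD", 1))
            continue
        attempt = min(attempt + 1, max_check_attempts)
        kind = "HARD" if attempt >= max_check_attempts else "SOFT"
        labelled.append((state, kind, attempt))
    return labelled
```

Three CRITICAL results in a row come out as SOFT;1, SOFT;2 and HARD;3, which is the sequence in your log, and only the last one triggers a notification.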

Jeffrey.

On Wed, Jan 26, 2011 at 9:32 AM,  wrote:

> Good Evening,
>
> I have a doubt with nagios notification criteria.
>
> [26-01-2011 15:40:45] SERVICE ALERT:
> hostName;serviceName;CRITICAL;HARD;3;Trying IP:443... [ KO ] :: Host no
> accessible
> [26-01-2011 15:29:05] SERVICE ALERT:
> hostName;serviceName;CRITICAL;SOFT;2;Trying IP:443... [ KO ] :: Host no
> accessible
> [26-01-2011 15:16:05] SERVICE ALERT:
> hostName;serviceName;CRITICAL;SOFT;1;Trying IP4:443... [ KO ] :: Host no
> accessible
>
> The critical state began at 15:16, but I didn't receive any notification
> till 15:40, when 'HARD' has been marked instead of 'SOFT'.
> What's the difference? What's the service/config parameter of
> 'HARD'/'SOFT'?
>
> This is my service config:
>
> retry_check_interval    5
> notification_interval   30
> notification_period     24x7
> notification_options    w,u,c,r
>
> Thanks in advance,
> --
> Laura Vàzquez Martín
>
>

[Nagios-users] Monitoring temperatures on Cisco equipment

2011-01-26 Thread Jeffrey Watts
I'm looking to monitor temperature on various Cisco equipment (2821,
3750, 4948, 6509, etc).  I've looked at the check_catalyst_temp.pl and
check_env_stats.py plugins, which both look in .1.3.6.1.4.1.9.9.13.1.  I see
and understand the basic mechanisms of these checks.

What I don't understand is why both require warning and critical thresholds.
From what I can tell, both walk ciscoEnvMonTemperatureStatusDescr, check
ciscoEnvMonTemperatureState (and alert accordingly), AND also check
ciscoEnvMonTemperatureStatusValue to see if it's outside the warning and
critical thresholds specified on the command line.  I understand why one
would want to be able to set custom thresholds (that would override "normal"
and "warning" states).  What I don't understand is why they are required.  Is
there a reason, or is it just an oversight?  I don't have much experience
monitoring network equipment, so I'm wondering if there's a reason for it.

On that same note, does anyone know where those thresholds are stored?  I
see in the same OID that there is ciscoEnvMonTemperatureThreshold, but
that's an absolute upper bound before a forced shutdown occurs.  I'm
assuming that the thresholds for "warning" and "critical"
ciscoEnvMonTemperatureState must be stored somewhere else.  Does anyone know
where that is?

Unless there's a good reason for requiring -w and -c, I'll probably change
one of the plugins to not require them.
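
To make the idea concrete, here is a rough Python sketch of the evaluation
logic I have in mind, with -w and -c optional.  It is not taken from either
plugin and it leaves out the SNMP walk entirely.  The evaluate() function, the
sensor description and the mapping of notPresent and notFunctioning to exit
codes are my own assumptions; the state numbers are the CiscoEnvMonState
values from CISCO-ENVMON-MIB as I understand them.

```python
from typing import Optional, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# CiscoEnvMonState values reported by ciscoEnvMonTemperatureState.
STATE_NAMES = {1: "normal", 2: "warning", 3: "critical",
               4: "shutdown", 5: "notPresent", 6: "notFunctioning"}

# How each device state maps to an exit code (an assumption of this sketch).
STATE_TO_EXIT = {1: OK, 2: WARNING, 3: CRITICAL,
                 4: CRITICAL, 5: UNKNOWN, 6: CRITICAL}


def evaluate(descr: str, state: int, value: int,
             warn: Optional[int] = None,
             crit: Optional[int] = None) -> Tuple[int, str]:
    """Combine the device's own state with optional custom thresholds."""
    code = STATE_TO_EXIT.get(state, UNKNOWN)

    # Thresholds are optional; when absent, the device state alone decides.
    threshold_code = OK
    if crit is not None and value >= crit:
        threshold_code = CRITICAL
    elif warn is not None and value >= warn:
        threshold_code = WARNING

    # Take the worse of the two results, but leave UNKNOWN untouched.
    if code != UNKNOWN:
        code = max(code, threshold_code)

    name = STATE_NAMES.get(state, "unknown")
    return code, f"{descr}: {value} C, state {name}"


# Without thresholds the device state is trusted as-is.
print(evaluate("chassis inlet", state=1, value=31))
# With thresholds, a "normal" sensor can still raise a warning.
print(evaluate("chassis inlet", state=1, value=31, warn=30, crit=45))
```

In this sketch custom thresholds can only make the result worse, never better;
whether they should also be able to downgrade a device-reported "warning" is a
separate design decision.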

Thanks in advance,
Jeffrey.

Re: [Nagios-users] Problem with check_openmanage

2011-01-24 Thread Jeffrey Watts
Thanks Trond!  That seems to have fixed it.  Here's what I see now:

./check_openmanage -H pkc-search28 -C tomgeco
Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC
lost
Voltage sensor 14 [PS 2 Voltage 2] is Unknown reading

It comes up correctly now as a CRIT, too.

Thanks!
Jeffrey.

On Mon, Jan 24, 2011 at 10:55 AM, Trond Hasle Amundsen <
t.h.amund...@usit.uio.no> wrote:

> Jeffrey Watts  writes:
>
> > Hello, I'm using Mr. Amundsen's excellent check_openmanage plugin, and I'm
> > getting an odd error:
> >
> > $ check_openmanage -H myserver -C public
> > Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC
> > lost
> > Voltage sensor 14 [PS 2 Voltage 2] is
> > INTERNAL ERROR: Use of uninitialized value $reading in sprintf at
> > /usr/lib/nagios/plugins/check_openmanage line 3565.
> >
> > Has anyone else seen this error?  I'm running version 3.6.4.  Please let me
> > know what additional information is needed.
>
> Hi Jeffrey,
>
> This shouldn't happen, and I think I see where the problem is. Please
> try the version available here, and let me know if it performs any
> better:
>
>  http://folk.uio.no/trondham/software/test/
>
> Cheers,
> --
> Trond H. Amundsen 
> Center for Information Technology Services, University of Oslo
>

[Nagios-users] Problem with check_openmanage

2011-01-24 Thread Jeffrey Watts
Hello, I'm using Mr. Amundsen's excellent check_openmanage plugin, and I'm
getting an odd error:

$ check_openmanage -H myserver -C public
Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC
lost
Voltage sensor 14 [PS 2 Voltage 2] is
INTERNAL ERROR: Use of uninitialized value $reading in sprintf at
/usr/lib/nagios/plugins/check_openmanage line 3565.

Has anyone else seen this error?  I'm running version 3.6.4.  Please let me
know what additional information is needed.

Thanks in advance,
Jeffrey.