[Nagios-users] High Availability with Nagios

2013-05-09 Thread Steve Shipway
Does anyone have an HA setup for Nagios that works?

I'm thinking of creating a NEB module that will link two Nagios setups, and 
replicate over all status changes, config changes, downtime, comments, etc etc 
and then set the 'standby' Nagios to be checks/notifications disabled when in 
standby mode, and enabled when in active mode.  Then put the two behind a 
failover load balancer (F5, Foundry or apache reverse proxy).

However this would be too much work if someone else has already found an 
equivalent solution.

I've looked at Merlin but it doesn't seem to do what I'm after (and the 
documentation is practically nonexistent - much the same as the NEB API 
documentation, in fact).  Mod_gearman lets me have redundant checks and 
replicate *active* checks, but not commands, downtime or passive checks.

Does anyone out there have a workable way to get an active/standby or 
active/active Nagios setup?  Would be interested in hearing all ideas...

Steve


Steve Shipway
University of Auckland ITS
UNIX Systems Design Lead
s.ship...@auckland.ac.nz<mailto:s.ship...@auckland.ac.nz>
Ph: +64 9 373 7599 ext 86487

--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and 
their applications. This 200-page book is written by three acclaimed 
leaders in the field. The early access version is available now. 
Download your free book today! http://p.sf.net/sfu/neotech_d2d_may___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

[Nagios-users] Monitoring Cisco Ironport

2010-04-19 Thread Steve Shipway
Hello all.

Here, we have recently acquired a Cisco Ironport Email gateway appliance, and 
this makes all sorts of useful data available via an XML interface.  Therefore, 
I have created a plugin for Nagios and MRTG that can collect and threshold this 
data using the HTTPS/XML interface.

The plugin is written in Perl using LWP.  It is still in its infancy, but can 
retrieve any of the counters, gauges and rates available on the standard web 
status screen; if anyone would like to have a copy of the v0.1 or has their own 
experiences in monitoring Ironports, please let me know...

Due to impending growth of my family I may take a bit longer to reply than 
normal.

Steve



Re: [Nagios-users] Check_multipath

2010-03-25 Thread Steve Shipway
> My money is on "Requiretty". Locally you have a TTY, while NRPE does
> not. The "Requiretty" setting in /etc/sudoers must be turned
> off. Comment out this line in /etc/sudoers:
> 
>   Defaults    requiretty

I agree -- this one had me stumped for days when I was producing a different 
plugin (for checking jvm processes) before the light finally came on.  With 
RHEL5, the requiretty setting became enabled by default.

Steve



Re: [Nagios-users] Virtual Machines - define as parent or as host dependency...

2010-01-26 Thread Steve Shipway
This is the way we do it, with Parents (not host dependencies).

First we create a virtual object for the VMWare farm.  This has a status of UP 
if any of the farm servers are up (using check_summary).  This virtual 'host' 
has several services, using the v0.9 check_vmware, relating to the farm's 
alarms, storage volumes, etc.  These services have service dependencies on the 
VirtualCentre service running on the Virtual Centre host.

The Farm object has ALL of the ESX Servers as Parents.

All the VMs in the farm have the Farm object as a parent.  Some of them also 
use check_esx3 to alert on Alarms, CPU, and Memory usage within VMWare.

This might seem a bit complex if you've only the one server, but as soon as you 
have multiple servers in the farm, and use DRS, you have to use a farm object 
for parents/dependencies.

It might make more sense for these relationships to be host dependencies rather 
than parents in most cases, but we have a SAN mirrored environment to a second 
ESX farm so that the VMs can be brought up there in the event of a complete 
farm outage, hence the use of Parents rather than dependencies.

If you have VSphere4 (ESX4.0) with a SNMP-enabled Cisco virtual switch in the 
farm, you could probably make the virtual switch the parent device rather than 
having to use a farm object.

The VMWare monitoring plugin we're using is v0.9 of check_vmware, from here: 
http://www.steveshipway.org/forum/viewtopic.php?f=28&t=1648

check_summary is available from nagiosexchange.org (as is check_esx3 which is 
the forerunner of check_vmware)
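
As a rough sketch of the parent layout described above: all host names, 
addresses, the generic-host template and the check_farm_up command are made up 
for illustration and are not taken from our real configuration.  Because Nagios 
only treats a host as unreachable when all of its parents are down, giving the 
farm object every ESX server as a parent means the guests only become 
unreachable when the whole farm is gone.

```
# Hypothetical names and addresses throughout; adjust to your own setup.
define host {
    use             generic-host        ; assumed template
    host_name       esx1
    address         192.0.2.11
}

define host {
    use             generic-host
    host_name       esx2
    address         192.0.2.12
}

# Virtual 'farm' object: ALL of the ESX servers are its parents.
define host {
    use             generic-host
    host_name       vmfarm
    alias           VMware farm (virtual object)
    address         192.0.2.10          ; placeholder address
    parents         esx1,esx2
    check_command   check_farm_up       ; hypothetical wrapper around check_summary
}

# Each guest has the farm object as its parent, not an individual ESX server.
define host {
    use             generic-host
    host_name       guest1
    address         192.0.2.101
    parents         vmfarm
}
```

With DRS moving guests between servers, this avoids having to track which ESX 
server a guest is on at any given moment.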

Steve


From: Andrew Davis [ncc...@gmail.com]
Sent: Tuesday, 26 January 2010 9:14 a.m.
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Virtual Machines - define as parent or as host 
dependency...

I'm trying to figure out the best way to do this, yet keep things as simple as 
possible.

Say I have a server called Saturn running VMWare. I'm monitoring this server 
with Nagios.
I also have three VM's on Saturn: Jupiter, Mars, and Pluto

I want to suppress all host and service alerts on Jupiter, Mars, & Pluto if the 
host Saturn is down (unreachable). I do NOT want to suppress host or service 
alerts from Jupiter, Mars, and Pluto if the VMWare processes (services) are 
down on Saturn. Basically, if my VM server is completely unreachable, don't 
bother me about its client VM's.

Am I better off doing this with a host dependency? Something like:


define hostdependency {
    host_name                       Host B
    dependent_host_name             Host C
    notification_failure_criteria   d,u
}


Or am I better off defining Saturn as the parent of the VM's in the host 
config? Something like:


define host {
    host_name   jupiter
    use         VMs
    alias       jupiter
    address     172.26.251.60
    parents     saturn, tpdmzsw1
}

I've successfully used the "parents" directive to define network topology, so I 
would think this would work. What might be the risks of defining both?

--


  A. Davis
  Email: ncc...@gmail.com

  "There is no limit to what a man can accomplish
   if he doesn't care who gets the credit." - Ronald Reagan


Re: [Nagios-users] Installing Nagios Server on a Virtual Machine

2009-11-12 Thread Steve Shipway
I would be very very wary of running Nagios on a VM (we use VMware here).  The 
reason for this is Clock Skew.

Clock Skew causes the virtual clock on the guest OS to lag behind then skip 
forward depending on the loading and sleep times of the guest.  Note that this 
will not affect 'Para-virtualised' guests as they share the hardware clock, but 
these are only possible in some Xen guests at the moment AFAIK and are not 
common.  VMWare can't do them.

On a lightly loaded physical machine your clock skew will be negligible but as 
load goes up you can get the guest clock lagging as much as 10sec or even more. 
 This can screw up latencies, scheduling, rate calculations (such as CPU use 
and net use) and so on.  In addition, any monitoring of virtualised resources 
(CPU, Memory) will be completely wrong unless you obtain the values from a 
source which is aware of the virtualisation (e.g. the VMWare tools API or 
VirtualCentre API for VMware).

Clock skew and virtualised resource monitoring have caused too many problems in 
our tests and we now only use physical servers for Nagios (and MRTG).

I have a Nagios plugin, check_vmware, at 
www.steveshipway.org/forum for monitoring 
VMware virtualised resources via the API to get meaningful values.  We 
previously used check_esx3, but this has been superseded by the use of the VC 
API in check_vmware.

Steve


From: Juki [juki.e...@gmail.com]
Sent: Friday, 13 November 2009 12:42 a.m.
To: Nagios Users Mail-list
Subject: [Nagios-users] Installing Nagios Server on a Virtual Machine

Hello people,

I would like to know if it is advisable (or best practice) to install and run a 
Nagios monitoring server on a virtual machine (in this case, with OpenSuSE as 
the OS) with
the intention of monitoring physical hardware client machines on the same LAN.

If so, what known issues should I look out for in this case?


Thanks,
Juki

Re: [Nagios-users] Distributed Monitoring Parents

2009-07-05 Thread Steve Shipway
If you want the satellites to suppress host/service checks when hosts are 
unreachable, then yes.
Otherwise, your central Nagios master will correctly suppress notifications (as 
it knows about the dependencies, and the satellites don't do notifications)

On our system, I've defined the dependencies on the satellites as well because I 
want to suppress checks of unreachables (as with Nagios 2.x it causes horrible 
latencies when a sector drops out).  It's a bit messy though, as it requires 
host checks to be done on both master and satellite.

Steve


From: Harald Böhmecke [mailto:harald.boehme...@bertelsmann.de]
Sent: Sunday, 5 July 2009 11:54 p.m.
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Distributed Monitoring Parents

Hi all,

I currently have 1 Master Nagios Server and 4 Nagios "Satellites" which do the 
hard work.

I have defined all Parents (dependencies) on the Master Server.

Do I also need to define the Parents on the Satellites? Or will the Master 
Server (the one sending out Notifications) automatically define the Unreachable 
hosts by itself?

Regards,

Harald

Re: [Nagios-users] Changing check_http.c

2009-04-21 Thread Steve Shipway
> From: Jim Avery [mailto:j...@jimavery.me.uk]
> Sent: Wednesday, 22 April 2009 9:38 a.m.
> 2009/4/21 Andrew Davis :
> > So far, so good, but what I really want to see is the URL in the output.
...
> I agree, but since the plugin doesn't do that, I sometimes put the
> full url in the "notes" directive in the service definition (or even
> notes_url) so I can get to them easily.

This is what we do here as well (use the notes_url to give the tested URL).  I 
don't like embedding HTML in the plugin output anyway as it makes it more 
awkward to use in emails and SMS.
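
A minimal sketch of what that looks like in a service definition.  The host 
name, URL, template and the check_http_url command are placeholders invented 
for this example, not our real configuration:

```
define service {
    use                   generic-service           ; assumed template
    host_name             webserver1                ; hypothetical host
    service_description   HTTP login page
    check_command         check_http_url!/login     ; hypothetical command taking the path as $ARG1$
    notes_url             http://www.example.com/login
}
```

The tested URL then appears as a link in the web interface, while the plugin 
output itself stays plain text for email and SMS notifications.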

Steve



[Nagios-users] Version 0.9beta of check_vmware available

2009-02-25 Thread Steve Shipway
Version 0.9beta of check_vmware for ESX3.x and ESX3i is now available for
download from http://www.steveshipway.org/forum/viewtopic.php?f=35&t=1648

This plug-in for MRTG and Nagios (including Nagios perfstats) monitors CPU,
Memory, Datastores and VC Alarms at Datacentre, Cluster, Server or Guest
level via the VirtualCentre Perl API.  It also has built-in NSCA support and
can send Nagios passive check results for all guests at once as it checks a
datacentre or cluster, greatly speeding up polling.  It now correctly
identifies percentage CPU usage for multi-vCPU guests, and has fully
parameterised thresholds.

 

The next things to add will be Network and DiskI/O stats.

 

Please give any feedback via the forum if you use this.  This plug-in
replaces the old check_esx and check_esx3 plugins, which used SNMP, because
ESX3i no longer has an SNMP interface.

 

Thanks for your attention,

 

Steve

 

---

Steve Shipway

UNIX Systems Administration, University of Auckland, New Zealand

+64 9 3737 599 x 86487

s.ship...@auckland.ac.nz


[Nagios-users] Windows Eventlog agent v1.9.0 released

2008-10-08 Thread Steve Shipway
An updated version of the Nagios Windows Eventlog agent has been released and 
is available on the Nagios Exchange.  This fixes a fairly large bug in the 
filter code so you are advised to upgrade.

http://www.nagiosexchange.org/cgi-bin/page.cgi?g=Detailed%2F1689.html;d=1

This version contains a number of fixes, primarily a fix for the bug where 
existing filters would work incorrectly if new eventlog definitions were added. 
 The eventlogs are now identified by name, rather than by sequence number.

There is also a new status type, 'Ignore', to allow you to create filters to 
drop messages.  Now you no longer need to send NSCA alerts with invalid service 
names.

Finally, it has been compiled with 64bit and wide character support, so it 
should work better with international language messages; however I am unable to 
test this.

This version does not yet have a binary for 64bit Windows.  I am hoping that 
some members of the community who have offered to help will be able to provide 
this.

Thank you for your time,

Steve

---
Steve Shipway
UNIX Systems Administration, University of Auckland, New Zealand
+64 9 3737 599 x 86487
[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>

[Nagios-users] Monitoring VMware

2008-09-23 Thread Steve Shipway
I have a new plug-in I've developed, check_vmware, which uses the Perl VI API 
to communicate with the VirtualCentre server to obtain the information (rather 
than with the individual ESX hosts via SNMP, as the check_esx plugins did).

This still is in early stages, but we've implemented it here and it has passed 
all the tests so far.

It also has an MRTG mode so that it can be used as an MRTG plug-in.

This will currently acquire data for an ESX server, for a server farm, or for 
an entire datacentre, plus for individual guests.

It will query the VirtualCentre alarm status, the CPU and Memory usage, and the 
Ready time (on guests only).  It also tries to pull out the 'fairness' stats on 
a DRS system but this doesn't work correctly on our system, for some reason.  
It does not yet retrieve Swap statistics or Ready time averages over ESX hosts 
or farms.  I'm also planning to calculate standard deviation of CPU and memory 
usage over ESX hosts within a farm (which is an indication of how good your 
distribution of guests is over the farm) but I'm having performance issues 
doing this.

It requires Level 2 stats to be collected for the 5-min granularity in the 
virtualcentre.

It also requires the VI API Perl modules to be installed (get these from the 
VMWare site) and the dependent modules (SOAP::Lite)

If anyone would like a copy, please email me directly.  Note that it should be 
considered Beta code at best!  I'll post it to NagiosExchange once it can be 
considered stable.  Anyone attending LISA08 may be interested in seeing a brief 
demo of the sort of stats it retrieves at the MRTG BoF session on Wednesday.

Steve

---
Steve Shipway
UNIX Systems Administration, University of Auckland, New Zealand
+64 9 3737 599 x 86487
[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>

Re: [Nagios-users] creating a nagios cluster

2008-06-28 Thread Steve Shipway
We have achieved this here by having a shared external storage unit, and then 
using LinuxHA with the nagios service and filesystem (on the external disk 
unit) being defined as HA services.  Works well - at the same time we also fail 
over the NSCA, SNMPtrapdaemon, and mysql database on the same external disk.  
You can email me if you'd like more details on what is required.

Steve


From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Brian A. Seklecki [EMAIL 
PROTECTED]
Sent: Sunday, 29 June 2008 9:56 a.m.
To: Assaf Flatto
Cc: Nagios User list
Subject: Re: [Nagios-users] creating a nagios cluster

On Fri, 2008-06-27 at 13:59 +, Assaf Flatto wrote:
> Hello
>
> I want to setup a nagios cluster that will be in active/passive (using
> heartbeat).
> I want to be able to disable notifications from the passive server and
> have it go to passive mode ,

No tools or projects exist yet specifically for setting up nagios
clusters, we're hoping to change that, but your configuration is pretty
generic.

~BAS


> (so when it does come up it will not take a long time to present the
> display).
> I need the failover to be able to preform these changes .
> Has anyone build such a cluster before ?
> what pitfalls have you encountered on this ?
>
> Thanks
>
> Assaf
>
>





Re: [Nagios-users] SMS notifications

2008-06-16 Thread Steve Shipway
> > On Jun 16, 2008, at 11:06 AM, Luc MAIGNAN wrote:
> >
> >> Isn't there a free way to send SMS via Nagios ?
> >
> > Most cell phone companies have an e-mail -> sms gateway service.
> >

Count yourselves lucky, here in New Zealand they charge you to use it in either 
direction.  You need to pay to allow people to email your phone, and/or you 
need to pay to allow yourself to email to SMS.

New Zealand mobile operators really screw the customer.  Prices are 
astronomical compared to Europe or the USA.

The cheapest way for us was to get a phone and connect it to a linux box making 
our own email to SMS gateway.  Of course, you mustn't tell the phone company 
you're doing this or else they want to charge you on a different (much higher) 
billing plan...

Steve



Re: [Nagios-users] Problem: Server is UP??

2008-06-16 Thread Steve Shipway
> I really need some help with this guys. My setup was working for the
> longest time and I can't think of any changes made to it that would
> cause this behavior. I get random (and quite annoying!) Nagios alerts
> that say "Problem: $SERVER is UP!" (with the actual host name) or
> sometimes it just literally says "Problem: HOST is UP". Note, this
> only happens during my off hours time. I've never seen this during the
> day.

You have probably configured your contact details to send a RECOVERY alert for 
hosts when it is out of hours.  This means you will get a 'host is UP' alert 
when the host status changes to UP.  Your message format probably prepends the 
'Problem: ' to it.
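
For illustration, a contact definition along these lines would produce that 
behaviour.  The contact name, timeperiod name, email address and notification 
command names are assumptions for this sketch, not taken from your 
configuration:

```
define contact {
    contact_name                    oncall-admin            ; hypothetical
    alias                           On-call administrator
    host_notification_period        nonworkhours            ; assumed timeperiod name
    service_notification_period     nonworkhours
    host_notification_options       d,u,r                   ; 'r' sends recovery (host is UP) notifications
    service_notification_options    w,u,c,r
    host_notification_commands      notify-host-by-email    ; assumed command name
    service_notification_commands   notify-service-by-email ; assumed command name
    email                           oncall@example.com
}
```

Dropping the 'r' from host_notification_options would stop the 'host is UP' 
messages for that contact.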

Admittedly, your configuration doesn't look like it has this set.  I suppose you 
should check the message header and make sure it really originated on your 
host, and check you don't have two versions of Nagios running (did you have an 
old or test instance set up that might be sending these emails?)

Steve



Re: [Nagios-users] check_vmfs

2008-06-11 Thread Steve Shipway
Check on nagiosexchange.org

There's the check_esx2 plugin to check ESX servers and their guests.  It's 
currently being reworked to support ESX3 better.  I also have a plugin (run via 
NRPE) that checks vmem and vcpu on a linux/windows guest using the VMWare 
library API.  There is also check_vmfs, which checks for free space in the vmfs 
filesystems on the ESX server.  This needs to be run via cron and sends NSCA 
alerts.

Steve


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Satish Kumar P
Sent: Thursday, 12 June 2008 02:27
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] check_vmfs

Hi,

Can someone guide me a link from where I can download check_vmfs Nagios plugin 
to monitor virtual guest operating systems?
Or kindly share information regarding any other plugins you might have used (if 
any) for this purpose.

Thanks in advance.

Thanks & Regards,
  Satish Kumar P

Re: [Nagios-users] send_nsca

2008-05-22 Thread Steve Shipway
> On May 22, 2008, at 12:19 PM, Nair wrote:
>
> > Can some one please help me in integrating send_nsca command with my
> > Perl script for passive monitoring system.
> >
> > Say how can we integrate plugin output without writing to any file
> > and then piping thru send_nsca.
> >
>

You can also take a look at http://search.cpan.org/dist/Nagios-NSCA/ which is 
the Nagios::NSCA perl module interface.  It's a bit ropey though (only v0.1) 
and the install script appears to be broken, but it's a good place to start.

Steve



Re: [Nagios-users] help required- oracle monitoring using nagios

2008-04-09 Thread Steve Shipway
Your service definition should instead say:

 

check_command    check_oracle_generic!test2!system!manager!18!12!select count(*) from v$session where username is not null

and your command definition should have 

command_line    /usr/local/nagios/libexec/check_oracle_generic -SID '$ARG1$' -dbuser '$ARG2$' -dbpassword '$ARG3$' -w "$ARG4$" -c "$ARG5$" -q '$ARG6$'

 

Re-read the documentation on how to define and use check commands.
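
Put together, the two complete definitions would look roughly like this.  This 
is a sketch built from the names in your own message, and I have not tested it 
against your database.  Note also that Nagios uses $ as its macro delimiter, so 
the literal $ in v$session may need to be written as $$; check the macro 
documentation for your version.

```
# Each arguments is separated by '!' in check_command and arrives as $ARG1$..$ARG6$.
define command {
    command_name    check_oracle_generic
    command_line    /usr/local/nagios/libexec/check_oracle_generic -SID '$ARG1$' -dbuser '$ARG2$' -dbpassword '$ARG3$' -w "$ARG4$" -c "$ARG5$" -q '$ARG6$'
}

define service {
    use                     local-service
    host_name               IRIMS_Linux
    service_description     oracle session
    check_command           check_oracle_generic!test2!system!manager!18!12!select count(*) from v$session where username is not null
    contact_groups          linuxadmin
}
```

The check_command value must start with a defined command_name; everything 
after the first '!' is passed as arguments, which is why the original form 
(with the options typed after the command name) was reported as not defined.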

 

Steve

 

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Thursday, 10 April 2008 16:28
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] help required- oracle monitoring using nagios

 

Hi,

 

I am trying to monitor an oracle server using nagios and am using a Perl
script check_oracle_generic.

 

I am new to nagios and hence have a very limited knowledge of it 

 

In my commands.cfg file I have defined a command as follows:

 

 

define command{
        command_name    check_oracle_generic
        command_line    /usr/local/nagios/libexec/check_oracle_generic -SID $ARG1$ -dbuser $ARG2$ -dbpassword $ARG3$ -w $ARG4$ -c $ARG5$ -q $ARG6$
}

 

 

Also in the service definition its defined as 

 

 

define service{
        use                     local-service
        host_name               IRIMS_Linux
        service_description     oracle session
        check_command           check_oracle_generic -SID test2 -dbuser system -dbpassword manager -c 12 -w 18 -q \"select count(*) from v$session where username in not null\"
        contact_groups          linuxadmin
}

 

 

However when I m checking my nagios configuration I am getting the
following error

 

Error: Service check command 'check_oracle_generic -SID test2 -dbuser
system -dbpassword manager -c 12 -w 18 -q \"select count(*) from
v$session where username in not null\"' specified in service 'oracle
session' for host 'IRIMS_Linux' not defined anywhere!

 

 

Also the file in the location   /usr/local/nagios/libexec  is an
executable file.

 

Also is there any alternate method to monitor an oracle server using
nagios.

 

 

 

Warm Regards,

Neha Sinha

i-RIMS | i-flex Remote Infrastructure Management Services | i-flex TDMS

Tel: +91 80 6659 6000 | Extn:  6441 | Mobile: +91 9986760182

Fax: +91 80  4005 

Website: 
http://www.iflexsolutions.com/iflex/solutions/itinfrastructure.aspx?mnu=
p3s4
 

 

 

 

 

 


Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Steve Shipway
> I've a list of hosts, these hosts are not available for ping, but normal
> service checks (SSH, SMTP, ...) work. Nagios reports these hosts being
> down! Ugly!

On our system, we too have a small subset of hosts which cannot be
pinged.  However, they can be reached over SSH.  So I defined a new command,
check-host-alive-ssh, which uses an SSH connection rather than a ping,
and set this as the host check command for the hosts in question.
This allows Nagios to continue to work as expected.
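
Something along these lines, as a sketch: the host name, address, template and 
the 10 second timeout are made up for the example, and check_ssh is the 
standard plugin from the Nagios plugins package.

```
# Host check that opens an SSH connection instead of pinging.
define command {
    command_name    check-host-alive-ssh
    command_line    $USER1$/check_ssh -H $HOSTADDRESS$ -t 10
}

define host {
    use             generic-host            ; assumed template
    host_name       noping-host1            ; hypothetical host
    address         192.0.2.50
    check_command   check-host-alive-ssh    ; replaces the usual ping-based host check
}
```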

Steve



Re: [Nagios-users] Problem with high latencies after going distributed

2008-01-22 Thread Steve Shipway
> >>Active Service Latency:   0.000 / 7267.198 /
...
> >The only possible cause is the OCSP command slowing things
> >down somehow.
...
> But if the submit_check_result is running slowly, that would only affect
> the service execution time wouldn't it?  My understanding of check
> latency is that it's the difference in time between when Nagios
> schedules a check to run versus the time that the check actually starts
> to execute.

If the scheduler gets behind, then the latency increases, because it runs
the service checks in scheduled order.  It is possible that the OCSP
handler is run SERIALLY with service checks (as the host checks are done
in 1.x) and is therefore holding up service checks, just like you'd see
if you had a lot of down hosts and a long-running host check command.

> But maybe I'm misunderstanding something here.  When it comes to working
> with Nagios, I tend to learn the most when I have the biggest problems


Don't we all :-/.  The latency effect of non-parallel host checks was a
nasty surprise to me.

> Do you do the same thing I mentioned where you define all the checks on
> both distributed nodes, but disable checks on complementary halves of
> those checks on each node?

Yes.  However, I can't always set the freshness checking because some of
our checks are every 4 hours, although most are at a sub 15min interval.
We have a complex configuration tool that builds our whole distributed
Nagios/MRTG configuration set from templates so I can't hand-hack the
config files either.

I have now set up one of our distributed nodes to batch the NSCA
messages, and will see if the latency increases overnight (so far, it
looks good).  To do this, I just changed submit_check_result to only
append to a file, then added a Nagios every-minute cronjob to cat the
contents of this file into send_nsca (actually, there are a few more
steps to ensure data integrity and checks, but that's basically it).
The upshot is that some checks may be delayed by up to a minute, and
we're dependent on cron, but the OCSP command exits very fast.

Let me know if you want a copy of the two scripts I used to achieve
this.
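
The two scripts themselves aren't included in this message, so the
following is only a sketch of the batching idea under stated assumptions:
the spool path, the central host name and the send_nsca config path are
placeholders, and the extra integrity steps mentioned above are reduced
to a rename-before-send.  Each queued line uses send_nsca's tab-separated
input format (host, service, return code, output).

```python
import os
import subprocess

SPOOL = "/var/spool/nagios/nsca.spool"  # hypothetical spool file location


def queue_result(host, service, state_id, output, spool=SPOOL):
    """OCSP side: append one result line to the spool and return at once."""
    fields = [host, service, str(state_id), output.replace("\n", " ")]
    with open(spool, "a") as fh:
        fh.write("\t".join(fields) + "\n")


def flush_results(send, spool=SPOOL):
    """Cron side: hand everything queued so far to send() as one batch.

    Returns the number of results sent.
    """
    batch = spool + ".sending"
    if not os.path.exists(batch):  # a batch left by a failed run is retried first
        try:
            # Rename so that new results land in a fresh spool file.
            os.rename(spool, batch)
        except FileNotFoundError:
            return 0  # nothing was queued since the last run
    with open(batch) as fh:
        data = fh.read()
    send(data)  # if this raises, the batch file is kept for the next run
    os.remove(batch)
    return data.count("\n")


def send_via_nsca(data, central="nagios-central.example.org",
                  config="/etc/nagios/send_nsca.cfg"):
    """Pipe a batch into a single send_nsca call (host and path are placeholders)."""
    subprocess.run(["send_nsca", "-H", central, "-c", config],
                   input=data, text=True, check=True)
```

The OCSP command would then only call queue_result(), and the
every-minute cron job would call flush_results(send_via_nsca), so one
send_nsca process handles a whole minute of results.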

Steve



Re: [Nagios-users] Problem with high latencies after going distributed

2008-01-22 Thread Steve Shipway
> As I'd mentioned in a previous message, I'm in the process of converting
> from a centralized Nagios 2.10 setup all running on a single host to a
> distributed setup running on at least 3 hosts (3 to start anyway).  The
> centralized setup has 572 hosts and 2900 services 99.9% of which are
> active checks.
...
>   Active Service Latency:   0.000 / 7267.198 / 4241.019 sec

This isn't much help, but...

We've just done exactly the same (Nagios 2.9), and we have a comparable
size of system (actually a bit larger - 713 hosts, 5834 services).
After going distributed, we too have this insanely high latency on the
satellites.

The only possible cause is the OCSP command slowing things down somehow.
This is using the supplied send_nsca call to send the status off to the
central server...

define command {
    command_name    relay
    command_line    $USER1$/submit_check_result "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATEID$" "$SERVICEOUTPUT$"
}

So it should work.  I guess things would be better if it packaged the
updates up into batches, although it can't do that normally.

I think it might be better to make the OCSP command just dump the status
to a file, and then have a cronjob every 60 seconds that reads the file
and sends the statuses off as a batch.  I will try this here, when I get
the chance.

Steve



Re: [Nagios-users] graphical mapping tool

2008-01-03 Thread Steve Shipway
Here, we use weathermap -- http://www.network-weathermap.com/ -- which
can take data feeds from MRTG and Nagios as well as from Cacti.  I have
a home-grown remote editor 'weatherman' available from
http://www.steveshipway.org/software/weathermap-3.5.zip (perl/Tcl for
windows/linux/mac/etc) which is easier to use than the supplied one.
This lets us have custom designed maps with nodes that link to MRTG
graphs, Nagios status pages, or other maps, and change colour based on
Nagios status while the links change colour based on MRTG traffic flow.

 

Since I wrote the initial Nagios data plugin for Weathermap, let me know
if you try it out and have any problems obtaining it or getting it to
work.

 

Steve

 



From: [EMAIL PROTECTED] On Behalf Of Alex Dehaini

Nagios is a wonderful tool but its status map is not the prettiest. Is
there any tool out there that I can integrate with Nagios to produce nice
maps? I would love something that I can use to create multiple maps that
are all connected to a general map.





Re: [Nagios-users] Linux Software Raid Plugin Recommendation?

2007-12-19 Thread Steve Shipway
> There are several plugins for this already on nagios exchange. I
> would take a look there as well.

http://www.nagiosexchange.org/RAID_Controller.58.0.html?&tx_netnagext_pi1[p_view]=224

This is the one we use - it supports several kinds of hardware RAID, plus
software RAID on Linux, Solaris and AIX.

However, I'm a bit biased in my recommendation because I wrote it.

Steve



Re: [Nagios-users] Cannot get check_http to authenticate

2007-12-13 Thread Steve Shipway
This may be obvious, but have you checked for sure that you have the
correct username and password, and that this particular username is
authorised for the URL?  You'll get a 401 if you have a valid
username/password but the directory has a require directive that
excludes the user (or excludes the IP, or something else).

 

Steve

 

I'm trying to check status on a web site which requires basic
authentication.  This is not auth from the web app, but a pop-up window
from Apache.  

My command string is:

check_http -H www.somedomain.com -v -a user:pass

It always returns HTTP WARNING: HTTP/1.1 401 Authorization Required

I can successfully connect to another apache based site using the same
command string.




Re: [Nagios-users] How to get reboot messages

2007-11-29 Thread Steve Shipway
Here, we do this by checking the uptime of the host/device.  For
switches etc, this is in the SNMP uptime counter.  For windows hosts, it
is via check_nt and the UPTIME object.  For unix, you just create an
appropriate script to run via nrpe.

We then do a critical if uptime < 10min.  Since hosts are checked every
5min at most then even if the host reboots quickly, this will alert.  A
scheduled outage is OK because the scheduled downtime extends 10min
after the reboot.
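
The threshold logic is small enough to sketch.  This is an illustrative
Python version, not the script we run: the function names are invented,
and the /proc/uptime reader assumes a Linux host (other unixes would need
a different source for the uptime value).

```python
OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_uptime(uptime_seconds, min_uptime=600):
    """Go CRITICAL while the host has been up for less than min_uptime seconds."""
    if uptime_seconds < min_uptime:
        return CRITICAL, (f"CRITICAL - up only {uptime_seconds // 60} min, "
                          f"host has rebooted recently")
    return OK, f"OK - up for {uptime_seconds // 3600} h"


def linux_uptime_seconds(path="/proc/uptime"):
    """Read the uptime in whole seconds (Linux only; first field of /proc/uptime)."""
    with open(path) as fh:
        return int(float(fh.read().split()[0]))
```

With a 10min threshold and a 5min check interval, at least one check
always falls inside the window after a reboot.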

 

Steve

 



I have a bunch of devices which alert me fine on up/down, but I'm looking
for a way to get messages when they reboot. Solarwinds does this for me
now, but I'm trying to move off that solution, and my boss wants reboot
messages as well as up/downs for the devices...




Re: [Nagios-users] Notification Problems

2007-11-21 Thread Steve Shipway
Check the definitions for your contact.  Do you have
host_notification_options d,r set in there as well as in the host
definition?  If not, then the notifications will be filtered out by your
contact definition.
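
A simplified model of that double filter, written in Python purely for
illustration (Nagios applies further filters, such as notification time
periods and scheduled downtime, which are ignored here; the function
names are made up):

```python
def parse_options(value):
    """Turn an option string such as 'd,r' into a set of option letters."""
    return {opt.strip() for opt in value.split(",") if opt.strip()}


def host_notification_passes(state, host_options, contact_options):
    """state is 'd' (down), 'u' (unreachable) or 'r' (recovery).

    The notification goes out only if BOTH the host definition and the
    contact definition list that state.
    """
    return (state in parse_options(host_options)
            and state in parse_options(contact_options))
```

A host set to d,r with a contact that only lists d would produce exactly
the symptom described: DOWN messages arrive, recoveries never do.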

 

Steve

 

notifications on Nagios 3.0a3.  I have all hosts configured for DOWN and
RECOVERY (d,r) notification.  I would like to be emailed when a host
goes down and then comes back up.  I have looked over the configuration
many times and it looks fine.  However, I only receive DOWN messages, no
UP messages.  When I look at the Nagios logs it does not appear that
Nagios even tries to send them.  Also, I have been running Nagios since
it was NetSaint... so I do know a bit about it.  I am just not sure what
I am missing. 

 


Re: [Nagios-users] notification_interval seems to be ignored

2007-11-21 Thread Steve Shipway
> Just to wrap up this topic, I finally defeated this problem by changing
> the output of my ambient temperature monitor to not return the actual
> temperature in the server room. This made the message static and
> unchanging and prevented repeated notifications from going out.

I'm coming into this a bit late, but are you sure you don't have
state_stalking enabled for this service?

You will get a new notification if the various notification filters are
passed, and:
1) The state has changed, or
2) The notification_interval has been reached, or
3) is_volatile is set, or
4) state_stalking is enabled, and the text output of the plugin has
changed

We use state_stalking for some services, eg SNMP error alerts (where we
want to re-notify if a new alert comes in, but not if it's just the same
one again).

Steve



Re: [Nagios-users] Nagios - MySql

2007-10-18 Thread Steve Shipway
> I want to integrate Nagios with MySQL and I was checking in
> nagiosexchange and found this:
>
> http://www.nagiosexchange.org/Misc.36.0.html?&tx_netnagext_pi1[p_view]=462

Ah - this is one of mine.

We use this here, of course - it was written originally for Nagios 1.x
(which we use) but is now tweaked to work with 2.x as well.  However,
once we have finished the upgrade to 2.x, we're looking at moving to
ndoutils as well, as this is the recommended way of doing things in
Nagios 2 and later.

One benefit of this loader, though, is that it loads the current config
into the database for use by programs, and also generates a table in the
database with status time summary information, rather than events.  Once
we get ndoutils running, we'll need to generate something similar using
a separate process.

Steve



Re: [Nagios-users] looking for help with NSClient

2007-10-01 Thread Steve Shipway
The old NSClient is notorious for not reporting problems correctly.

 

Usually, when this happens, I find the issue is that the TCP port has
been stolen by another application.  Exchange is frequently guilty of
this as it starts getting temporary port numbers from 1024 upwards and
quickly takes the NSClient port 1248.  Try moving NSClient to a much
higher port number, or else get it to start before any other apps on the
box.

 

You can test this by using netstat to see if port tcp/1248 is in use,
and if so, by what. 

 

netstat -a -p TCP -b -n

 

This might take a little time to run though.  Look for lines with :1248
in the first column.
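
If Python happens to be on the box, the same question can be asked by
simply trying to bind the port.  This is only a sketch with an invented
function name; unlike netstat -b it cannot say WHICH program holds the
port, only whether something does.

```python
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if the TCP port cannot be bound, i.e. something holds it.

    Meant for unprivileged ports such as 1248; on ports that need extra
    rights a permission error would also be reported as 'in use'.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
    return False
```

Run it with the agent stopped: if port_in_use(1248) is still True, some
other application got there first.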

 

Steve

 

 



From: [EMAIL PROTECTED] On Behalf Of Sudhir Damle



I have installed NSClient 2.0.1.0 on Windows 2000 server. The client was
working well until I rebooted the server.

Now whenever I start the 'Nagios Agent' service, it gets started,
application event viewer says 'NSClient is now responding to queries'
but the service simply stops after that. No error in event viewer.

 


[Nagios-users] Monitoring Windows CPU/Mem under VMWare

2007-09-23 Thread Steve Shipway
I have just created a couple of plugins, check_vcpu and check_vmem,
which query the virtual CPU and memory use by a guest OS under VMWare
ESX3.

 

The difference from the check_esx plugin is that these two plugins both
run in the GUEST OS.  They also work in both Linux and Windows, although
they only work with ESX3 and the newer vmware-tools.  Obviously, under
Windows, you need NRPE_NT or NC_NET in order to run the plugins.

 

Is there anyone out there who would be willing to try them out and give
me some feedback as to how well they work?  

Pick the executables up from

https://webdropoff.auckland.ac.nz/cgi-bin/pickup/945661545ec0ebe8e9cae137a553c127/356932

 

The linux one is better tested than the windows one.  Treat these as
alpha code: if you put them into a production environment, then you're
on your own.

 

Steve

 

---

Steve Shipway

UNIX Systems Administration, University of Auckland, New Zealand

+64 9 3737 599 x 86487

[EMAIL PROTECTED]

 


Re: [Nagios-users] nagios server inside vmware

2007-09-02 Thread Steve Shipway
Mels Said:
> Cook, Garry wrote:
> > IIRC, the solution given in that thread was 'Don't use VMware'.
> >
> > I run three different Ubuntu servers (Nagios, MRTG, and NeDi) on VMware,
> > and have no issues whatsoever with time (or anything else).
...
> I have Suse 10.2 and Nagios, MRTG, Netdirector in production running on
> vmware GSX server, soon we migrate it to the vmware ESX cluster.
...
> Conclusion: why not use VMware

We tried Nagios under VMWare, and although it works, there are a number
of significant pitfalls that made us decide against it.

Most of them come down to the inaccuracy of calculating rates when in a
VM on a moderately loaded ESX server, due to the clock tick being
irregular.  

Although NTP will keep your clock in synch at an hours-minutes level,
when you get down to a seconds level some seconds will apparently be
longer than others.  This is not a problem in many cases, but it IS a
problem if you're calculating rates by taking a sample, waiting
10 seconds (or 30 seconds, or whatever), taking another sample, and
dividing the difference by the time interval.  The shorter your sample
interval, the more that VMWare can affect things.  This is an issue
known to VMWare and they advise against running this sort of monitoring
(I'm afraid I can't find the reference, it was buried deep in
documentation I read for v2 of VMWare).

The other issue is that, since Nagios is your central alerting system,
it is good practice to keep it as independent of other hardware and
infrastructure as possible.  This means avoiding using virtualisation
(we don't even use the SAN, let alone VMWare).  Fewer dependencies means
less chance that something can knock out your monitoring system.

So, the conclusion we came to is that you can use vmware, but if you
care about accuracy in per-second rate calculations (which you may well
care about if you're thresholding on them), then don't.

Steve



Re: [Nagios-users] negative check latency with Nagios as VM?

2007-08-21 Thread Steve Shipway
> > VMWare themselves advise not to perform any monitoring which is
> > rate-based on the guest, and further say that any monitoring which polls
> > hardware (eg network card traffic) will cause performance problems, and
> > also that monitoring CPU and Memory on the guest is pointless and
> > misleading and should be done via the virtualcentre or ESX server
> > itself.
>
> Can you elaborate a little on VMWare's suggestions?  I'm working
> on a project to do just this, and I'd appreciate any references I
> can pass along to my working group.

OK, basically, there are three issues.

Firstly, anything running on the guest which queries hardware directly
(eg, get the network card counters) causes a 'potentially unsafe'
instruction in the guest, which is passed to the SC for authorisation
and verification.  This therefore slows things down a bit and is a much
higher performance hit than on a standalone box.  So, it's not a good
thing to do.

Secondly, any monitoring of the CPU and Memory will be meaningless,
because the virtualisation gives the guest a wrong impression of things.
You can get a guest thinking it has used 50% CPU, but in fact the ESX
Server will not give the guest more resources.  Much better to monitor
CPU usage and ReadyTime on the ESX Server.  Memory suffers from the
effects of Balloon and ESXSwap memory giving incorrect usage and swap
readings to the guest, and shared/private memory giving incorrect
readings of how much is actually used.  Again, read these at the Server
level to get meaningful data.

Finally, anything rate-based on the guest will be calculated by the
clock, and because of the virtualisation, the guest's clock does not
tick regularly.  Although the minutes will go by evenly, the seconds
won't - you'll get some longer and some shorter.  So, if you measure a
counter, wait ten seconds, measure it again, then take the difference
and divide by 10 it will not reliably give you a per-second rate.  It
will be artificially inflated or reduced depending on how busy the ESX
Server is at that time.  You can, however, retrieve a counter from a
guest and do the rate calculation on a separate server with a
non-virtual clock.
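
The rate calculation itself is simple, which is why the clock matters so
much.  A minimal Python sketch, assuming the counter is an SNMP-style
wrapping counter (32-bit by default) and that the timestamps come from
the machine doing the calculation; the function name is invented for the
example:

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time, counter_bits=32):
    """Per-second rate between two counter samples.

    Timestamps are in seconds and should come from a physical server's
    clock.  A single counter wrap between the samples is compensated for.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter wrapped around between the two samples.
        delta += 2 ** counter_bits
    return delta / elapsed
```

For example, 1000 bytes counted over a true 10 seconds is 100 bytes/s;
if the measuring clock only registered 8 seconds for that interval, the
same delta would be reported as 125 bytes/s.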

I hope this clarifies the issue...

Steve



Re: [Nagios-users] negative check latency with Nagios as VM?

2007-08-20 Thread Steve Shipway
> Does the comment not to run Nagios on VMWare also apply to a Distribution
> server?

Yes, if any of the checks are rate-based with the rate being calculated
by the plugin.  Doing anything on a VMWare guest which is sensitive to
the clock is not a good idea.  

So, you can check that (eg) http is up and running.  You cannot read the
network interface counter, compare it to the last time, and calculate a
throughput.  Any statuses that you pass on via NSCA should be OK as
their being a couple of seconds out will not matter.

VMWare themselves advise not to perform any monitoring which is
rate-based on the guest, and further say that any monitoring which polls
hardware (eg network card traffic) will cause performance problems, and
also that monitoring CPU and Memory on the guest is pointless and
misleading and should be done via the virtualcentre or ESX server
itself.

> I'm working with some people at Bright House Networks
> on the new version of check_esx to support ESX3.  

My mistake here, I meant *Ground Work Open Source* are working with me
on this.  A slight screwup due to doing several emails at once.  I'd
hate to not give credit where it is due.

Steve



Re: [Nagios-users] negative check latency with Nagios as VM?

2007-08-19 Thread Steve Shipway
We run a lot of VMWare here, although we're running our Nagios on a
physical box for performance reasons.  I've spent a lot of time
researching how to monitor virtual hosts and the potential pitfalls...

> We're testing our Nagios 2.9 implementation on a VMWare server.  This
> box does have the VMWare tools installed and is running NTP to sync
> time.

Linux under VMWare seems to work best if you let VMWare Tools synch the
time to the ESX server (which uses NTP to synch its own time).  If you
run NTP on a virtual host, it can sometimes get confused as vmware-tools
will also adjust the time.  Similarly, a Windows guest should try to
rely on vmware-tools for the clock synch not anything else.

> The performance on this box seems a bit worse, but roughly comparable to
> our physical box.  (Oddly enough, Nagios restarts almost instantaneously
> on the VM where it takes around 20 seconds to respond to the web
> interface on the physical box...)

If your old box was Nagios 1.x then that's the reason.  Nagios 2 is
much, much faster in the web interface because it preparses and caches
the configuration. Another possibility could be that your virtual disk
is held partly in memory cache on the ESX server, speeding up initial
access.

> at one point I saw the minimum check time at -2.00 seconds.  This means
> this VM is so fast that it's running checks before they're even
> scheduled!  Wow!

This is because your clock is getting skewed.  VMWare is not good for
anything which is sensitive at any resolution smaller than 1min, because
the clock hops about a bit due to the virtualisation.  Particularly when
you're running ntp *and* vmware-tools it can cause weird behaviour as
they fight over who is authoritative.

> In any case, I was concerned about this.  My biggest worry with a VM is
> that it doesn't track the time well enough.

This is very much the case, a guest OS under VMWare will experience
weird clock behaviour.  This is why plugins like check_net, check_cpu,
and anything rate-based are pointless and actually misleading if run via
NRPE in a VM.  A plugin which queries SNMP to get a counter and then
calculates its own rate on a different (physical) server is fine, as
long as the rate calculation is not run in a VM.

> Or perhaps I'm just associating this with a VM and it's just Nagios
> itself.  Has anyone seen this before?

I've seen it before in checks run under VMWare.  If you want to check CPU
usage under VMWare, I'm working with some people at Bright House
Networks on the new version of check_esx to support ESX3.  The old
version works with ESX2.

In brief -
* Don't run NTPD and vmware-tools together
* Don't run check_cpu, check_net or check_memory for a guest
* Don't run any rate-based checks on a virtual machine
* Don't run Nagios under VMWare if you can avoid it

Hope this helps,

Steve



[Nagios-users] Looking for Nagios experience exchange

2007-08-19 Thread Steve Shipway
Hello everyone -

 

We're using Nagios here at the university, with about 5000 Services
being monitored over about 700 hosts/devices in a HA Nagios 1 setup.
We're just about to move to a distributed architecture, and Nagios 2.9.

 

Our senior management is interested in my doing some research into how
other sites use Nagios.  Since this means the potential for free world
travel, I can hardly refuse... what I'm hoping to find is another site
somewhere in the world, with a comparable size to ours or larger, who
are willing for me to visit and exchange our knowledge and experience on
setting up and using Nagios.  In addition, anyone who would like to
visit our offices in Auckland, New Zealand, is welcome to make a date to
come over and see how we do things here.

 

If anyone out there would be willing to host a visiting Brit/Kiwi for a
few days and talk monitoring systems, then please email me.  Similarly
for anyone who'd like to drop by and talk Nagios.  I'll even try and
revive my previous idea of a New Zealand Nagios Users group, now I have
management funding :-)

 

Steve

 

---

Steve Shipway

UNIX Systems Administration, University of Auckland, New Zealand

+64 9 3737 599 x 86487

[EMAIL PROTECTED]

 


Re: [Nagios-users] Suggestions needed for VMWare Guest OS'es

2007-07-17 Thread Steve Shipway
> As we are currently consolidating hosts to VMWare i need some way to
> define reasonable parent/child relations ships in Nagios as well as
> defining host dependencies. 

This is a really awkward one.

What we do here is to first define the guests as individual hosts for
monitoring, although we don't monitor CPU or Memory on them as the
standard plugins don't give meaningful data under vmware.

Next we're using the check_esx2 plugin on the servers in the farm, which
will alert for excessive virtual CPU, Ready, and Memory usage.

Finally, we run a script regularly (daily) which probes the vmware
server and reconfigures the parent/child relationships for the vhosts vs
the esx servers.  Not ideal - particularly when you have automigration
enabled - but better than nothing.

Now we're going to ESX3 I'm considering defining a dummy object in
Nagios for the whole ESX farm.  The guests become children of the ESX
Farm object, and the ESX farm object is a child of *all* the ESX
servers.  The ESX farm host check is defined using check_summary over
the ESX servers, and contains another check_summary check for its only
service, although it could also hold the SNMP traps from the
VirtualCentre server (we have this as a separate Nagios object).  This
setup will make more sense under ESX3, where the VC can automatically
restart a guest on a new ESX server if the ESX server currently hosting
it goes down.

Steve



Re: [Nagios-users] using NDOUtils with Nagios

2007-07-12 Thread Steve Shipway
As you heard, NDOUtils is only for sending status data from
Nagios->Database.

 

However, if performance is your issue, then since Nagios 2 the daemon
has written its pre-processed configuration data to a cache file as it
runs. The CGIs can read this for massive performance improvements - I've
already made my downtime_sched utility use this file if available and
things are much better.  As far as Nagios itself goes, once it has read
in the config files it holds the config in memory, so there's no
performance gain to be had there regardless.
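
To give a rough idea of why reading the cache is cheap: every object is written out as a flat "define <type> { ... }" block with one directive per line, so a tool needs only a trivial parser. A minimal Python sketch, assuming that layout; the function name and the sample data are made up for illustration, and the real file is wherever object_cache_file in nagios.cfg points:

```python
def parse_object_cache(text):
    """Parse 'define <type> { key value ... }' blocks into (type, dict) pairs."""
    objects = []
    current_type = None
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("define ") and line.endswith("{"):
            current_type = line[len("define "):-1].strip()
            current = {}
        elif line == "}" and current is not None:
            objects.append((current_type, current))
            current_type, current = None, None
        elif current is not None:
            # directive name, then the rest of the line as its value
            parts = line.split(None, 1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return objects


# Hypothetical sample in the cache layout (tabs between name and value).
SAMPLE = """\
define host {
\thost_name\tmailhost1
\taddress\t192.0.2.10
\t}
define service {
\thost_name\tmailhost1
\tservice_description\tMail Queue
\t}
"""

hosts = [obj for kind, obj in parse_object_cache(SAMPLE) if kind == "host"]
print(hosts[0]["address"])
```

Because templates and inheritance are already resolved in the cache, nothing beyond this kind of line splitting should be needed to look up a host or service.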

 

Steve

 

I was trying to install NDOUtils mainly to store Nagios 3.0
configuration information in the database and have Nagios pull its config
info from there instead of from the configuration files (the idea
was to improve performance, from what I'm told). However, it seems that
you can only read info from the Nagios daemon using the NDOUtils
components, not the other way round. Two things: can this be done, and if
so how would you go about it? This is on a Linux machine, with Nagios
3.0 and the latest NDOUtils.




Re: [Nagios-users] best place to put NSClient++ on a Windows server?

2007-07-04 Thread Steve Shipway
When you run the /install option, this configures a Windows service
which points at the nsclient++.exe executable.  After you run this, you
cannot move the .exe file, else the service will not be able to start!

 

Here, I have made a windows install package, which installs the stuff
into C:\program files\monitoring and then automatically calls the
/install option to register the service.  The install pack also installs
the other monitoring agents where required (eg, the nageventlog agent).

 

Steve

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Rogelio
Bastardo
Sent: Tuesday, 3 July 2007 05:23
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] best place to put NSClient++ on a Windows
server?

 

I recently downloaded the NSClient++ plugin so that I could monitor my
NT servers using Nagios.

 

In the instructions ( http://trac.nakednuns.org/nscp/wiki/Documentation
 ), it says to
install it via the command line ("nsclient++.exe /install").  I do that,
it says that it's finished, so I'm assuming that the NSC.ini file I have
to edit is the one I have in my c:\temp\ dir. 

 

Any suggestions on where the best place to put this is on a Windows
server?


Re: [Nagios-users] nrpe configuration on solaris.

2007-07-02 Thread Steve Shipway
1. On solaris, make sure you have the SSL libraries installed if you are
going to use SSL.  Solaris does not seem to have these by default.

2. If you disable SSL, you need to do it on the client as well.  So,
your inetd definition uses -n to disable SSL, and you must also give the
-n option to the check_nrpe to disable SSL there as well.

 

Steve

 

If I use -t with check_nrpe and increase the time it gives me-

Could not complete ssl handshake.

 

I am running nrpe under inetd and have disabled ssl.

 



Re: [Nagios-users] Distributed setups

2007-07-02 Thread Steve Shipway
> We're currently looking at creating a distributed setup using
> NSCA. One thing that I've found no mention of is how the host and
> service commands are forwarded.

I think they are not.

> Even if the central machine does all the notifications (as we're
> planning), completely dis/enabling service/host checks would have
> to be distributed from the central machine to the checking
> machines.

This is the biggest disadvantage of the distributed model, in my opinion
- enable/disable checks commands are not propagated (and indeed cannot
be without some serious reworking of the cmd.cgi interface), so you
can no longer stop the checks from the central server.

However this is not such a big issue, as mostly you are more interested
in scheduling downtime and disabling/acknowledging alerts, all of which
are done on the central server.  The satellite servers (collectors) do
not do notifications, only pass the status on to the central server
(aggregator) via the OCSP command.
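
For anyone new to this model, the OCSP hand-off on a collector boils down to formatting one tab-separated record per result and piping it to send_nsca. A minimal Python sketch; the function name is hypothetical, and the field order (host, service description, return code, plugin output) is the one send_nsca expects for service results as far as I know:

```python
def nsca_service_line(host, service, return_code, output):
    """Format one service result as a tab-separated record for send_nsca."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0-3, got %r" % (return_code,))
    # Tabs or newlines inside the plugin output would corrupt the record.
    clean = output.replace("\t", " ").replace("\n", " ")
    return "\t".join([host, service, str(return_code), clean]) + "\n"


print(repr(nsca_service_line("web01", "HTTP", 2, "CRITICAL - timeout")))
```

On the collector, the OCSP command would write this record to the standard input of send_nsca pointed at the central server; the central server then treats it as an ordinary passive check result.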

If I was obsessive about it, I'd modify the cmd.cgi script so that it
spots a distributed service (no active checks, only freshness checks to
set to 'unknown') and forwards the call to the host managing it (which
I'd have to store the definition of in a separate database table or
something).  Too much trouble though, particularly since our users are
forever clicking 'disable checks' when they actually mean 'disable
alerts' or 'acknowledge'.

Steve



Re: [Nagios-users] Recurring Downtime

2007-07-01 Thread Steve Shipway
I think this one is one of mine.

 

We're in the process of moving to Nagios 2.x here, and in doing so I
fixed a couple of bugs in this script, and improved it considerably.
Until now I've not been able to properly test under Nagios 2.x.

 

I have just uploaded the v2.0beta1 of these scripts to Nagiosexchange
(URL below).  

 

Support on http://www.steveshipway.org/forum

 

As to your other question - you could always define a special timeperiod
in Nagios and set checks/notifications to be limited to this, although
it is a bit different from scheduling downtime.
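
For comparison with the timeperiod approach: what any recurring-downtime helper ultimately has to do is write one SCHEDULE_HOST_DOWNTIME external command per occurrence into the Nagios command file. The Python sketch below only illustrates that step and is not how my scripts are implemented; the function name and the example host are made up, and the field order is the Nagios 2.x one as I understand it:

```python
def downtime_command(host, start_epoch, minutes, author, comment, now_epoch):
    """Build a SCHEDULE_HOST_DOWNTIME line for the Nagios command file."""
    end_epoch = start_epoch + minutes * 60
    fields = [
        "SCHEDULE_HOST_DOWNTIME",
        host,
        str(start_epoch),
        str(end_epoch),
        "1",                # fixed (not flexible) downtime
        "0",                # not triggered by another downtime entry
        str(minutes * 60),  # duration in seconds
        author,
        comment,
    ]
    return "[%d] %s\n" % (now_epoch, ";".join(fields))


# Hypothetical example: 30 minutes of downtime starting one hour from "now".
print(downtime_command("appserver", 1183345200, 30, "cron",
                       "nightly restart", 1183341600), end="")
```

A daily cron job could compute the next start time and append the resulting line to the command file shortly before the restart.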

 

Steve

 

 

New Nagios user here (2.9).  I have a server that restarts every day.  I
want to schedule downtime for a server every day at the same time.  I
know that I can't do it with the basic nagios package.  I found an
add-on on nagios exchange.
(http://www.nagiosexchange.org/Downtimes.38.0.html?&tx_netnagext_pi1%5Bp_view%5D=363)
The new version of the add-on is supposed to work with
2.x.  Does anyone here use it and have any suggestions for me?  Is there
another way to accomplish what I want to do?

 


Re: [Nagios-users] Double monitoring.

2007-07-01 Thread Steve Shipway
> I'm admin of network with nagios, the network has like 30 servers, and
> im trying to do a double monitoring.ie:

Sounds like what you want is either High Availability or Standby Nagios.

We use both.

For our 'live' Nagios, we have two servers running Linux-HA which are
both connected to the same external SCSI disk unit via an Adaptec
ServeRAID card.  Linux-HA takes care of failing over the control of the
disk unit, and the virtual IP, apache, and Nagios services.  This way,
we can shut down server A and server B will take over everything within
a few seconds.  So, we get 100% uptime, provided our main datacentre is
online.

However, for DR, we have another Nagios server at a remote site.  This
server has a mirror of the configuration files but Nagios is normally
down.  A regular cron job attempts to check Nagios on the live server
(via NRPE and check_nagios) and if Nagios is down on the live server,
then the DR Nagios is started up.  We could do this alternatively by
having Nagios always running, but with service checks and notifications
disabled.  Then, when the check fails, use the command pipe to send in
the enable commands (and disable commands when it comes back).  In this
case, you could even use a custom host check and event handler to take
care of it all internally to Nagios, although this would prevent you
from simply using a straight copy of the live configuration files.
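
The "always running but disabled" variant could be driven by something like the Python sketch below, run from cron on the DR server after it has checked the live Nagios. The function names are hypothetical and the command file path is site-specific (see command_file in nagios.cfg); the four external command names are standard Nagios ones:

```python
import time

# Commands sent when the DR server must take over, and when it stands down.
ACTIVATE = ["ENABLE_NOTIFICATIONS", "START_EXECUTING_SVC_CHECKS"]
STANDBY = ["DISABLE_NOTIFICATIONS", "STOP_EXECUTING_SVC_CHECKS"]


def standby_commands(live_is_up, now=None):
    """Return the external command lines matching the live server's state."""
    stamp = int(time.time() if now is None else now)
    names = STANDBY if live_is_up else ACTIVATE
    return ["[%d] %s\n" % (stamp, name) for name in names]


def send(commands, command_file):
    """Write the commands to the Nagios command pipe (path is an assumption)."""
    with open(command_file, "w") as pipe:
        pipe.writelines(commands)


# Live server found down: the DR instance should start checking and notifying.
print("".join(standby_commands(live_is_up=False, now=1183250000)), end="")
```

The result of the NRPE/check_nagios probe would supply live_is_up; sending the same commands repeatedly should be harmless, so the job can run every few minutes.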

I think there is some documentation on this at the Nagios web site.

Steve



[Nagios-users] Context-sensitive help pages?

2007-06-12 Thread Steve Shipway
Has anyone produced help text pages to go into the /Nagios/contexthelp
path?  The distributed Nagios 2.9 only has a bunch of placeholders here,
and it would be helpful for our users to have some proper helptext.  If
anyone has made some, can they be uploaded to NagiosExchange.org for
everyone to use?

 

Alternatively, if Ethan has made some, where can I obtain them?

 

Thanks for any help,

 

Steve

 

---

Steve Shipway

UNIX Systems Administration, University of Auckland, New Zealand

+64 9 3737 599 x 86487

[EMAIL PROTECTED]

 


Re: [Nagios-users] Monitoring Drupal CMS with Nagios and WebInject

2007-05-28 Thread Steve Shipway
> Does anybody using Nagios for monitoring Drupal CMS deployments? Nagios
> already has check_http and check_curl plugins but I can only validate
> index.php. I was thinking about test tool called WebInject. It has a

We use webinject to monitor our CMS system, although it is a different
CMS to yours.  It works OK, although it is a bit arcane and awkward to
configure - SiteScope does the job much better (and can easily be
configured to feed into Nagios via NSCA), but costs huge amounts of $.
You need to spend some time configuring the scripts up but after doing
so it works pretty reliably.

Steve



Re: [Nagios-users] Show disk usage in Trends/Graphs

2007-05-16 Thread Steve Shipway

Max wrote:
> Palle Jensen wrote:
> > You say that you are reading from SNMP, are you still using check_nt
> > plugin in Nagios, or are you using any different plugin?
...
> I think you're confused about my approach. The way I do graphing
> doesn't involve Nagios at all. I just use SNMP and MRTG which does the
> talking, then I link URLs into Nagios. Nagios doesn't actually process,
> create, or have anything to do with the graphs. The reason I mentioned
> my approach is because I think it's easy to set up and use. Everyone has
> a different way, though.

We use a similar setup, with MRTG for graphing and Nagios for alerting.
MRTG can also retrieve data for graphing from the Nagios NSClient and
NRPE agents (using the mrtg-pnsclient and mrtg-nrpe plugins) as well as
via SNMP.  

You can set up Nagios to associate a URL with a host or service, which
can point at the MRTG graphs.  Similarly, if using MRTG with routers2 as
the frontend, there is a Nagios plugin to allow Nagios status to be
embedded in the MRTG frames.

There are also plugins for Weathermap to allow it to read from both
Nagios and MRTG (and Cacti, for which it was originally written).

I don't use Cacti here because MRTG is simpler to set up, and is a good
match for our requirements.

Steve



Re: [Nagios-users] SMTP/IMAP/POP3 checks

2007-03-18 Thread Steve Shipway
> I'm planning to start to use Nagios to monitor some email servers.
> The idea is having Nagios sending an email using check_smtp and then
> retrieve it using check_tcp.
> 
> Can those two plugins accomplish this?

No, these two plugins only check for sanity on the ports and do not
actually send/receive email.

The way we do it here is to check all sorts of things. 

First, we use check_imap/check_smtp/etc to check that the ports are
listening correctly.

Then, we have checks for mail queue lengths on the mail servers (make
sure they aren't too large as this indicates something is
constipated...)

Finally, I have a 5-min cronjob that injects an email into the mail
system, addressed to a pseudouser on the Nagios host.  This pseudouser
is in fact a program which parses the email, identifies how long it took
to arrive, and submits a passive service check into nagios with a status
based on the delivery time.  A service freshness check will set the
service status to critical if no email is received for 30 mins (the
critical threshold for mail delivery here).  This is most likely the
sort of thing you're after.  The check_mail_loop plugin does something
similar, but for people with a single POP mailbox - our system is bigger
and so this was not appropriate for us.  You should make sure you are
injecting your email into the system at a suitable place so that it does
not bypass the majority of the normal mail flow, though.
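
The receiving end of such a loop can be sketched in a few lines of Python. This is an illustration rather than our actual program: it assumes the probe's Date header carries the send time, the 5-minute warning threshold is an assumption (only the 30-minute critical threshold comes from the description above), and the host and service names are made up:

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

WARN_SECONDS = 5 * 60    # assumed warning threshold
CRIT_SECONDS = 30 * 60   # 30 minutes, the critical threshold mentioned above


def delivery_result(raw_message, received_at):
    """Return (return_code, plugin_output) based on the probe's Date header."""
    sent_at = parsedate_to_datetime(message_from_string(raw_message)["Date"])
    delay = int((received_at - sent_at).total_seconds())
    if delay >= CRIT_SECONDS:
        code, label = 2, "CRITICAL"
    elif delay >= WARN_SECONDS:
        code, label = 1, "WARNING"
    else:
        code, label = 0, "OK"
    return code, "%s - test mail delivered in %ds" % (label, delay)


def passive_check_line(host, service, code, output, now):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command."""
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        now, host, service, code, output)


# Hypothetical probe message and arrival time, 42 seconds apart.
NZ = timezone(timedelta(hours=12))
PROBE = ("Date: Sun, 18 Mar 2007 10:00:00 +1200\n"
         "Subject: mail loop probe\n\ntest\n")
received = datetime(2007, 3, 18, 10, 0, 42, tzinfo=NZ)
code, output = delivery_result(PROBE, received)
print(passive_check_line("nagios", "Mail Loop", code, output,
                         int(received.timestamp())), end="")
```

The mail system would pipe each probe message to a script like this, which appends the resulting line to the Nagios command file; the freshness check covers the case where no probe arrives at all.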

Steve



Re: [Nagios-users] Trap reset script

2007-03-13 Thread Steve Shipway
> It will look up the status and command files from the Nagios config, parse
> the status and then force an active check for services matching these
> criteria:
> 1. Should not be scheduled to be checked
> 2. Has active checks enabled

Is there a reason why you are not using the Nagios freshness checking to
achieve this?

We have a similar system, with SNMP alerts coming into Nagios, and use
the Freshness Check to set the status to OK after a set period of time.
Services are set to volatile with max_check_attempts=1, so they notify
immediately on receiving a trap no matter what the current state.

Steve



[Nagios-users] check_rbl update: NJABL change

2007-03-08 Thread Steve Shipway
If anyone out there is using my check_rbl plugin (available from
NagiosExchange) to check for their mail servers being listed on any
blacklists, then please note that the NJABL blacklist has been
superseded by the Spamhaus PBL.  Now, your config section at the
beginning should be:

 

my( @BLACKLISTS ) = (
    # DNS blackhole domain, name, optional website address
    [ "dnsbl.sorbs.net",           "SORBS",                "http://www.sorbs.net/" ],
    [ "list.dsbl.org",             "Distributed Sender",   "http://dsbl.org/" ],
    [ "zen.spamhaus.org",          "Spamhaus SBL/XBL/PBL", "http://www.spamhaus.org/" ],
    [ "fuldom.rfc-ignorant.org",   "RFC-Ignorant",         "http://www.rfc-ignorant.org/" ],
    [ "bl.spamcop.net",            "SpamCop",              "http://www.spamcop.net/" ],
    [ "blackholes.mail-abuse.org", "Mail-abuse.org",       "http://www.mail-abuse.org/" ]
);
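
For anyone wondering what the first column is used for: a DNS blacklist lookup reverses the octets of the mail server's IPv4 address, appends the blacklist domain, and asks for an A record; any answer means the address is listed. The Python sketch below illustrates that general mechanism and is not taken from the plugin source; both function names are made up:

```python
import socket


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to query: reversed octets followed by the zone."""
    octets = ipv4.split(".")
    if len(octets) != 4 or not all(
            o.isdigit() and 0 <= int(o) <= 255 for o in octets):
        raise ValueError("not a dotted-quad IPv4 address: %r" % (ipv4,))
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ipv4, zone):
    """True if the blacklist returns an A record (needs working DNS)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ipv4, zone))
        return True
    except socket.gaierror:
        return False


print(dnsbl_query_name("192.0.2.25", "zen.spamhaus.org"))
```

Note that a failed lookup here also covers DNS outages, so a real check would want to distinguish "not listed" from "could not ask".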

 

Thanks for your time,

 

Steve

 

---

Steve Shipway

UNIX Systems Administration, University of Auckland, New Zealand

+64 9 3737 599 x 86487

[EMAIL PROTECTED]

 


Re: [Nagios-users] nsclient++ and nrpe commands

2007-02-28 Thread Steve Shipway
I believe that the 'Illegal metacharacter' error message is generated
because your command contains a '>' character, which could potentially
be someone trying to use shell redirect to break in to your machine.
 
You will definitely need to escape or quote the > to prevent the shell
interpreting it.
 
I don't know if this error is being produced by check_nrpe, by
nsclient++, or by Nagios itself.  Certainly on our system, Nagios 1.4
does not check commands for illegal characters, and check_nrpe 2.5.2
does not either, so my guess is that nsclient++ is doing it.  Make sure
you have your parameters quoted and check nsclient++ for definitions of
illegal characters...
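
On the shell side of this, one way to see which arguments need protecting is to let Python's shlex.quote do the quoting. This is only a sketch of the local quoting step (the helper name is made up), and it says nothing about whether the remote agent will then accept the '>' character:

```python
import shlex


def nrpe_command_line(host, command, args, port=5666):
    """Return a shell-safe check_nrpe command line for the given arguments."""
    argv = ["./check_nrpe", "-H", host, "-p", str(port), "-c", command, "-a"]
    argv += list(args)
    # shlex.quote leaves safe words alone and single-quotes anything else.
    return " ".join(shlex.quote(word) for word in argv)


ARGS = ["file=System", "filter=in", "filter=all", "truncate=512",
        "MaxWarn=1", "MaxCrit=1", "filter-eventType==info",
        "filter-generated=>10m"]
print(nrpe_command_line("172.16.101.13", "checkeventlog", ARGS))
```

Only the last argument comes out wrapped in single quotes, which matches the diagnosis: unquoted, the shell treats '>10m' as a redirect and check_nrpe never sees it.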
 
Steve

--
Steve Shipway
ITSS, University of Auckland
(09) 3737 599 x 86487
[EMAIL PROTECTED]



 




From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mies,
Christian
Sent: Thursday, 1 March 2007 10:36 a.m.
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] nsclient++ and nrpe commands


Hi,
I'm trying to use nsclient++ on Windows machines. Everything is
working fine, except for check_nrpe with checkEventLog :-(
If I type:
./check_nrpe -H 172.16.101.13 -p 5666 -c checkeventlog -a
file=System filter=in filter=all truncate=512 MaxWarn=1 MaxCrit=1
filter-eventType==info filter-generated=>10m
 
on my shell, I get no output. If I try ./check_nrpe -H
172.16.101.13 -p 5666 -c checkeventlog -a file=System filter=in
filter=all truncate=512 MaxWarn=1 MaxCrit=1 filter-eventType==info
filter-generated=\>10m I get an error: Illegal Meta Character. Any
ideas about this?
Who is using nsclient++ with Eventlog functionality?
 
regards

Christian Mies 

 


Re: [Nagios-users] What to Monitor

2007-02-18 Thread Steve Shipway

> So far I have monitors running on the host's responding to pings, to 
> load average, and to the mail queue.  Anything else anyone 
> can suggest 
> that I monitor?  I've only got three or four mailservers, so I don't 
> mind going a smidgen overboard on their monitoring...

We also monitor -

CPU, Memory, filesystems, load average, swap activity, mail queues
(active and deferred), mail throughput in messages/min (passed, spam,
virus) plus a detailed analysis of the virus types and spam rules,
greylisting activity and database size, tcp ports for each mail input
and relay stage, SMTPS SSL certificate expiry, active users, active
connections, time taken for a test email to pass through the system,
number of mail processes (postfix, amavis, gld), virus scanner daemons,
mail cluster activity, performance, and failover.

Plus most of this is also graphed where possible.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]



Re: [Nagios-users] Request new functionality: "Off Hours" state.

2007-02-15 Thread Steve Shipway
> > What we need / would like is a per-service and per-host 
> configuration
> > option that allows a host or service to enter an "Off 
> hours" state in
> > the CGI displays. (Or perhaps there should also be a global 
> option for this?)

The way we do this is via a small modification to the status.cgi script,
so that there is an extra filter option which only displays things
currently within their alerting window (it's an extra flag in the
bitfield options to serviceprops and hostprops).  So, with this flag
set, the screen will only show items which are currently able to send
out alerts.  
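
The idea of an extra flag in a bitfield filter can be shown with a short Python sketch. The flag values and names below are purely illustrative and are not the ones in the Nagios CGI headers or in our modified status.c:

```python
# Illustrative flag values only; the real ones live in the Nagios CGI headers.
PROP_IN_DOWNTIME = 1 << 0
PROP_ACKNOWLEDGED = 1 << 1
PROP_IN_ALERT_WINDOW = 1 << 2   # the hypothetical extra flag


def service_props(in_downtime, acknowledged, in_alert_window):
    """Pack a service's boolean properties into one integer bitfield."""
    props = 0
    if in_downtime:
        props |= PROP_IN_DOWNTIME
    if acknowledged:
        props |= PROP_ACKNOWLEDGED
    if in_alert_window:
        props |= PROP_IN_ALERT_WINDOW
    return props


def matches(filter_mask, props):
    """True when the service has every property requested by the filter."""
    return (props & filter_mask) == filter_mask


services = {
    "backup job": service_props(False, False, in_alert_window=False),
    "web front end": service_props(False, True, in_alert_window=True),
}
visible = [name for name, props in services.items()
           if matches(PROP_IN_ALERT_WINDOW, props)]
print(visible)
```

Because the filter is just another bit, it combines with the existing property filters in a single URL parameter without any new CGI option.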

Works well for us - I can send a copy of the modified status.c if you
want, but we're using nagios 1.4...

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]




Re: [Nagios-users] NsClient 2.0.1

2007-01-16 Thread Steve Shipway
My only guess, then, is that you have different encryption/encoding
settings between the two.  Strange that this would only affect the CPU
check, though.  Since I've never encountered any problems with nsclient
I've never needed to look into debugging...
 
My only remaining suggestion is to try installing nsclient++ or nc_net
and see if they work; if it still fails, then the problem lies with
your check_nt call.  Double check that you are really passing meaningful
parameters to check_nt.
 
Steve

--
Steve Shipway
ITSS, University of Auckland
(09) 3737 599 x 86487
[EMAIL PROTECTED]



 

I did not modify the password.

I am checking correctly (on the same host) memory and disk
spaces.

 

On other hosts I can check correctly also the CPULOAD.

 

Marco

 







The client is sending None&2&5 which is correct for a 5-min CPU
average request with the default password.

 

If you have modified the password on the pnsclient agent then
this may be the cause (this is done in the registry).  Alternatively
check your check_nt program as it may be more recent and not compatible?
We use nagios-plugins 1.4 and this works with pnsclient2.0.1 correctly.

 

Steve

 



I installed NsClient (2.0.1) on a Windows 2000 server.
It is working fine, but if I want to check CPULOAD I will read following
error message:

 

NSClient - ERROR:Malformed request or internal error.
Check EventLog:None&2&5

 

Nagios server is a 1.4 version

 

Have you got any idea?

 

Thanks !

Marco

 


Re: [Nagios-users] NsClient 2.0.1

2007-01-15 Thread Steve Shipway
The client is sending None&2&5 which is correct for a 5-min CPU average
request with the default password.
 
If you have modified the password on the pnsclient agent then this may
be the cause (this is done in the registry).  Alternatively check your
check_nt program as it may be more recent and not compatible?  We use
nagios-plugins 1.4 and this works with pnsclient2.0.1 correctly.
 
Steve
 

--
Steve Shipway
ITSS, University of Auckland
(09) 3737 599 x 86487
[EMAIL PROTECTED]



 




From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Marco
Borsani
Sent: Tuesday, 16 January 2007 3:39 a.m.
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] NsClient 2.0.1



Hi all .

 

I installed NsClient (2.0.1) on a Windows 2000 server. It is
working fine, but if I want to check CPULOAD I will read following error
message:

 

NSClient - ERROR:Malformed request or internal error. Check
EventLog:None&2&5

 

Nagios server is a 1.4 version

 

Have you got any idea?

 

Thanks !

Marco

 


Re: [Nagios-users] Nagios and Graphs

2007-01-04 Thread Steve Shipway
> I am looking for some kind of graphing software to show ping times,

Smokeping is always good if you need a lot of ping time graphs.  Looks
very pretty, lots of detail.  It means a separate web interface of
course.

If you're already using MRTG for other graphing, then mrtg-ping-probe
can provide data to MRTG which can subsequently be displayed in many
forms, including as a floating bar if you're using routers2 as the MRTG
frontend (but that's outside the scope of this list).

I wouldn't suggest doing it via nagios (eg with Nagiosgraph) because a
ping is more than just one scalar, but that is a possibility as well.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]


-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] URGENT REPOST: CHECK_NRPE: Received 0 bytes fromdaemon ErrorMessage

2006-12-19 Thread Steve Shipway
I would say that you need to use inetd, OR an init.d script, but not
both.  If you try to use both, you'll get errors when they try to start up.
 
Your 'unable to read output' error seems to indicate that you have a
timeout.  I don't know what plugin you are attempting to run remotely,
but check to make sure that the timeout on check_nrpe (A), on the
nrpe.cfg file (B) and for the remotely run plugin (C) are set as C-

Re: [Nagios-users] URGENT REPOST: CHECK_NRPE: Received 0 bytes fromdaemon ErrorMessage

2006-12-18 Thread Steve Shipway
Sounds like the daemon is comparing the client's IP against the list of
permitted connections, and not getting a match.  For some reason the
query of the source IP on the connection is returning 0.0.0.0 instead of
the source IP - maybe you have some special wrapper or intervening
agent.  I'd suggest running under xinetd or something similar which can
do the filtering instead.
 
Steve

Re: [Nagios-users] check_nt and MSSQL$SQLEXPRESS

2006-11-30 Thread Steve Shipway
> define command{
>     command_name    check_nt_services
>     command_line    $USER1$/check_nt -H $HOSTADDRESS$ -v SERVICESTATE -l $ARG1$
> }

Put single quotes around the argument, thus:

  command_line    $USER1$/check_nt -H $HOSTADDRESS$ -v SERVICESTATE -l '$ARG1$'

Nagios will still expand $ARG1$, but the shell will not subsequently try
to expand the $ in the argument.
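
The effect of the quotes can be seen outside Nagios with a small sketch
like the one below.  It assumes a POSIX /bin/sh is available and uses
the service name from the subject line as the argument; the helper name
shell_echo is made up for the illustration.

```python
import subprocess


def shell_echo(arg_text):
    """Run `echo <arg_text>` through /bin/sh and return what the shell printed."""
    result = subprocess.run(
        ["/bin/sh", "-c", "echo " + arg_text],
        capture_output=True,
        text=True,
        check=True,
        env={},  # empty environment, so $SQLEXPRESS is certainly unset
    )
    return result.stdout.rstrip("\n")


# Unquoted: the shell treats $SQLEXPRESS as an (empty) variable.
print(shell_echo("MSSQL$SQLEXPRESS"))
# Single-quoted: the shell passes the text through untouched.
print(shell_echo("'MSSQL$SQLEXPRESS'"))
```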

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]



Re: [Nagios-users] Nagios cluster

2006-11-21 Thread Steve Shipway
> > Unless, of course, there exists an external command to change the
> > global Notifications Enabled flag - but I don't think there is
...
> Yep --
> http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=8

I stand corrected.  I hadn't spotted this one...  thanks for the tip.

In this case, I agree that it is better to make the change via the
external command rather than by modifying nagios.cfg and restarting.

Anyway, regardless of the method used to achieve it, I think this is the
way to achieve what the original poster wanted (use a special
eventhandler monitoring the live Nagios to enable/disable notifications
on the standby Nagios)
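
As a rough sketch of what such an event handler could do on the standby
host: Nagios external commands are single lines of the form
"[epoch] COMMAND" written to the command file, and ENABLE_NOTIFICATIONS
and DISABLE_NOTIFICATIONS are the global switches.  The function names
and the command file path in the usage note are assumptions; check
command_file in your own nagios.cfg.

```python
import time


def external_command(name, now=None):
    """Format one line in the external command file syntax: [epoch] COMMAND."""
    stamp = int(time.time() if now is None else now)
    return "[%d] %s\n" % (stamp, name)


def set_notifications(enabled, command_file, now=None):
    """Write ENABLE_NOTIFICATIONS or DISABLE_NOTIFICATIONS to the command file."""
    name = "ENABLE_NOTIFICATIONS" if enabled else "DISABLE_NOTIFICATIONS"
    # The command file is normally a named pipe that the Nagios daemon reads.
    with open(command_file, "w") as handle:
        handle.write(external_command(name, now))
```

An event handler on the standby could then call, for example,
set_notifications(True, "/usr/local/nagios/var/rw/nagios.cmd") when the
live Nagios goes down, and set_notifications(False, ...) when it
recovers.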

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]



Re: [Nagios-users] timeouts when using secondary dns

2006-11-09 Thread Steve Shipway



We dealt with this by installing a local caching-only nameserver on the
Nagios host itself.  This also took a lot of the load off of the main
nameservers.  So, resolv.conf was set to use 127.0.0.1 by default and
have our normal name servers as secondaries.  A nice side effect was
that it vastly sped up the name resolution.

Steve

--
Steve Shipway
ITSS, University of Auckland
(09) 3737 599 x 86487
[EMAIL PROTECTED]
 

  
  
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of stucky
Sent: Friday, 10 November 2006 6:57 a.m.
To: Az
Cc: nagios
Subject: Re: [Nagios-users] timeouts when using secondary dns

Yey !! That totally did it. Thx AZ I hadn't even considered messing
with the resolver cuz I was sure it was a nagios issue so I had to fix
nagios.
If that wasn't a text book example of how well mailinglists can work
then I don't know what is... thx

On 11/7/06, Az <[EMAIL PROTECTED]> wrote:
> stucky wrote:
> > I use the check_by_ssh plugin for most of my stuff and I noticed that
> > if the primary nameserver is unavailable nagios starts freaking out.
> > All of a sudden all plugins time out. I tested it using the 'host'
> > command and it only takes about 1 second longer to lookup hosts using
> > the secondary nameserver.
> > The default timeout for check_by_ssh is 10 seconds. I cranked it up to
> > 30 and still I get timeouts. I'm not sure I understand that one.
> > Has anyone else seen this.
>
> We had a similar issue in that our primary DNS was doing strange things,
> and it quite often took 5 or even 10 seconds to perform a DNS lookup.
> What we were seeing was 70% of service checks (and subsequently host
> checks) failing by timing out. The key was the multiple of 5 seconds.
>
> The resolver timeout on, say, RHEL3 is based on RES_TIMEOUT in
> resolv.h... which was 5 seconds.
>
> We added the following to our resolv.conf , and found the problems went
> away:
>
> options timeout:2 rotate
>
> This sets the timeout for waiting for a reply to 2 seconds, and tells
> the resolve to rotate through your 'nameserver' entries rather than
> always hitting #1, then #2, etc.
>
> Cheers.

-- 
stucky

Re: [Nagios-users] Esx checks

2006-11-08 Thread Steve Shipway

Derek Balling [mailto:[EMAIL PROTECTED] wrote:
> On Nov 7, 2006, at 4:45 PM, Steve Shipway wrote:
> > As far as I know, there is no way to find out the IP address of an ESX
> > guest OS without connecting to it (eg by logging in to the virtual
...
> That can't be true. VirtualCenter routinely tells me what IP address  
> a guest VM is using.
> 
> Maybe there's some VC nonsense that happens, but it seems like you  
> might be able to get at that info using the ESX Perl API or something?

OK, looking in there, Virtual Centre does indeed state the IP address of
the guest.
How on earth does it find that out??  I assume it must be getting sent
from the vmware tools daemon running on the guest (which implies that
vmware tools must be running to know it).  However, I cannot find this
information under the /proc/vmware tree on the ESX server (we have v2.x,
it may be different on 3.x) so it's not clear where it is kept and if it
is accessible to us.

On a separate note, while Easter-egging for where it stores the IP
address info, I've found where it keeps the per-VM UserRPC counts (a
good indicator of whether your VM is making too many system calls, which
would make it a bad virtualisation candidate) so I can incorporate this
into check_esx v2.6.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 



Re: [Nagios-users] Esx checks

2006-11-07 Thread Steve Shipway
> Whenever I am getting info from an ESX server 
> I can use this info to get the following information:
> 
> VHosts: 10/10 up: 
>   w2kbi70(up), 
... 
> Is there some way to get snmp info from those machines so
> I can parse a list of hosts to some other services like the following?

As far as I know, there is no way to find out the IP address of an ESX
guest OS without connecting to it (eg by logging in to the virtual
console), and therefore no way to SNMP query it.  The ESX server does
not actually know the guest's IP address(es), only the virtual switch to
which it has virtual network ports mapped.

The check_esx2 plugin (you seem to not have the latest version, by the
way) just uses SNMP to the ESX service console to check on guests and
identifies the vhosts by the ESX internal VMID.

What I do here is use a homegrown program that does matching between
known hostnames and the VMWare guest name, and (because we usually
follow a strict naming standard) I can identify matches.  I still need
to know the hostname, though. 

I'd suggest you make sure that the first token in the ESX Guest
description is the host's FQDN (possibly with your site's domain cut off
for brevity) and then you'll be able to use DNS to identify it and SNMP
query it directly.
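
A minimal sketch of that convention follows.  It assumes the guest
description is available as a plain string and that a name without any
dot has had the site domain cut off; the function name and the
example.org domain are made up for the illustration.

```python
def guest_fqdn(description, default_domain):
    """Take the first token of a guest description and complete it to an FQDN."""
    tokens = description.split()
    if not tokens:
        return None  # empty description: nothing to look up
    name = tokens[0].lower().rstrip(".")
    if "." not in name:
        # Assumption: a bare hostname means the site domain was omitted.
        name = name + "." + default_domain
    return name


print(guest_fqdn("w2kbi70 BI reporting server", "example.org"))
```

The returned name can then be handed to the resolver (for example
socket.gethostbyname) to obtain the address for a direct SNMP query.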

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 



Re: [Nagios-users] No data was received from host!

2006-11-05 Thread Steve Shipway



This nicely explains why my mrtg-pnsclient script (which allows MRTG to
directly query pnsclient, NC_Net etc) has occasional problems when
querying two variables at once, and the second query fails.  It is
interesting to note that the original pnsclient doesn't have this
limitation :)

If you need to do several checks on eventlogs, you might be better off
using the NagEventLog eventlog agent instead (available from
nagiosexchange.org) to parse the logs and send passive alerts to Nagios.

Steve

--
Steve Shipway
ITSS, University of Auckland
(09) 3737 599 x 86487
[EMAIL PROTECTED]
 

  
  
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Anthony Montibello
Sent: Monday, 6 November 2006 1:03 p.m.
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] No data was received from host!

NC_Net ver 2.28 can only perform one check at a time.  Thus if some of
these checks take a long time to process, like the implementation of
eventlog checks in that version of NC_Net, then the checks may get
backed up, causing several checks to get the "No data was received from
host" error.

The source of this may be the Windows host being very bogged down.
A second source may be one of the checks periodically taking longer to
perform.

A possible solution may be to increase the timeout in the check_nt
command.  This may allow it to catch up quicker when the backup does
occur.

A second solution is to set up some of the checks to be performed as
passive checks using NSCA.  This will eliminate that issue altogether.

Good Luck,
Tony (creator of NC_Net)

On 11/2/06, Folkers, Lynn <[EMAIL PROTECTED]> wrote:
> I am running Nagios 2.0 on Redhat EL3.  I am running NC_NET version
> 2.28 on the Windows clients.  I monitor 222 systems with 1324 service
> checks.  The problem I am having is that on about 12 Windows clients I
> get the following error during checks "WARNING 11-02-2006 10:28:31 9d
> 6h 44m 37s 5/5 No data was received from host!".  This does eventually
> clear up but continually comes back.  This does not happen on all the
> Windows clients.  They are all running the same NC_NET 2.28 version.
> Has anyone else seen this?
> Thanks.
 

Re: [Nagios-users] Advanced permissions/user properties

2006-11-05 Thread Steve Shipway
Alex Burger wrote:
> Leave the groups as they are, but modify the host and service 
> contact_groups command?  For example:
>  define host{
>   host_name   localhost
>   contact_groups  netops:rw, helpdesk:r
> }
> 
> For backwards compatibility, if no permissions are set, the defaults 
> would be rw so the following would be the same:
> 
> define host{
>   host_name   localhost
>   contact_groups  netops, helpdesk:r
> }
> 
> If a user was in both the netops and helpdesk group, the user should 
> have rw access.

This is exactly what we need, and the best way I have seen (so far) to
implement it.  In particular it has backwards compatibility and  does
not involve an additional directive.  Much simpler to do it via groups
rather than at a per-user level, and much more maintainable.

This is the way I'd like to see it implemented in the official tree,
maybe with a nagios.cfg option to set the default to :rw or :r.  Once
you've fully implemented things you could then change the default to
:r, since it is best to default to the lower level.
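
To make the proposed semantics concrete, here is a small sketch of how
such a contact_groups value could be interpreted.  This syntax is only
the proposal from this thread, not an existing Nagios feature, and the
function names are invented for the example.

```python
def parse_contact_groups(value, default="rw"):
    """Parse 'netops:rw, helpdesk:r' into {'netops': 'rw', 'helpdesk': 'r'}."""
    groups = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, perm = item.partition(":")
        # No ':perm' suffix means the configured default applies.
        groups[name.strip()] = perm.strip() if sep else default
    return groups


def access_for(user_groups, host_groups):
    """Return 'rw', 'r' or None: the highest permission among the user's groups."""
    perms = [host_groups[g] for g in user_groups if g in host_groups]
    if "rw" in perms:
        return "rw"
    if "r" in perms:
        return "r"
    return None


host = parse_contact_groups("netops, helpdesk:r")
print(host)
print(access_for({"netops", "helpdesk"}, host))
```

Passing default="r" to parse_contact_groups models the suggested
nagios.cfg switch to the lower default level.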

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 



Re: [Nagios-users] Open Relay Check Plugin?

2006-10-29 Thread Steve Shipway
> Ive looked in nagios exchange and couldnt find a plugin that would
> check for an open relay and wondered if anyone knew if one existed. Ive

I've written a plugin (check_rbl) that checks to see if your mail server
is listed on any of the internet blacklists (Spamhaus, SpamCop, ORBS,
etc.) and alerts you if this happens.

Of course, this is not quite what you were asking for, but (once you fix
your mail server) it lets you know if you ever get incorrectly blocked.
We check it every few hours.
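
For readers unfamiliar with how such lists are queried, the general
DNSBL convention is to reverse the IPv4 octets, append the list's zone,
and see whether the name resolves.  The sketch below shows that
convention only; it is not the check_rbl source, and the zone in the
example is just one well-known list.

```python
import socket


def dnsbl_query_name(ip_address, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    octets = ip_address.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip_address, zone):
    """True if the zone returns an A record for the address (i.e. it is listed)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip_address, zone))
    except OSError:
        # Resolution failure (NXDOMAIN and the like) is read as "not listed".
        return False
    return True


print(dnsbl_query_name("192.0.2.10", "zen.spamhaus.org"))
```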

Available from nagiosexchange.org.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 



Re: [Nagios-users] how to make check_snmp timeout CRITICAL

2006-10-23 Thread Steve Shipway



I achieve this by using the check_snmp options to test for a known
value in the response string (eg: '.').  This way, I can alert if anyone
stops the SNMP daemon or changes the community strings.
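
The logic amounts to "OK only if the reply contains what I expect,
otherwise CRITICAL".  check_snmp has -s (exact string) and -r (regex)
options for this; the sketch below merely illustrates the same decision
in a wrapper-style function, with an invented name and a sample
sysObjectID value.

```python
import re

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def snmp_reply_state(value, pattern=r"\."):
    """Return OK only when the SNMP reply value contains the expected pattern."""
    if value and re.search(pattern, value):
        return OK
    return CRITICAL


# A sysObjectID always contains dots, so any real reply matches.
print(snmp_reply_state(".1.3.6.1.4.1.9.1.122"))
# An empty reply (timeout, wrong community, stopped daemon) does not.
print(snmp_reply_state(""))
```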
 
Steve

--
Steve Shipway
ITSS, University of Auckland
(09) 3737 599 x 86487
[EMAIL PROTECTED]

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pete Siemsen
Sent: Tuesday, 24 October 2006 1:25 p.m.
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] how to make check_snmp timeout CRITICAL

I have a host that won't do ping but will do SNMP, so I want Nagios to
check this host's reachability using SNMP.  I don't care what SNMP value
I get, only that the host replies to SNMP.  I tried using check_snmp to
get the sysObjectID, and it works fine.  When the machine is
unreachable, check_snmp returns "SNMP problem - No data received from
host", which is fine, but Nagios still shows the host as "ok".  I assume
check_snmp is returning a status of OK or something, even though the
SNMP request totally failed.  What can I do?

-- Pete
  

Re: [Nagios-users] Testing host notification recipients

2006-10-23 Thread Steve Shipway
Steve queried:
> > Hi, does anyone know a method (or a utility, etc) with which I can
> > test who will receive notifications of a particular host/service at
> > a given time?

Patrick replied:
> I don't recall if you can use the web interface in 1.4 to submit a
> passive service check, but that works in 2.0 to simulate a failure.

This is true, however it would cause email alerts to go out to all the
contacts and I'd have to then ask each person if they received one (I
can't just stop the email delivery as other things are going on).  Also,
notification rules change over time periods - I might want to know what
would happen out of hours without having to wait until after 6pm.  What
I really need is something to read the config files, untangle them, and
then give a response based on options passed.
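
The core of such a tool would be a time-period test per contact.  The
sketch below uses hand-written dictionaries as hypothetical stand-ins
for parsed timeperiod and contact definitions; it ignores everything
else Nagios considers (host/service notification periods, notification
options, escalations), so it only shows the shape of the check.

```python
from datetime import datetime, time

# Hypothetical, simplified stand-ins for parsed Nagios objects.
# Keys are weekdays as returned by datetime.weekday() (Monday == 0).
TIMEPERIODS = {
    "workhours": {day: [(time(9, 0), time(18, 0))] for day in range(5)},
    "24x7": {day: [(time(0, 0), time(23, 59, 59))] for day in range(7)},
}
CONTACTS = {
    "dayops": "workhours",
    "nightops": "24x7",
}


def in_period(period_name, when):
    """True if `when` falls inside one of the period's ranges for that weekday."""
    ranges = TIMEPERIODS[period_name].get(when.weekday(), [])
    return any(start <= when.time() <= end for start, end in ranges)


def recipients(contact_names, when):
    """Return the contacts whose notification period covers `when`, sorted."""
    return sorted(c for c in contact_names if in_period(CONTACTS[c], when))


# Who would be notified at 7pm on a Monday, without waiting until then?
print(recipients(["dayops", "nightops"], datetime(2006, 10, 23, 19, 0)))
```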

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]



[Nagios-users] Testing host notification recipients

2006-10-23 Thread Steve Shipway
Hi, does anyone know a method (or a utility, etc) with which I can test
who will receive notifications of a particular host/service at a given
time?

We have a rather large number of hosts, hostgroups, contactgroups, and
so on with a number of multiple time periods and it's getting somewhat
confusing.  I have one situation where I think the user should have
received an alert, but the Nagios log says no alert was sent out
(although other users WERE alerted).  I need some way to check Nagios'
logic, and either explain why no alert was sent or else point a finger
at a Nagios bug...

Using Nagios 1.4 at the moment.  Can't go to 2.x until next year.

Thanks in advance

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]




Re: [Nagios-users] Monitoring VMWare

2006-10-19 Thread Steve Shipway

> Has anyone had any success in monitoring VMWare virtual servers. I can
> monitor the ESX but how do we get details of the virtual servers that
> are hosted on ESX so that data bears some resemblance to reality. If we
> monitor from the virtual servers then obviously things like CPU
> readings are useless.

Yes, we are monitoring a lot of ESX servers, using the check_esx2 plugin
(written by myself, so I admit to some bias).  If you go to VMWorld06 in
November in LA, you should be able to attend a workshop showing how to
implement this.

For the benefit of other readers, it is true that CPU and any other
time-based stats acquired from the VM are useless.  This is because of
the way VMWare affects the virtual server clock and the way the sharing
is managed.  Not only that, but collecting these stats from the VM will
noticeably impact performance because it causes kernel calls.  The best
way to gain these stats is either from SNMP on the Service Console
(through the VMware agentx plugin for snmp), or from the /proc/vmware
tree on the Service Console (this is how the vmkusage addin for
vmware does it).

You can obtain check_esx2 (v2.4) from www.nagiosexchange.org.  This will
allow you to alert on various things, including excessive ReadyTime (the
key CPU usage indicator), total CPU usage, and also on Memory active and
balloon/vmswap activity.  The check_esx2 script can also run in MRTG
mode in order to allow it to be used for MRTG graphing.

There is also an additional script which will generate RRDTool files and
dummy MRTG .cfg files that can be passed to a MRTG frontend such as
routers2 to display numerous summary graphs, but this is out of scope
for this mailing list.

See www.steveshipway.org/forum for the discussion forum for VMWare
monitoring using these tools.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]




Re: [Nagios-users] custom fields

2006-10-15 Thread Steve Shipway
Az wrote:
> Kyle Vorster wrote:
> > What i am trying to do is give a more detailed email notification
> > to my clients.
> You can define your notification commands
> (http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#command) with
> whatever you want in them using the relevant macros

I think the idea was to get a more detailed description on a
*per-service* basis.  This was our problem, also.

The way I managed to get around it was to create a new program -
'buster'.

This has two parts - one, a web frontend that stores messages in a mysql
database, and two, an extraction utility that, when given hostname and
servicename, will spit out a message deduced from the database.  This
utility is embedded in the notify command.
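
The extraction step can be pictured roughly as below.  This is not the
buster code (which is Perl backed by mysql); it is a minimal sketch that
uses an in-memory rule list with shell-style wildcards, and the rules,
messages and function name are all invented for the illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (host pattern, service pattern, message),
# ordered from most specific to least specific.
RULES = [
    ("mail*", "SMTP", "Call the mail team on-call."),
    ("*", "Disk*", "Log a ticket; no call-out needed overnight."),
    ("*", "*", "Follow the standard escalation list."),
]


def lookup_message(host, service, rules=RULES):
    """Return the message from the first rule matching both host and service."""
    for host_pat, service_pat, message in rules:
        if fnmatchcase(host, host_pat) and fnmatchcase(service, service_pat):
            return message
    return None


print(lookup_message("mail01", "SMTP"))
print(lookup_message("web01", "Disk /var"))
```

A notify command could pass $HOSTNAME$ and $SERVICEDESC$ to a utility
like this and append its output to the alert text.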

The web frontend unfortunately doesn't have any extra security in it, so
anyone can edit any of the match templates.  Also, the extracted
messages are fixed text and don't have embedded replaceable symbols; but
they can match on host, hostgroup, servicename, and status, and can use
wildcards.

I haven't posted to Nagiosexchange because it would require a bit more
work to be generic enough for everyone else to use it easily; however if
you're perl-savvy email me and you can have a copy.

The name 'buster' comes from 'ghostbuster', since it is supposed to use
this extra text to tell the night ops who they're gonna call...

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 



Re: [Nagios-users] custom fields

2006-10-15 Thread Steve Shipway



> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Az
> Sent: Sunday, 15 October 2006 9:34 a.m.
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] custom fields
> 
> Kyle Vorster wrote:
> > What i am trying to do is give a more detailed email notification
> > to my clients.
> You can define your notification commands
> (http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#command) with
> whatever you want in them using the relevant macros
> (http://nagios.sourceforge.net/docs/2_0/macros.html#arg) as required.
>
> An out-of-the-box example...
>
> # 'notify-by-email' command definition
> define command{
>     command_name    notify-by-email
>     command_line    /usr/bin/printf "%b" "* Nagios  *\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $DATETIME$\n\nAdditional Info:\n\n$OUTPUT$" | /bin/mail -s "** $NOTIFICATIONTYPE$ alert - $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
> }
> 
> 
> 



Re: [Nagios-users] check_mailq

2006-10-08 Thread Steve Shipway
> I'm guessing that this plugin is designed to run on the mail server
> itself which is running nagios /and/ qmail?

Yes.

> Checking the mail queue remotely is maybe not possible?  
> Maybe I'm just missing something?

Run it remotely via NRPE.  That's what we do.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 



Re: [Nagios-users] Monitoring IIS server...

2006-10-04 Thread Steve Shipway

> > Management had two choices -
> > 1) Fix the configuration of the IIS server
> > 2) Stop monitoring it altogether
> > 
> > Which do you think they chose?  Sigh.
> > 
> 
> Since they're management, they actually had more choices:
> 3) Buy a new web-server and a loadbalancer, clone the old configuration
> and use the loadbalancer to keep session-id's from being eaten.
> 4) Buy more RAM for the old server.
> 5) Buy a new, beefier, server.

You must be psychic.  Actually, they'd already done (3) and (4) before
they decided on doing (2).  Doing (1) would have been too easy.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 



Re: [Nagios-users] Monitoring IIS server...

2006-10-02 Thread Steve Shipway
> Has anyone seen as to where using check_http to monitor the 
> IIS server causes serious issues?
> I figure since all it's doing is  mapping port 80 for livelyihood, it 
> should be okay. (At this point, I don't need all the SNMP 
> information, just want to know that the service is open.)

We had an issue with an IIS server, where the site would allocate
session ID cookies when you connected that would time out after far too
long.  Since we'd be connecting every 5 mins or more frequently for
checks, this resulted in session IDs being eaten and the server running
low on resources.

I agree this is a hosed server with a stupid configuration set up by
brain-dead people with no more knowledge than an MCSE, but sadly that's
what you get sometimes.

Management had two choices -
1) Fix the configuration of the IIS server
2) Stop monitoring it altogether

Which do you think they chose?  Sigh.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]




Re: [Nagios-users] Nagios On VMWare machines

2006-10-01 Thread Steve Shipway



We use Nagios to monitor the service console (via check_esx2, from
nagiosexchange) and the standard host MIB checks.  Then we use NRPE or
pNSclient on the guests to monitor them (but not CPU, memory or network
IO).  check_esx2 takes care of CPU use, readytime, and memory use
checks.  Works fine here.

Steve

--
Steve Shipway
ITSS, University of Auckland
(09) 3737 599 x 86487
[EMAIL PROTECTED]
 

  
  
  From: 
  [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of Thales 
  Maia ChagasSent: Saturday, 30 September 2006 9:47 
  a.m.To: nagios-users@lists.sourceforge.netSubject: 
  [Nagios-users] Nagios On VMWare machines
  
  
  Does  anybody running nagios 
  on VMWare machines? Experiencing any 
  problem?
   

Re: [Nagios-users] how to loop event-handlers

2006-09-28 Thread Steve Shipway
> >> It is possible to call event handlers infinitely by 
> nagios? I mean if 

The event handler is called by nagios on state change (including count
of checks until hard state) but no other times.  So, if you want
something to be continually called then you might need to do it
separately.

You might like to try playing with is_volatile=1 and
stalking_options=c to see whether this results in a call to the event
handler on every critical check rather than just on state changes.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]

 




Re: [Nagios-users] Monitor Windows Program

2006-09-10 Thread Steve Shipway


> How can I use Nagios to monitor a windows program that does 
> not run as a
> service. I can use snmp plugin to check on services but if it's a
> program that starts from an executable, how can I tell if its 
> running or not

If you are using the pNSclient agent (as we are) then this includes a
test for a named executable running, as well as a test for a particular
service name.  I believe NC_Net and nsclient++ also both have this
functionality.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]



Re: [Nagios-users] Service Checks in Distributed mode

2006-08-21 Thread Steve Shipway
> My 2 current nagios servers are running on Dell 1750's. Each 
> has a 2.4GHz Xeon Processor and 2 Gigs of Ram. What type of 
> specs would be needed if I were to add a central server that 
> only deals with passive checks?  I am pretty sure I could 
> come up with a smaller server; would it be better to use that 
> for the centralized server, or move one the 1750's into that role?

We're running a single twin 2.4GHz Xeon server with 2GB of memory and
it's happily handling 3450 active checks and 133 passive checks.  I
suspect we could get up to 6000 at least before it started to struggle.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]




Re: [Nagios-users] check_http can't follow redirect properly, no cookies!

2006-08-17 Thread Steve Shipway
> > Check out the WebInject Nagios plugin.  I have it setup to check 
> > several pages in a secure website and it handles cookies.   
> It could 
> > save you the hassle of creating your our script.

We use webinject here for precisely this reason, and it works well.  It
also supports MRTG format output if you need that.

I had also written a homegrown plugin that parses the web page to pull
out session IDs from URLs and do a bit more magic, but the latest
webinject seems to do everything itself now.

Steve



Re: [Nagios-users] New check script: check_nsca

2006-08-16 Thread Steve Shipway
Great idea, I never thought of it for some reason.  I've just added
check_tcp on port 5667 to make this test - we use NSCA for receiving
SNMP traps and messages from NagEventLog, so these are reset to 'ok' by
a freshness check rather than to 'unknown'.  Checking NSCA is alive is a
good idea.
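
For anyone without check_tcp to hand, the same "is the NSCA port answering" test can be sketched in a few lines of Python.  This is only a rough equivalent of a plain TCP connect, not the check_tcp plugin itself; the function name and default timeout are my own, and it only proves something is listening on the port, not that NSCA is processing results.

```python
import socket

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2

def check_tcp_port(host, port, timeout=10.0):
    """Return (exit_code, message) for a plain TCP connect test."""
    try:
        # A successful connect is all we test; nothing is sent to the daemon.
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} is accepting connections"
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures.
        return CRITICAL, f"TCP CRITICAL - {host}:{port}: {exc}"
```

A wrapper script would call check_tcp_port(nsca_host, 5667), print the message and exit with the returned code.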

Steve



Re: [Nagios-users] How to get stdout from event handlers into Nagios email messages?

2006-08-15 Thread Steve Shipway
>   Hi, I'm running Nagios 2.5 and my question is: is there 
> any way to capture the stdout of an event handler and get it 
> into the email Nagios sends out?

If you think about it, this must be impossible.

Since the event handler is called at the same time as the notifications,
the output (if any) is not available when the notifications are queued as
the event handler has not yet run.

The only way I can see to do this is to code an email sender into your
event handler script.

You could fudge it by creating a dummy service for the same host that
uses freshness to immediately reset to OK and has a count of 1, and then
have your event handler send a passive alert (containing the output
text) to this service.  The dummy service (which has the same contacts
as the main service) then sends out a separate alert to the contacts
with this information.  You could probably even create a generic
eventhandler wrapper script to do this.
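
A generic wrapper along those lines might look roughly like the sketch below.  It runs the real event handler, captures its stdout, and feeds one passive result for the dummy service to send_nsca, which reads tab-separated "host, service, return code, output" lines on stdin.  The function names, the choice of WARNING (1) as the state, and the way the send_nsca command line is passed in are all assumptions for illustration; nothing here is tested against a live Nagios.

```python
import subprocess

def nsca_line(host, service, return_code, output):
    """Format one passive service result in send_nsca's tab-separated input format."""
    # Fold multi-line handler output onto one line and drop tabs, since
    # tab and newline are the field and record separators.
    text = " / ".join(output.strip().splitlines()).replace("\t", " ")
    return f"{host}\t{service}\t{return_code}\t{text}\n"

def run_handler_and_report(handler_argv, host, dummy_service, send_nsca_argv, state=1):
    """Run the real event handler, then submit its stdout to the dummy service."""
    result = subprocess.run(handler_argv, capture_output=True, text=True)
    line = nsca_line(host, dummy_service, state, result.stdout)
    # send_nsca_argv is the site's own send_nsca command line (path, -H, -c ...).
    subprocess.run(send_nsca_argv, input=line, text=True, check=True)
    return line
```

The dummy service then needs max_check_attempts of 1 and a freshness check that resets it to OK, as described above, so that every submission produces a fresh notification.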

Steve



Re: [Nagios-users] Nagios redundancy

2006-08-07 Thread Steve Shipway

> How Can I have redundancy in my nagios server?

We do it by having two servers, twin-tailed onto an external mirrored
SCSI disk pack, and then using Linux-HA to do failover between them with
a floating virtual IP and so on.  Works very well although both servers
need to be in the same physical location.  They also both have a local
unpublished secondary DNS server so that it minimises the load on the
central DNS and also protects against DNS failure.

Steve



Re: [Nagios-users] Tracking dynamic parents?

2006-07-30 Thread Steve Shipway
> I am monitoring several hundred virtual machines and they 
> move from time to time to different VMware host machines.  My 
> question is how can I easily update the parents for these?

We have this situation as well.

The way I get around it is to reconfigure the scripts daily.  I have a
reconfigurator which regenerates the config files for both MRTG and
Nagios, probing the SNMP on the VMWare server to identify the various
locations of the guests.  This is then used to set parent definitions,
and also make summary graphs for MRTG.
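
The site tool itself can't be shared, but the parent-setting part reduces to something like this sketch.  It assumes the SNMP probe has already produced a mapping of guest name to current ESX host; the function name, the "generic-host" template and the example host names are hypothetical, and other required directives (alias, address and so on) are left out.

```python
def host_definitions(guest_to_esx, template="generic-host"):
    """Render Nagios host objects whose 'parents' follow the current VM placement."""
    blocks = []
    # Sort so that the generated file is stable between daily runs.
    for guest in sorted(guest_to_esx):
        blocks.append(
            "define host {\n"
            f"    use        {template}\n"
            f"    host_name  {guest}\n"
            f"    parents    {guest_to_esx[guest]}\n"
            "}\n"
        )
    return "\n".join(blocks)

# Example placement, as the probe might report it (names are made up).
print(host_definitions({"vm-mail": "esx2", "vm-web": "esx1"}))
```

Writing the result to a file included from nagios.cfg and reloading Nagios once a day is then enough to keep the parents in step with where the guests are running.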

At the same time, it probes for agents, checks for various services, and
dynamically reconfigures the monitoring for certain things (there's a
central config file that holds a list of what should be there, in
addition).

We don't VMotion the guests very often, anyway, and the check_esx2 plugin
will log current active VMs so it also keeps track of any changes.

Sadly I can't share a copy of the config tool as it is too specific to
our site.

Steve



Re: [Nagios-users] Why do I get these socket timeouts?!?!

2006-07-27 Thread Steve Shipway
 > > CHECK_NRPE: Socket timeout after 10 seconds.

This is the check_nrpe timing out.  There are 3 or 4 timeouts to check:

1) Service checks have a global timeout in nagios.cfg.  This is usually
   about 30sec.
2) check_nrpe has a timeout specified by -t.  This is usually 10sec.
3) The remote nrpe agent has a timeout, command_timeout in nrpe.cfg.
   This is usually about 20sec.
4) The plugin being run remotely may have its own timeout specified on
   the command line.

Normally, (1) should be the largest number. (3) should always be greater
than (4), and (2) should be greater than (4).  (3) should be larger than
the biggest (4) you have.  This will mean that you get timeout messages
from the plugins rather than from the Nagios server.

In your case, the timeout is coming from (2).  This would normally
indicate that your remote NRPE is having problems, but it may also mean
that your (2) timeout is smaller than your (3) or (4) timeouts.

On our server, (1) is set to 60sec, (2) is usually 20sec, (3) is 19sec
and (4) is 10sec.  This means that we receive more helpful timeout
messages, as we know at which point the problem is.
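
As a quick sanity check, the four values can be compared in a few lines of Python.  This sketch assumes the strict ordering (1) > (2) > (3) > (4), which is what the values on our server follow; the rules above only require (1) to be the largest and (2) and (3) to exceed (4).  The function and argument names are made up for the example.

```python
def timeout_problems(service_check, check_nrpe, nrpe_command, plugin):
    """Return a list of ordering problems among the four timeouts (seconds)."""
    layers = [
        ("(1) service_check_timeout in nagios.cfg", service_check),
        ("(2) check_nrpe -t", check_nrpe),
        ("(3) command_timeout in nrpe.cfg", nrpe_command),
        ("(4) plugin timeout", plugin),
    ]
    problems = []
    # Each layer should give up later than the layer it wraps.
    for (outer_name, outer), (inner_name, inner) in zip(layers, layers[1:]):
        if outer <= inner:
            problems.append(
                f"{outer_name} = {outer}s should exceed {inner_name} = {inner}s"
            )
    return problems

print(timeout_problems(60, 20, 19, 10))  # the values used on our server
print(timeout_problems(30, 10, 20, 10))  # the usual defaults listed above
```

With the usual defaults the only complaint is that check_nrpe gives up before the remote agent does, which is exactly the "Socket timeout after 10 seconds" case.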

Steve



Re: [Nagios-users] bind nagios server to a certain IP

2006-07-17 Thread Steve Shipway
> > > > Is it possible to get a nagios server to bind to an IP?
...
> > > > I want to get nagios to use one IP on a cluster so that 
> the other 
> > > > host
> > > > (active-passive) can take it over in the event of a failure.
...
> > I've set allowed_hosts in nrpe.conf - so the clients will 
> only accept 
> > checks from a certain IP.

We do this by just adding the IP addresses of every member of the
cluster, plus the cluster virtual IP, to the NRPE allowed_hosts lists
and switch access tables.  Otherwise it gets a bit awkward in having to
make your HA software (linux_HA?) change your default router to force
the virtual interface (and then change back on fail back).

Steve



Re: [Nagios-users] syslog-ng process monitor plugin

2006-07-13 Thread Steve Shipway



check_procs will only check for processes on the local machine.  Use the
-h option to see all the parameters you can give it to check for
different things.  You will probably want to use NRPE to run the
check_procs plugin on the remote server (this is how we do it here):

  check_nrpe -H 172.2.23.1 -c check_procs -a syslog-ng 1 1024

and then in the nrpe.cfg on the remote machine:

  command[check_procs]=/usr/local/nrpe/check_procs -w "1:$ARG2$" -c "1:$ARG3$" -C "$ARG1$"
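
To see what the remote end actually runs, the $ARGn$ substitution can be imitated with the short Python sketch below.  It is only an illustration of the macro expansion, not NRPE's own code, and the function name is made up.  Note also that, as far as I know, the agent only accepts arguments at all when dont_blame_nrpe=1 is set in nrpe.cfg and NRPE was built with command arguments enabled.

```python
import re

def expand_nrpe_args(command_line, args):
    """Substitute $ARG1$..$ARGn$ with the values passed via check_nrpe -a."""
    def replace(match):
        index = int(match.group(1)) - 1
        # Missing arguments expand to an empty string in this sketch.
        return args[index] if 0 <= index < len(args) else ""
    return re.sub(r"\$ARG(\d+)\$", replace, command_line)

definition = '/usr/local/nrpe/check_procs -w "1:$ARG2$" -c "1:$ARG3$" -C "$ARG1$"'
print(expand_nrpe_args(definition, ["syslog-ng", "1", "1024"]))
```

So the example above warns unless exactly one syslog-ng process is running, and goes critical if there are none or more than 1024.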
 
Steve

  
  
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Radhika
> Sent: Friday, 14 July 2006 4:49 a.m.
> To: Morris, Patrick; nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] syslog-ng process monitor plugin
>
> thanks for your mail. I have checked the check_procs plugin but it is
> not giving the option of monitoring remote host i may be wrong
>
>   ./check_procs -H 172.2.23.1 -w 1:1 -c 1:1024 -C syslog-ng
>
> I am getting error ./check_procs: invalid option -- H
>
> If i remove -H and ipaddress it is checking in local machine
>
> How to fix this now
>
> thanks


Re: [Nagios-users] Reporting Transactions

2006-07-13 Thread Steve Shipway
> Does anyone have any recommendations for using Nagios to 
> consume or plug in something that can test end user 
> experience and report transaction steps to Nagios? I would 
> need this to test drilling down into web sites and Win32 apps. 
> Kinda like a transaction testing solution like Rational Robot.

Short answer - yes.  Webinject.  It's a free tool with Nagios *and* MRTG
support, and we use it here for several sites.  It's a bit awkward to
configure but it supports cookies and session IDs.

We also wrote one on-site that worked slightly differently, but ended up
using webinject instead.

If you have lots of these->$ then you can also get some commercial tools
like SiteScope that can send alerts to Nagios via NSCA (can call an
external script that calls send_nsca).  SiteScope is easier to
configure, but costs a lot and only runs under Windows.

Steve




[Nagios-users] Filtering display of current status in CGI

2006-07-05 Thread Steve Shipway
Hi --

I need to get a display of the current problems to go up on the Operations'
status screen.  So far, I can do this with:

status.cgi?hostgroup=x-production&style=detail&servicestatustypes=248&serviceprops=8202&hostprops=8202&noheader&sorttype=2&sortoption=3

which limits display to just the 'production' hostgroup (IE, none of the
test servers), selects just the unknown/warn/crit services, and filters out
anything which has alerts disabled or is in scheduled downtime.  Finally, it
is sorted in descending severity order.
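
If the URL needs to be generated for several hostgroups, it can be assembled as in this Python sketch.  The parameter names and values are exactly the ones in the URL above; the function names are made up, and set_bits() only shows which bits make up each filter mask (the meaning of each bit is defined in the CGI headers of your Nagios version, so check there before changing them).

```python
def status_url(hostgroup="x-production", base="status.cgi"):
    """Build the problems-only status.cgi URL used for the operations screen."""
    params = [
        ("hostgroup", hostgroup),        # production hosts only
        ("style", "detail"),
        ("servicestatustypes", "248"),   # bitmask: unknown/warn/crit services
        ("serviceprops", "8202"),        # bitmask: alerts enabled, not in downtime
        ("hostprops", "8202"),
        ("noheader", None),              # bare flag, no value
        ("sorttype", "2"),
        ("sortoption", "3"),             # together: descending severity
    ]
    query = "&".join(k if v is None else f"{k}={v}" for k, v in params)
    return f"{base}?{query}"

def set_bits(mask):
    """List the individual flag values that are OR-ed together in a filter mask."""
    return [1 << i for i in range(mask.bit_length()) if mask & (1 << i)]

print(status_url())
print(set_bits(8202), set_bits(248))
```

The hostgroup is the only argument, since that is the part most likely to differ between status screens.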

However, there are a couple of problems!

Firstly, if a host is down, then it can have up to 20 services in various
states - all I want to know is that the host is down.  So, I've modified the
status.cgi slightly to just display the host status if the host is down or
unreachable (fairly easy to do).  At the same time, I've made the 'Attempt'
column go dark if it is in 'hard' state and faint otherwise.

Secondly, I'd like to hide alerts for things which are outside their
notification period, in the same way as hiding things with alerts disabled.
The operators will see an alert on the screen, not realise that it is
outside of the notify period for this service, and call the on-call
person.  This is a bit harder and I haven't done it yet, although I notice
the required test functions are in the library.

What I'd like is for Nagios 2.x to incorporate both of these.  Maybe an
extra filter option to hostprops and serviceprops to say 'only if in valid
notify time'?  If Ethan is reading this, that would be on my wishlist :).
Does anyone have a way of achieving this already?

I've made most of these changes (plus changing to Euro date formats
throughout, and changing the defaults in cmd.cgi) for Nagios 1.4, but now I
have the big job of moving my patches to Nagios 2.x.  Having a site-specific
customisation is a pain anyway, it would be better to have these few patches
added to the dev tree.  Any Nagios development people about?

Steve





Re: [Nagios-users] Check Exchange Queues

2006-07-02 Thread Steve Shipway



Sometimes, Windows servers have problems with service order.  Make sure
you have started Exchange before you start perfmon, so that perfmon can
verify the counters from Exchange.  Or is it the other way around?  We
have a problem with SQL Server where if you restart the (SQL) service
all the perf counters disappear.  This is a Microsoft... 'feature'...

Steve

> Is there any configuration I need to make on the Exchange Server?
> (other than installing NSClient)
>
> I have setup this up on the nagios box but I am just getting:
>
>   0 messages queued Performance Data: '%.0f messages queued'=0.00%;10.00;15.00
>
> No matter how many messages are in the queue
>
> This is the check command that I am using:
>
>   check_command  nt_exch_outq!10!15
>
> Thanks
  

Re: [Nagios-users] Check Exchange Queues

2006-06-29 Thread Steve Shipway



Here's my checkcommands.cfg section:

## Exchange Server
#
define command {
    command_name    nt_exch_throughput
    command_line    $USER1$/check_nt -H $HOSTADDRESS$ -v COUNTER -l "\\MSExchangeMTA\\Messages/Sec","%.2f messages/sec" -w $ARG1$ -c $ARG2$
}
define command {
    command_name    nt_exch_inq
    command_line    $USER1$/check_nt -H $HOSTADDRESS$ -v COUNTER -l "\\MSExchangeIMC\\Queued Inbound","%.0f messages queued" -w $ARG1$ -c $ARG2$
}
define command {
    command_name    nt_exch_outq
    command_line    $USER1$/check_nt -H $HOSTADDRESS$ -v COUNTER -l "\\MSExchangeIMC\\Queued Outbound","%.0f messages queued" -w $ARG1$ -c $ARG2$
}
 
Steve

> I am monitoring a Windows server that has Exchange Server 2003. I am
> checking (using NSClient) the Information store service but would like
> to check on the queues. Is anyone doing this at the moment or knows
> how to?
   

Re: [Nagios-users] Monitoring Cisco 3750 stacks - OIDs or traps ?

2006-06-25 Thread Steve Shipway
> If you want the switches to let you know when something 
> happens, use traps.  If you want to pull data at regular 
> intervals, use polling.
> 
> There's nothing to stop you from doing both.

We use both, although not on precisely this hardware.  On our foundry load
balancer and SAN, for example, we catch SNMP traps but also poll a few
health counters in case the traps were missed.  If you put in an snmptrapd
that intelligently parses caught traps, it's not difficult to catch alerts.

As to *which* trap OIDs, or *which* OIDs to monitor, that's a different
question.  Presumably someone out there can help?

Steve





Re: [Nagios-users] Switch Port Monitoring

2006-06-25 Thread Steve Shipway
> Anyone like to share how they monitor switch ports?

Here, we use MRTG to graph the switch ports, and the routers2 frontend to
MRTG to make it pretty.  Also, the routers2 frontend has a Nagios plugin to
allow the Nagios data to be displayed as well -- and a portstatus plugin to
show the current individual port configuration and status.

For Nagios, we use a simple ping service and then use the hostextinfo url to
link to the MRTG graphs page.  On some switches, we also have a CPU use
service.  We also have an SNMP trap service and the switches send SNMP traps
to the Nagios host which are parsed and relayed to the appropriate service.
Port up/down etc generate traps which can in turn generate alerts.

Finally, for key switch ports, an additional service can be defined for
that particular port status OID.  It should be possible to do something
similar for port traffic thresholding, but we don't.

Steve





Re: [Nagios-users] Distributed Monitoring - Redundancy

2006-06-25 Thread Steve Shipway
> > I'm running Nagios in a distributed environment which is working very 
> > well. I would like to add a little redundancy to the 
> > picture now that I have everything working. ;-)
...
> > It seems that a secondary "cold spare" might be the best solution.
> > Then there are maintenance issues with keeping software up to date, 
> > etc.
> No - look at linux HA (heartbeat) and drbd.
> > So many problems, so little beer.
> The linux HA/drbd setup is well understood and quite easy.

We use linux-HA here to have a redundant setup of two servers.  In fact, we
are running our Nagios on one and our MRTG on the other, and they both
provide failover for each other.  They both pass between each other a set of
virtual IPs, services, disks and filesystems.  Works very well, and is very
reliable.  I use the v1.x Linux-HA (rather than the newer feature-rich
v2.x) as we only have a 2-machine failover cluster and simplicity makes
things easier.

We have an external SCSI disk pack connected to two adaptec serveRAID cards
(these helpfully have locking capabilities for just this setup).  There are
two LUNs on the external pack passed between the servers.

Heartbeat goes via serial cable, crossover network cable, and the main
network.  

For people who are really paranoid, I also have a little linux-ha plugin
which uses a tiny raw partition on the disk to effect an additional lock
before mounting the filesystem.

In a failover situation, we lose only about 30 seconds and everything is
fine.  Nagios (since it uses text files) is very stable - however, I also
run mysql on the Nagios server to hold archives and summarised logs, and
this passes back and forth with no difficulty as well.

If anyone would like detailed instructions, please contact me directly.

Steve





Re: [Nagios-users] avaibility stats on Nagios server crash...

2006-06-11 Thread Steve Shipway
> When the server which hosts nagios2.x crashes (for instance, 
> electric cut), avaibility statistics are totally wrong, the 
> stop period is not catch by nagios (the last state stored by 
> Nagios is used to fill this time
> period...)
> > nothing appears in "Undetermined" section of statistics.
> Too bad.
> 
> 
> Do you have any solution to change this behavior ?
> Thanks.

Put your Nagios server onto a UPS.  That should give it enough time to shut
down gracefully.  We have our Nagios on a failover linux-ha cluster with UPS
and this works well.

Steve






[Nagios-users] check_esx version 2 released on Nagiosexchange

2006-05-28 Thread Steve Shipway
Version 2 of check_esx (to allow Nagios to check the health of a VMware ESX
server via SNMP) is now uploaded on Nagiosexchange.  This is a major change
on the old v1.4 and can check for many health problems such as balloon
memory, CPU ready time, VM swapping, and so on.  Also has an add-on for MRTG
graphing using the same method.

Thank you for your time,
Steve






RE: [Nagios-users] monitoring esx vmfs

2006-05-19 Thread Steve Shipway
VMWare is a bit strange with this one.  Basically, vdf lists normal
filesystems plus the vmfs, but the NRPE check_disk plugin is not
vmware-aware and so cannot check them.

I'm doing it here via SNMP, which seems to work, but I've needed to write
a special plugin.  Note also that you can run NRPE plugins via the SNMP
daemon on an ESX server, so you don't actually need NRPE at all.

Steve






RE: [Nagios-users] How to reduce a very high latency number

2006-05-17 Thread Steve Shipway
[Trask wrote]
> I am still butting up against very high latency issues with my Nagios 
> setup.  I feel like I must be missing something obvious because it 
> doesn't seem like I have so many services that the servers cannot keep up.

I've noticed we get this problem when there are more than one or two hosts
down.  Because Nagios (we use 1.2) does host checks first, and sequentially,
a host check timing out can hold up everything else (we have >3000 checks to
run every 5 minutes).  

To help out with this, I've reduced the timeouts and number of pings to
check hosts (so a host down takes less time to identify) and tried to
educate people to disable host checks when they know a host will be down for
a long time.
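
As a rough illustration of why this helps, here is a back-of-the-envelope
sketch in Python.  It is a simplified model, not how Nagios schedules
internally: it assumes host checks run strictly one after another and that
every ping to a down host runs to its full timeout.  The function name and
the sample numbers are made up for the example.

```python
def blocked_seconds(hosts_down, pings_per_check, timeout_per_ping, max_attempts):
    """Worst-case seconds spent in serial host checks for a set of down hosts.

    Assumes each ping to a down host waits out its full timeout and that
    host checks never run in parallel.
    """
    per_check = pings_per_check * timeout_per_ping
    return hosts_down * max_attempts * per_check


# Illustrative numbers only: 3 hosts down, 5 pings of 10s each, 3 attempts.
before = blocked_seconds(3, 5, 10, 3)
# The same outage after cutting to 1 ping with a 2s timeout.
after = blocked_seconds(3, 1, 2, 3)
print(before, after)
```

Under those assumptions the time lost to down hosts scales directly with
the ping count and the timeout, so trimming either one pays off quickly.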

Finally, I do a restart of Nagios every day or so which resets the latency
back to 0.  Not ideal, but it helps.

I'd rather host checks were done in the same way as service checks, but I
can see why they aren't (to allow the system to not run service checks for
down hosts).

Steve






RE: [Nagios-users] Monitore Postfix queue

2006-05-16 Thread Steve Shipway
> I've write a small script do monitore postfix queue using 
> postqueue to count the number of emails on my queue:
> 
> postqueue -p | tail -1 | cut -d " " -f 5
> 
> The problem: Sometimes, my queue have more than 20.000 
> emails, and I can't count this number of emails in less then 
> 1 minute :-)

We had this problem, too.  So, I wrote a different plugin that uses Perl.
The quickest way to do this is to just count the number of files in the
postfix queue directory, although this requires the plugin to run with the
appropriate group permissions (which I achieved by using a sudo wrapper).

The postqueue (or mailq) commands take a lot longer because they parse the
messages in order to output all the extra information, which you aren't
interested in.
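
The file-counting idea can be sketched as follows.  This is not the Perl
plugin mentioned above, just a minimal Python illustration of the same
approach.  The queue root /var/spool/postfix, the choice of subdirectories
and the thresholds are assumptions; adjust them for your own install.

```python
import os


def count_queue_files(queue_root, queues=("incoming", "active", "deferred")):
    """Count files under the named queue subdirectories without opening them."""
    total = 0
    for queue in queues:
        top = os.path.join(queue_root, queue)
        # os.walk ignores unreadable or missing directories by default
        for _dirpath, _dirnames, filenames in os.walk(top):
            total += len(filenames)
    return total


def nagios_state(count, warn, crit):
    """Map a queue length onto a Nagios plugin exit code and label."""
    if count >= crit:
        return 2, "CRITICAL"
    if count >= warn:
        return 1, "WARNING"
    return 0, "OK"


def main(queue_root="/var/spool/postfix", warn=1000, crit=5000):
    count = count_queue_files(queue_root)
    code, label = nagios_state(count, warn, crit)
    print(f"POSTFIX QUEUE {label}: {count} messages")
    return code

# A real plugin would finish with: sys.exit(main())
```

Note that if the script lacks permission to read the queue directories it
will quietly report 0, which is why it needs the right group permissions.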

Steve






RE: [Nagios-users] Hide certain 'Criticals'

2006-05-16 Thread Steve Shipway
...
> > You could set your view to filter out acknowledged alerts?  
> > This is an undocumented option to status.cgi.
...
> Thanks for the reply Steve, how would I filter out the 
> acknowledged alerts?
> After filtering them out, how would I know that they have 
> recovered, does it reset the acknowledgment when a device 
> comes back online?

Acknowledgements are reset once a service or host returns to OK status.

If you pass the option serviceprops to the status.cgi script, you can add a
filter to the services.  From cgiutils.h:

#define SERVICE_SCHEDULED_DOWNTIME  1
#define SERVICE_NO_SCHEDULED_DOWNTIME   2
#define SERVICE_STATE_ACKNOWLEDGED  4
#define SERVICE_STATE_UNACKNOWLEDGED 8
#define SERVICE_CHECKS_DISABLED 16
#define SERVICE_CHECKS_ENABLED  32
#define SERVICE_EVENT_HANDLER_DISABLED  64
#define SERVICE_EVENT_HANDLER_ENABLED   128
#define SERVICE_FLAP_DETECTION_ENABLED  256
#define SERVICE_FLAP_DETECTION_DISABLED 512
#define SERVICE_IS_FLAPPING 1024
#define SERVICE_IS_NOT_FLAPPING 2048
#define SERVICE_NOTIFICATIONS_DISABLED  4096
#define SERVICE_NOTIFICATIONS_ENABLED   8192
#define SERVICE_PASSIVE_CHECKS_DISABLED 16384
#define SERVICE_PASSIVE_CHECKS_ENABLED  32768
#define SERVICE_PASSIVE_CHECK   65536
#define SERVICE_ACTIVE_CHECK 131072

Just add together the options you want.  Something similar exists for
host properties (hostprops).  Therefore, if you do

/nagios/cgi-bin/status.cgi?host=all&servicestatustypes=248&serviceprops=8202&hostprops=8194

then you will get all services in warn, unknown or critical, where neither
the service nor its host is in scheduled downtime or has notifications
disabled, and the service problem has not been acknowledged.  If you add
'&noheader' to the end then you can lose the big heading section and see
more alerts on the screen.

This is for Nagios 1.x, but probably is the same in Nagios 2.x.
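
If it helps, the arithmetic can be sketched in Python.  The three constants
are copied from the cgiutils.h list above; the status type value 248 and
the host property value 8194 are simply taken from the example URL.  The
helper name status_url is made up for this sketch.

```python
from urllib.parse import urlencode

# Service property bits, copied from the cgiutils.h excerpt above.
SERVICE_NO_SCHEDULED_DOWNTIME = 2
SERVICE_STATE_UNACKNOWLEDGED = 8
SERVICE_NOTIFICATIONS_ENABLED = 8192


def status_url(base, servicestatustypes, serviceprops, hostprops, noheader=False):
    """Build a status.cgi URL with the given filter bitmasks."""
    query = urlencode({
        "host": "all",
        "servicestatustypes": servicestatustypes,
        "serviceprops": serviceprops,
        "hostprops": hostprops,
    })
    url = f"{base}?{query}"
    if noheader:
        url += "&noheader"
    return url


# OR-ing distinct bits gives the same result as adding them: 2 + 8 + 8192.
service_props = (SERVICE_NO_SCHEDULED_DOWNTIME
                 | SERVICE_STATE_UNACKNOWLEDGED
                 | SERVICE_NOTIFICATIONS_ENABLED)
print(service_props)
print(status_url("/nagios/cgi-bin/status.cgi", 248, service_props, 8194))
```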

Steve






RE: [Nagios-users] Hide certain 'Criticals'

2006-05-11 Thread Steve Shipway
> I would like to be able to acknowledge that the device is 
> down and be able to remove it from our view. Only when the 
> device comes back online then again goes offline it would 
> reset the acknowledgment and again show it on the screen as 
> 'CRITICAL' until it is acknowledged again.
> 
> I have acknowledged the problem but the device still shows up 
> on the screen in a 'CRITICAL' state. I have obviously 
> misinterpreted the docs and this does not work like I had assumed.

You could set your view to filter out acknowledged alerts?  This is an
undocumented option to status.cgi.  Alternatively, just disable alerts
permanently, re-enable them manually later, and filter disabled alerts out
of the view (this is what we do here).

Steve






RE: [Nagios-users] Check Printer QUEUE on Windows w2k /2003

2006-05-10 Thread Steve Shipway



> I looking for a method to check the printer queue on Windows server
> (like if more than 10 jobs i queue then warning or critical) I tried
> the snmp check /usr/local/nagios/libexec/check_snmp_win.pl (very good)
> but i can just test the "Service" Spooler. I looked att
> nagiosexchange.org and on google. NADA Anyone who has a clue how i can
> do this with SNMP or with NRPE ?

We do it using pNSclient 2.x.  Then you can check any perfmon counter.

Alternatively, install the perfMIB stuff so that you can access the
perfmon counters via SNMP, or install something like NC_Net which has
both NRPE and pNSclient functionality.

Steve


RE: [Nagios-users] any way to specify source address for server?

2006-05-09 Thread Steve Shipway
> > I'd like to give my Nagios server a virtual IP address, so that if we 
> > want to migrate the monitoring service to another machine in the 
> > future, it'll be easy (just setup the new machine and give 
> it the same virtual IP).
...
> This is a tricky issue. First of all I must assume you are 
> referring to the actions taken by the various checks. These 
> are in fact regulated by the individual plugins so there is 
> not a single way to solve this.

I understand what you're trying to achieve - just set up a single ACL on
your routers (for example) and have the Nagios service always query from
this (virtual) IP, wherever it actually lives.

For reasons stated before, this is almost impossible.  The only way I can
think of doing it is to set up a route definition on the Nagios host
directing all traffic via the virtual IP, although that is rather messy.

Here, we have Nagios on a virtual IP moving between two linux-ha hosts.
NSCA listens to a port on the virtual IP.  However, Nagios will query from
the physical IP of whichever host it is on, so we have had to set up both
physical IPs in the ACLs.

I suppose you could set up a special subnet for your monitoring servers, and
set your ACLs to allow that subnet.  Or, if only certain checks (SNMP?)
require this source IP, you might be able to modify the appropriate plugins
to specify source route.  However, it doesn't seem feasible to do it for all
the plugins.
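
To show what such a per-plugin change amounts to, here is a minimal sketch
in Python of a TCP connection bound to a chosen local address.  Most
plugins are written in C or Perl, so this only illustrates the idea; the
function name connect_from is made up, and the source address must already
be configured on an interface of the host running the check.

```python
import socket


def connect_from(source_ip, dest_host, dest_port, timeout=5.0):
    """Open a TCP connection whose local end is bound to source_ip.

    Port 0 lets the operating system pick an ephemeral source port.
    """
    return socket.create_connection(
        (dest_host, dest_port),
        timeout=timeout,
        source_address=(source_ip, 0),
    )

# Hypothetical usage, with the virtual IP as the source:
#   sock = connect_from("192.0.2.10", "router.example.com", 22)
```

Every plugin that opens its own sockets would need an equivalent change,
which is why doing this across the board is not very practical.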

Steve 






[Nagios-users] Nagios Eventlog monitor for windows v1.8.1 released

2006-05-04 Thread Steve Shipway
The new version of the NagEvLog agent, for filtering and forwarding eventlog
entries on a Windows server to Nagios via NSCA, is now available from
http://www.steveshipway.org/software/ in the Nagios Utilities section, and
from NagiosExchange.

This has a few improvements on the previous versions -

1) New heartbeat function to periodically send 'agent is alive' alerts.
2) Fixed a couple of bugs in filtering code from v1.6
3) Install package now includes the source code
4) Install package should now be able to do an Upgrade Install for existing
users
5) Corrected looping error message bug from v1.8.0

Please report any problems, feature requests, and so on via the online
support forum at http://www.steveshipway.org/forum/

Thanks for your time,

Steve





