from:"Jonathan Call"

Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-19 Thread Jonathan Call

Here is my $0.02:

I have a distributed Nagios 2 system with 24,000+ service checks and 4,000+ 
hosts. I rely heavily on NSCA to get the results from the slaves to the master. 
My issue seems to be with Nagios itself: using the documented method specified 
for NSCA, I can't get a Nagios slave to process a mere thousand service checks 
before it starts overwhelming the server. I've had to resort to using the 
OCP_daemon method instead. I have no complaints about what NSCA does, just 
about how poorly it seems to work within Nagios itself.
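
To illustrate the difference in load: as I understand the documented setup, Nagios runs a command after every service check, which normally means one send_nsca process per result. A batching daemon such as OCP_daemon instead hands many results to a single send_nsca invocation. The Python sketch below shows only the batching idea and is not OCP_daemon's actual code. The function names, the central.example.org host, and the config path are hypothetical. The tab-separated layout (host, service, return code, output) is the line format send_nsca reads on stdin.

```python
import subprocess


def format_result(host, service, return_code, output):
    """Build one send_nsca input line: host<TAB>service<TAB>code<TAB>output."""
    # Tabs or newlines inside the plugin output would break the line framing,
    # so flatten them to spaces.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


def build_batch(results):
    """Join many (host, service, return_code, output) tuples into one payload."""
    return "".join(format_result(*result) for result in results)


def send_batch(results,
               nsca_host="central.example.org",                 # hypothetical host
               send_nsca="send_nsca",
               config="/usr/local/nagios/etc/send_nsca.cfg"):   # assumed path
    """Pipe the whole batch into a single send_nsca process."""
    payload = build_batch(results)
    subprocess.run([send_nsca, "-H", nsca_host, "-c", config],
                   input=payload.encode(), check=True)
    return len(results)
```

With this shape, a thousand queued results cost one process start instead of a thousand, which is where most of the saving would be expected to come from.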


 -Original Message-
 From: Michael Medin [mailto:mich...@medin.name]
 Sent: Thursday, February 18, 2010 11:26 AM
 To: nagios-users
 Subject: [Nagios-users] NRPE/NSCA replacement thoughts?
 
 Hello
 
 Since I am pondering a replacement for the NSCA and NRPE protocol I
 thought I would get some thoughts from the community?
 So this is pretty much an open floor kind of thing to get some sense
 of what people actually need and would want (if anything at all).


This email message is intended for the use of the person to whom it has been 
sent, and may contain information that is confidential or legally protected. If 
you are not the intended recipient or have received this message in error, you 
are not authorized to copy, distribute, or otherwise use this message or its 
attachments. Please notify the sender immediately by return e-mail and 
permanently delete this message and any attachments. Verio, Inc. makes no 
warranty that this email is error or virus free.  Thank you.

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] When to HUP and when to restart?

2009-12-24 Thread Jonathan Call

If you’re using the embedded Perl interpreter, a restart is probably better, 
since the interpreter leaks memory.

If you have a very large installation (thousands of service checks), a restart 
will take a considerable amount of time, so a HUP would probably be wise in 
that situation.
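
For anyone scripting the HUP, here is a minimal Python sketch. It assumes Nagios writes its PID to a lock file. The path used below is only a guess at a typical source-install location, so check the lock_file setting in your own nagios.cfg. The function names are hypothetical.

```python
import os
import signal
from pathlib import Path


def read_pid(lock_file):
    """Return the PID stored in the Nagios lock file."""
    pid = int(Path(lock_file).read_text().strip())
    if pid <= 0:
        raise ValueError(f"implausible PID {pid} in {lock_file}")
    return pid


def reload_nagios(lock_file="/usr/local/nagios/var/nagios.lock",  # assumed path
                  sig=signal.SIGHUP):
    """Send a signal (SIGHUP by default) to the running Nagios daemon."""
    pid = read_pid(lock_file)
    os.kill(pid, sig)  # raises ProcessLookupError if the daemon is gone
    return pid
```

Passing sig=0 instead of SIGHUP only checks that the process exists, which is a cheap way to test the script without touching the daemon.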

Jonathan

 -Original Message-
 From: Jim Avery [mailto:avery...@gmail.com]
 Sent: Thursday, December 24, 2009 6:32 AM
 To: nagios List
 Subject: [Nagios-users] When to HUP and when to restart?
 
 Thanks to Patrick mentioning you can send a HUP to get Nagios to
 reload its config (how on earth did I not know that??), it got me
 wondering...
 
 When, if at all, do I need to do a full restart of the Nagios daemon?
 
 Cheers and Happy Christmas everyone.
 
 Jim
 
 (p.s. I'm sorry if this is the second time you've seen this. I've been
 getting bounce notifications when posting to the nagios-users list so
 am trying again from my gmail address).
 

Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?

2009-12-14 Thread Jonathan Call

See responses inline:

 -Original Message-
 From: Thomas Guyot-Sionnest [mailto:derm...@aei.ca]
 Sent: Sunday, December 13, 2009 9:23 PM
 To: Jonathan Call
 Cc: nagios-user Mailinglist
 Subject: Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?

 On 09/12/09 06:06 PM, Jonathan Call wrote:
  I recently added two new slaves to a distributed Nagios system. The
  central server now passively processes 17,000+ service checks on 3000+
  servers.

  It's been over an hour and a half since I brought those new slaves
  online and I have about 150 hosts still stuck in 'Pending' and about
  1300 services in the same state. In addition to that it seems that the
  service check results from the other slaves that were working normally
  are now arbitrarily disappearing. For example, on one host three of the
  service checks have been updated relatively recently (i.e. 5-30 minutes
  ago) but three other service checks haven't been updated for almost an
  hour. The slaves all appear operational and the hosts are being checked
  on time. Is it possible I've overwhelmed Nagios' ability to process data
  from the NSCA daemon or struck some internal Nagios bottleneck? Any
  suggestions would be appreciated.

 Huh... Very interesting. Which Nagios version are you using?

Nagios 2.12 (May 19, 2008) on FreeBSD 6.3

 This sounds a lot like a problem I encountered a few years ago with
 passive checks. I had about 50-60 servers returning cron-scheduled check
 results to the Nagios server. 120 results ain't that much, but it seemed
 that with all the servers fully time-synced (using NTP), out of these
 ~120 results I was often missing some, which would eventually cause
 false alarms due to stale services.

 I could easily reproduce the problem by feeding lots of results to
 Nagios right when I was expecting a batch of passive results - this
 would cause random results to be dropped. I spent some time trying to
 debug this but I couldn't figure out where commands were dropped. My
 primary target was the ring buffer used by the command reaper. As far as
 I can remember I tested with versions of Nagios ranging from 2.3 to 2.5;
 I never tried with a recent version.

 If you're running a recent version of Nagios, what do you get for
 Used/High/Total Command Buffers in the nagiostats command output?
 (You can also get these numbers from the web interface, under Performance
 Info in the left bar.) If it seems to be maxed out, you may try
 setting command_check_interval to -1 and raising the
 external_command_buffer_slots option in nagios.cfg.

Buffer report from Nagiostats:
Used/High/Total Command Buffers:  25 / 4096 / 4096
Used/High/Total Check Result Buffers: 0 / 4096 / 4096

Nagios config:
command_check_interval=-1
external_command_buffer_slots=4096
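
Note that in the report above the High value equals the Total for both buffers, i.e. the buffers have been completely full at some point even though current usage is low. For anyone wanting to watch for this automatically, here is a rough Python sketch that parses lines in the format shown above. The function names are hypothetical, and the regular expression assumes the exact "Used/High/Total ... Buffers:" wording from this report.

```python
import re

# Matches e.g. "Used/High/Total Command Buffers:  25 / 4096 / 4096"
BUFFER_LINE = re.compile(
    r"^Used/High/Total (?P<name>.+?) Buffers:\s*"
    r"(?P<used>\d+)\s*/\s*(?P<high>\d+)\s*/\s*(?P<total>\d+)\s*$"
)


def parse_buffers(report):
    """Return {buffer name: (used, high, total)} from nagiostats-style text."""
    stats = {}
    for line in report.splitlines():
        match = BUFFER_LINE.match(line.strip())
        if match:
            stats[match.group("name")] = (
                int(match.group("used")),
                int(match.group("high")),
                int(match.group("total")),
            )
    return stats


def saturated(stats):
    """List buffers whose high-water mark has reached the configured total."""
    return [name for name, (_used, high, total) in stats.items()
            if total > 0 and high >= total]
```

A buffer showing up in saturated() does not prove results were dropped, only that there was no headroom left at the peak.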

 If you're still having this problem with Nagios v3 and up I might try to
 reproduce this as well, and maybe I'll be able to figure out what's
 wrong this time.

Upgrading to Nagios v3 is being considered but isn't possible at this
time.

As I mentioned to someone else on this thread, having a large number of
queries (status.cgi) run against the web interface seems to provoke poor
performance from the central server. This is true even after we moved the
main objects.cache and status.dat files to a memory disk.

Jonathan


Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?

2009-12-10 Thread Jonathan Call

Yes, full Nagios is running on the slaves. They use OCP_daemon to pass data 
to the central server since the NSCA client can't handle the load. They seem 
to be sending data properly to the NSCA daemon. 

I've tracked part of the issue down to status.cgi. The central server 
appears to be underpowered when it comes to both having Nagios process data AND 
having several people pounding out host/service status queries from the web 
interface. I will be adding another CPU to see if this helps. However, I'm 
dismayed that Nagios on the central server doesn't seem to report any 
errors or indicate that there is any problem processing passive results. 
Nagios just starts to lose data at a certain point.

Jonathan 

 -Original Message-
 From: Greg Pangrazio [mailto:pangr...@gmail.com]
 Sent: Thursday, December 10, 2009 7:26 AM
 To: Jonathan Call
 Cc: nagios-user Mailinglist
 Subject: Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?
 
 Are you running the full nagios on the slaves?  Do the checks seem
 to be working on those hosts?
 
 Greg Pangrazio
 pangr...@gmail.com
 
 
 
 
 
 On Wed, Dec 9, 2009 at 5:06 PM, Jonathan Call jc...@verio.net wrote:
  I recently added two new slaves to a distributed Nagios system. The
  central server now passively processes 17,000+ service checks on 3000+
  servers.

  It's been over an hour and a half since I brought those new slaves
  online and I have about 150 hosts still stuck in 'Pending' and about
  1300 services in the same state. In addition to that it seems that the
  service check results from the other slaves that were working normally
  are now arbitrarily disappearing. For example, on one host three of the
  service checks have been updated relatively recently (i.e. 5-30 minutes
  ago) but three other service checks haven't been updated for almost an
  hour. The slaves all appear operational and the hosts are being checked
  on time. Is it possible I've overwhelmed Nagios' ability to process data
  from the NSCA daemon or struck some internal Nagios bottleneck? Any
  suggestions would be appreciated.
 
  Jonathan
 
 

[Nagios-users] Nagios2 process overwhelmed by NSCA daemon?

2009-12-09 Thread Jonathan Call

I recently added two new slaves to a distributed Nagios system. The
central server now passively processes 17,000+ service checks on 3000+
servers. 

It's been over an hour and a half since I brought those new slaves
online and I have about 150 hosts still stuck in 'Pending' and about
1300 services in the same state. In addition to that it seems that the
service check results from the other slaves that were working normally
are now arbitrarily disappearing. For example, on one host three of the
service checks have been updated relatively recently (i.e. 5-30 minutes
ago) but three other service checks haven't been updated for almost an
hour. The slaves all appear operational and the hosts are being checked
on time. Is it possible I've overwhelmed Nagios' ability to process data
from the NSCA daemon or struck some internal Nagios bottleneck? Any
suggestions would be appreciated.

Jonathan



Re: [Nagios-users] Lilac 1.0.3 is Released! (A Nagios ConfigurationTool)

2009-10-12 Thread Jonathan Call

Any timetable for a STABLE release (i.e. not beta)?

Any timetable for supporting distributed deployments?

Jonathan

 -Original Message-
 From: Taylor Dondich [mailto:tdond...@gmail.com]
 Sent: Monday, October 12, 2009 1:10 PM
 To: nagios-user Mailinglist
 Subject: [Nagios-users] Lilac 1.0.3 is Released! (A Nagios
 ConfigurationTool)

 Lilac, the most popular Nagios Configuration tool has just released
 version 1.0.3.  This is a bugfix release which fixes over 30 bugs with
 improvements made by its users.  As always, Lilac Configuration has
 the following features:

 * Advanced Nagios 3.x Timeperiod Support
 * Advanced Host and Service Templates (Even cooler than what Nagios
 supports by default!)
 * Flexible importer to import existing Nagios 2.x and 3.x
 configurations
 * Auto-discovery system powered by NMAP to quickly bring in new hosts.

 You can download the latest version of Lilac at
 http://www.lilacplatform.com/downloads

 Interesting note:  Lilac 1.0.2 which was released on 4/17/2009 was
 downloaded 5720 times!  That's an average of 32 times a day.  Thanks
 to the vibrant Lilac community!

 --
 Taylor Dondich
 Check out Lilac, a configuration tool for Nagios 3 at
 http://www.lilacplatform.com

 Check out my Shortcut with O'Reilly Press:
 Network Monitoring with Nagios:
 http://oreilly.com/catalog/9780596528195/index.html


Re: [Nagios-users] NSCA speed problem

2009-09-10 Thread Jonathan Call

Have you considered OCP_daemon?

http://wiki.nagios.org/index.php/OCP_Daemon

 -Original Message-
 From: d...@chatham.org [mailto:d...@chatham.org]
 Sent: Tuesday, September 08, 2009 1:00 PM
 To: nagios-users@lists.sourceforge.net
 Subject: [Nagios-users] NSCA speed problem

 I have a Nagios setup that is monitoring ~ 1000 hosts and ~ 13,000
 services.  The active checks are run on a Sun box with 128 CPUs/cores.
 Since it appeared that status.cgi could only be single threaded, it meant
 that the Sun box was slow in putting a page together, so all checks were
 forwarded to a fast Intel machine which puts together the page in about 2
 seconds instead of about 16 on the SPARC.

 However, NSCA is now slowing the process, either on the sending or the
 receiving end.  There are only two NSCA processes running, so I suspect
 that this is the problem.

 I can think of a number of alternatives.  One would be to load up
 ndoutils, which looks like a fine solution, but I'm a bit under the gun
 here and I'd really like to find something that works quickly.

 An alternative might be to use syslog to get the data from one machine to
 another.

 Any ideas, suggestions?


[Nagios-users] Quick and easy way to monitor Nagios itself?

2009-09-04 Thread Jonathan Call

Since I have a large distributed Nagios system, the possibility of a
Nagios process going AWOL on one of my many servers is a serious
concern. Has anyone come up with a sure way (e.g. a cron job) to confirm
that Nagios is processing checks properly? 

For example, I had one OCP_daemon process die; as a result, the Nagios
process hung for quite some time before it was discovered. Freshness
checking is not an option because many hosts are behind firewalls or on
private networks, so the central server has active checks disabled
globally. 
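
To make the cron job idea concrete, here is one possible shape for such a check, as a rough Python sketch and not a proven solution. It assumes that a healthy Nagios process rewrites its status file regularly, so a file that has not been modified for several minutes suggests a hung or dead process. The status.dat path, the 300-second threshold, and the function names are all assumptions to adjust for the local install.

```python
import os
import time


def status_age_seconds(status_file, now=None):
    """Seconds since the status file was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(status_file)


def is_stale(status_file, max_age=300, now=None):
    """True if the file is older than max_age seconds, or cannot be read."""
    try:
        return status_age_seconds(status_file, now) > max_age
    except OSError:
        # A missing or unreadable status file is treated as a failure too.
        return True


def main(argv):
    """Return a Nagios-style exit code: 0 = OK, 2 = CRITICAL."""
    path = argv[0] if argv else "/usr/local/nagios/var/status.dat"  # assumed path
    if is_stale(path):
        print(f"CRITICAL: {path} has not been updated recently")
        return 2
    print(f"OK: {path} is being updated")
    return 0
```

A cron entry could call main(sys.argv[1:]) on each slave and mail or page on a non-zero result. This only shows that Nagios is still writing status data, not that every check result is arriving.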

Jonathan



Re: [Nagios-users] nagios 3.0.3 on FreeBSD defunct process

2009-04-02 Thread Jonathan Call

That sounds very similar to the locking/contention issue FreeBSD 7.x has with 
Nagios 2.x. It has to do with how Nagios and FreeBSD handle threading. 
Unfortunately I don’t have any answers on how to fix it. I’ve had to leave my 
Nagios deployment on FreeBSD 6/Nagios 2 for the same reason: anything newer 
would lock up due to defunct Nagios processes. 

 

Jonathan

 

From: Gian Paolo Buono [mailto:gpbu...@gmail.com] 
Sent: Tuesday, March 31, 2009 5:45 AM
To: nagios-users@lists.sourceforge.net; lei chen
Subject: Re: [Nagios-users] nagios 3.0.3 on FreeBSD defunct process

 

Hi, 

I don't have NDOUtils, and enable_embedded_perl is disabled 
(enable_embedded_perl=0) :(.. Any ideas? 

bye... 

2009/3/31 lei chen clo...@gmail.com

Are you using NDOUtils here?
Or the enable_embedded_perl option?

2009/3/27 Gian Paolo Buono gpbu...@gmail.com:

 Hi, I have a server running FreeBSD 7.1-RELEASE-p2 with 950 hosts and 4900
 services, Nagios 3.0.3.

 Sometimes Nagios doesn't update the status, and when I try to stop Nagios it
 doesn't die. I try to kill -9 the process but it still doesn't die; there are
 many defunct Nagios processes, so I have to reboot the server. I don't have
 any logs.

 Any ideas? Thank you for the support. Bye.






--
Thanks,

Chenlei  石头++

MSN Messenger: c...@163.com

 




Re: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool, has released 1.0 Release Candidate 1.

2009-03-10 Thread Jonathan Call

Our implementation is pretty much right out of the Nagios documentation. The 
only thing that might be 'special' is that one of our slave servers is actually 
running two instances of Nagios; each instance is considered a 'slave' to the 
master.

 -Original Message-
 From: Taylor Dondich [mailto:tdond...@gmail.com]
 Sent: Monday, March 09, 2009 10:28 AM
 To: Jonathan Call
 Cc: nagios-user Mailinglist
 Subject: Re: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool,has
 released 1.0 Release Candidate 1.

 That's on the map for 1.2.  The first thing to determine is, what is
 the best way to handle distributed environments properly.  Do we have
 a single master with multiple slave monitoring servers?  How do most
 people do distributed monitoring?

 Taylor

 On Mon, Mar 9, 2009 at 7:46 AM, Jonathan Call jc...@verio.net wrote:
  I don't see it mentioned anywhere so I thought I would ask,

  Does Lilac support distributed Nagios deployments?

  Jonathan

  -Original Message-
  From: Taylor Dondich [mailto:tdond...@gmail.com]
  Sent: Sunday, March 08, 2009 10:12 PM
  To: nagios-user Mailinglist
  Subject: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool,has
  released 1.0 Release Candidate 1.

  Lilac, the Nagios configuration tool with the MOST coverage of 3.x
  features, has released 1.0 release candidate 1.  This version
  features:
   - Multiple Template Inheritance
   - Advanced Timeperiod Definitions
   - Enhanced Templates (Attach services, dependencies, escalations to
  host templates, something you CAN'T do with regular Nagios config
  files)
   - Robust Auto-Discovery system
   - Import existing Nagios 2.x and Nagios 3.x configurations
   - Import configurations from existing Fruity installations
   - Export to Nagios 3.x, perform pre-flight checks and restart Nagios
  at will
   - Background Import/Export/Auto-Discovery processes (no need to wait
  at the browser for your exports/imports/discovery processes to take
  place)

  Take a look, join the community, and help build the most powerful
  configuration tool for Nagios out there!

  Downloads and Documentation is available at www.lilacplatform.com

  --
  Taylor Dondich
  Check out Lilac, a configuration tool for Nagios 3 at
  http://www.lilacplatform.com

  Check out my Shortcut with O'Reilly Press:
  Network Monitoring with Nagios:
  http://oreilly.com/catalog/9780596528195/index.html


 --
 Taylor Dondich
 Check out Lilac, a configuration tool for Nagios 3 at
 http://www.lilacplatform.com

 Check out my Shortcut with O'Reilly Press:
 Network Monitoring with Nagios:
 http://oreilly.com/catalog/9780596528195/index.html


Re: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool, has released 1.0 Release Candidate 1.

2009-03-09 Thread Jonathan Call

I don't see it mentioned anywhere so I thought I would ask,

Does Lilac support distributed Nagios deployments?

Jonathan

 -Original Message-
 From: Taylor Dondich [mailto:tdond...@gmail.com]
 Sent: Sunday, March 08, 2009 10:12 PM
 To: nagios-user Mailinglist
 Subject: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool,has
 released 1.0 Release Candidate 1.
 
 Lilac, the Nagios configuration tool with the MOST coverage of 3.x
 features, has released 1.0 release candidate 1.  This version
 features:
  - Multiple Template Inheritance
  - Advanced Timeperiod Definitions
  - Enhanced Templates (Attach services, dependencies, escalations to
 host templates, something you CAN'T do with regular Nagios config
 files)
  - Robust Auto-Discovery system
  - Import existing Nagios 2.x and Nagios 3.x configurations
  - Import configurations from existing Fruity installations
  - Export to Nagios 3.x, perform pre-flight checks and restart Nagios
 at will
  - Background Import/Export/Auto-Discovery processes (no need to wait
 at the browser for your exports/imports/discovery processes to take
 place)
 
 Take a look, join the community, and help build the most powerful
 configuration tool for Nagios out there!
 
 Downloads and Documentation is available at www.lilacplatform.com
 
 --
 Taylor Dondich
 Check out Lilac, a configuration tool for Nagios 3 at
 http://www.lilacplatform.com
 
 Check out my Shortcut with O'Reilly Press:
 Network Monitoring with Nagios:
 http://oreilly.com/catalog/9780596528195/index.html
 



[Nagios-users] Hosts report 'DOWN, HARD' after first attempt.

2009-01-16 Thread Jonathan Call

I am running a distributed monitoring system using Nagios 2.11 on
FreeBSD 6.3. I use NSCA to send host and services events to the central
server from the slave servers and have always had the following problem:

A distributed server notices a host service is non-Ok and fires off
check-host-alive. I have it set up to do check_ICMP and so it fires off
five ICMP packets. Since the network isn't always perfect those five
packets get dropped. However, I have my max_retry_interval set to 3 so
it fires off another check_ICMP which completes just fine. As a result I
see the following events take place on the slave server:

[01-16-2009 15:18:46] HOST ALERT: s3200.blah.net;UP;SOFT;2;OK - 10.XX.XX.XX: rta 100.294ms, lost 0%
[01-16-2009 15:18:46] HOST ALERT: s3200.blah.net;DOWN;SOFT;1;CRITICAL - 10.XX.XX.XX: rta nan, lost 100%

However on the central server I see the following:

[01-16-2009 15:19:02] HOST NOTIFICATION: NOC-email;s3200.blah.net;UP;host-notify-by-email;OK - 10.XX.XX.XX: rta 100.294ms, lost 0%
[01-16-2009 15:19:01] HOST ALERT: s3200.blah.net;UP;HARD;1;OK - 10.XX.XX.XX: rta 100.294ms, lost 0%
[01-16-2009 15:19:01] HOST NOTIFICATION: NOC-email;s3200.blah.net;DOWN;host-notify-by-email;CRITICAL - 10.XX.XX.XX: rta nan, lost 100%
[01-16-2009 15:19:01] HOST ALERT: s3200.blah.net;DOWN;HARD;1;CRITICAL - 10.XX.XX.XX: rta nan, lost 100%

The central server is immediately flagging the host as DOWN, HARD in
spite of having the same max_retry_interval = 3 setting. On some hosts
this is generating a ton of false HOST DOWN notifications. Is there
any way to fix it?

Jonathan Call





Re: [Nagios-users] Hosts report 'DOWN, HARD' after first attempt.

2009-01-16 Thread Jonathan Call

 -Original Message-
 From: Patrick Morris [mailto:patrick.mor...@hp.com]
 Sent: Friday, January 16, 2009 11:40 AM
 To: Jonathan Call
 Cc: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Hosts report 'DOWN, HARD' after first
attempt.

...

 I'm not sure exactly how you're passing check results to the central
 server, but you may want to consider modifying the process to only send
 host check results when they are in a hard state.

That sounds like an excellent recommendation. Here is my host check
command:

$USER1$/custom/submit_host_check_result.sh $HOSTNAME$ $HOSTSTATEID$ '$HOSTOUTPUT$'

I'll need to modify it to be like this:

$USER1$/custom/submit_host_check_result.sh $HOSTNAME$ $HOSTSTATEID$ '$HOSTOUTPUT$' '$HOSTSTATETYPE$'

And then my NSCA host script would then become:

--
#!/bin/sh

# Arguments and corresponding NAGIOS API variable
#  $1 = $HOSTNAME$
#  $2 = $HOSTSTATEID$
#  $3 = $HOSTOUTPUT$
#  $4 = $HOSTSTATETYPE$
#
# The variables must be piped in as tab delimited variables
# with a newline termination

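# Quote the arguments so host output containing spaces stays in one field,
# and quote the printf format so the shell does not strip the backslashes.
if [ "$4" = "HARD" ]; then
   /usr/bin/printf "%s\t%s\t%s\n" "$1" "$2" "$3" |
      /usr/local/sbin/send_nsca XXX.XXX.XXX.XXX -c /usr/local/etc/send_nsca.cfg
fi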

# Do nothing for SOFT

--
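A dry-run sketch of the same filter can help check the tab-delimited line before it is wired to send_nsca. The function name format_host_result is hypothetical (it is not part of Nagios or NSCA); it only prints the line that would be piped to send_nsca and prints nothing for SOFT states.

```shell
#!/bin/sh
# Hypothetical helper: print the NSCA host-check line only for HARD states.
#  $1 = $HOSTNAME$  $2 = $HOSTSTATEID$  $3 = $HOSTOUTPUT$  $4 = $HOSTSTATETYPE$
format_host_result() {
    if [ "$4" = "HARD" ]; then
        # host<TAB>return code<TAB>plugin output<NEWLINE>
        printf '%s\t%s\t%s\n' "$1" "$2" "$3"
    fi
    # SOFT states print nothing
    return 0
}

# Usage (inspect the raw bytes without contacting the central server):
#   format_host_result s3200.blah.net 1 'CRITICAL - lost 100%' HARD | od -c
```

Piping the output through od -c makes the tabs and the trailing newline visible, which is usually where this kind of script goes wrong.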

Thank you,

Jonathan


Re: [Nagios-users] Hosts report 'DOWN, HARD' after first attempt.

2009-01-16 Thread Jonathan Call

 -Original Message-
 From: Marc Powell [mailto:m...@ena.com]
 Sent: Friday, January 16, 2009 1:20 PM
 To: nagios-users Mailinglist
 Subject: Re: [Nagios-users] Hosts report 'DOWN, HARD' after first
attempt.

 On Jan 16, 2009, at 12:40 PM, Patrick Morris wrote:

  The max_check_attempts only applies to active checks, not the passive
  ones you're sending the central server (at least I assume when you said
  max_retry_interval you meant max_check_attempts) -- and you may note
  that SOFT and HARD are only relative to the server doing the checking;
  they probably aren't passed as part of the passive check submission
  process.

 Correct, all passive host checks are assumed to be HARD states. Note
 that this is addressed in nagios-3 --

 http://nagios.sourceforge.net/docs/3_0/configmain.html#passive_host_checks_are_soft

 --
 Marc

If they're all assumed to be SOFT, then a host failure would never
trigger a notification?

Another potential option, if you're not using NSCA (like those using the
OCP_daemon) is to have the slave servers send out the notification
emails instead of the central one. The slaves would be active monitors
and would honor the host's max_check_attempts variable. This of course
introduces other problems if the slave is behind a restrictive firewall.

Jonathan


Re: [Nagios-users] Nagios 3.0.6 on 10.5.6 Server

2009-01-16 Thread Jonathan Call

 -Original Message-
 From: Randall R. Saeks [mailto:rsa...@district30.k12.il.us]
 Sent: Friday, January 16, 2009 1:55 PM
 To: nagios-users@lists.sourceforge.net
 Subject: [Nagios-users] Nagios 3.0.6 on 10.5.6 Server

 Ever since I upgraded my server running Nagios 3.0.6 to 10.5.6, I
 can't get Nagios to launch.  When I try to start it via the CLI
 command, the following gets returned in the nagios.log:

 [1232136591] Nagios 3.0.6 starting... (PID=27107)
 [1232136591] Local time is Fri Jan 16 14:09:51 CST 2009
 [1232136591] LOG VERSION: 2.0
 [1232136591] Finished daemonizing... (New PID=27109)
 [1232136591] Error: Could not create external command file
'/opt/local/
 var/nagios/rw/nagios.cmd' as named pipe: (22) - Invalid argument.  If
 this file already exists and you are sure that another copy of Nagios
 is not running, you should delete this file.
 [1232136591] Bailing out due to errors encountered while trying to
 initialize the external command file... (PID=27109)

 I've deleted said file, but no such luck in getting it to run.

 While the web-interface is there and running, if I go to tactical
 overview or any of the other menus say that Nagios isn't running
 (which makes sense since the app bails).

 Does anyone have any ideas on this and something to check out / try?
 I have installed this through Macports.

 Thanks

 Randy Saeks, ACSA
 Network  Server Administrator
 Northbrook / Glenview School District 30

Is it me or do you have a space in that file path?

'/opt/local/ var/nagios/rw/nagios.cmd'

Looks like a typo in your nagios.cfg file.
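A quick way to test that guess is to look for whitespace inside the command_file value. This is only a sketch: the function name check_command_file is made up for illustration, and the path to nagios.cfg has to be supplied by the caller since it differs between installs.

```shell
#!/bin/sh
# Hypothetical helper: print any command_file line whose value contains
# whitespace between non-blank characters.
#  $1 = path to nagios.cfg (supplied by the caller, no default assumed)
check_command_file() {
    grep -E '^[[:space:]]*command_file[[:space:]]*=' "$1" |
        grep -E '=.*[^[:space:]][[:space:]]+[^[:space:]]'
}

# Usage:
#   check_command_file /path/to/nagios.cfg && echo "embedded whitespace found"
```

The function exits non-zero when nothing suspicious is found, so it can be used directly in a condition.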


Re: [Nagios-users] Lilac 1.0 beta 1 released! The most robust Nagios3.x configuration tool available!

2008-12-24 Thread Jonathan Call

Does Lilac support distributed configuration? I looked over the site
briefly and did not see any such capability.

Jonathan

 -Original Message-
 From: Taylor Dondich [mailto:tdond...@gmail.com]
 Sent: Wednesday, December 10, 2008 12:45 PM
 To: nagios-user Mailinglist
 Subject: [Nagios-users] Lilac 1.0 beta 1 released! The most robust
 Nagios3.x configuration tool available!

 Lilac 1.0 beta 1 is NOW released.  Featuring a robust importer for
 importing existing Nagios 2.x and 3.x configurations, an exporter to
 export to Nagios 3.x and a robust Auto-Discovery system.  Downloads
 and documentation at http://www.lilacplatform.com

 Thanks to everyone for their extensive testing of Alpha and filing
 bugs and suggestions.  Beta 1 is *very* awesome.

 --
 Taylor Dondich
 Check out Lilac, a configuration tool for Nagios 3 at
 http://www.lilacplatform.com

 Check out my Shortcut with O'Reilly Press:
 Network Monitoring with Nagios:
 http://oreilly.com/catalog/9780596528195/index.html


Re: [Nagios-users] NSCA and Latency

2008-10-23 Thread Jonathan Call

NSCA just doesn't scale well within Nagios.

You will need to try something like the OCP Daemon mentioned here:
http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

I believe Andreas Ericsson has also written a broker module for NSCA. It
is apparently still in its testing/alpha stages, so you would have to
contact him directly.
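The general idea behind that kind of workaround is to stop Nagios from launching send_nsca once per check result. A minimal sketch of that idea follows; it is not the OCP Daemon itself. The spool path, the function names and the SEND_CMD variable are all hypothetical, and it assumes send_nsca accepts several newline-terminated results on stdin in one run.

```shell
#!/bin/sh
# Hypothetical spool location; adjust for the local install.
SPOOL=${SPOOL:-/var/spool/nagios/nsca.spool}
# Command that receives the batch on stdin. Placeholder default only.
SEND_CMD=${SEND_CMD:-"/usr/local/sbin/send_nsca XXX.XXX.XXX.XXX -c /usr/local/etc/send_nsca.cfg"}

# Called from the OCSP command: append one tab-delimited service result.
#  $1 = host  $2 = service description  $3 = state id  $4 = plugin output
spool_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$SPOOL"
}

# Called periodically (cron or a loop): send everything spooled so far.
flush_spool() {
    [ -s "$SPOOL" ] || return 0
    batch="$SPOOL.$$"
    # Rename first so results written during the send go to a fresh spool file.
    mv "$SPOOL" "$batch" || return 1
    # The batch file is left in place if the send fails.
    $SEND_CMD < "$batch" && rm -f "$batch"
}
```

With this split the OCSP command only does a cheap file append, and one send_nsca process handles a whole batch. A failed batch is kept on disk but not retried automatically, so a real deployment would need to handle that.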

 

Jonathan

 

 



From: Maxwell,Brady [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 23, 2008 8:42 AM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] NSCA and Latency

 

My Environment:

3 x Dell 2950 Dual DualCore and 8 GB of RAM

One system runs checks against our Linux servers

One runs checks against our Windows servers

We are running SLES10 update 3

Both systems use nsca to send their check results to a third server that
displays the service checks for our operators.

All three systems are on the same VLAN but on separate Cisco switches.

I am running nsca in daemon mode on the central server with this command

/usr/local/nagios/bin/nsca -c /usr/local/nagios/etc/nsca.cfg --daemon

Nsca.cfg is as follows:

pid_file=/var/run/nsca.pid
server_port=5667
#server_address=192.168.1.1
nsca_user=nagios
nsca_group=nagios
#nsca_chroot=/var/run/nagios/rw
debug=1
command_file=/usr/local/nagios/var/rw/nagios.cmd
alternate_dump_file=/usr/local/nagios/var/rw/nsca.dump
aggregate_writes=1
append_to_file=1
max_packet_age=300
password=xx
decryption_method=14

 

I just set the aggregate and append options to try to fix the problem;
they were not set before. Either way the results are the same.

On the 2 servers doing the checks, everything runs fine even with OCSP
running my send_service_check_results script. My script is pretty much
straight out of the book.

#!/bin/sh
# Arguments:
# $1 = Hostname of the host (using the $HOSTNAME$ macro)
# $2 = Service description of the service (using the $SERVICEDESC$ macro)
# $3 = Service state id of the service (using the $SERVICESTATEID$ macro)
# $4 = Output of the Service Check (using the $SERVICEOUTPUT$ macro)
# The fields are comma-delimited to match the "-d ," option below.
/bin/echo "$1,$2,$3,N3 - $4" | /usr/local/nagios/libexec/send_nsca -H 10.10.129.37 -c /usr/local/nagios/etc/send_nsca.cfg -d ","

Like I said, everything is fine on the 2 servers even with OCSP on.
Between the 2 servers we are running about 10k service checks, and
latency is very low, just a few seconds. However, if I turn on the NSCA
daemon on the central server, latency on both remotes creeps up to about
1500+ seconds within an hour and just gets worse from there. The checks
that should run every 5 minutes on the 2 remote servers end up running
every few hours or less. The central server is doing 0 active checks.

I set debug mode and that proved to provide very little insight into the
problem.

CPU and memory stats are both very low on all three servers. The same
can be said for the network: utilization is less than 2% and there are
no errors on the interfaces. Overall hardware utilization is 10% or less
on these three systems.

So my question is: has anyone had this kind of problem with NSCA? What am
I missing? Should I be batching my service checks on the remote servers?
Should I be using xinetd for NSCA instead of daemon mode?

Thanks

Brady




Re: [Nagios-users] intermittent CGI failure

2008-09-25 Thread Jonathan Call

-Original Message-
From: Jon Angliss [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 24, 2008 5:52 PM
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] intermittent CGI failure

On Mon, 15 Sep 2008 12:29:55 -0400 (EDT), [EMAIL PROTECTED] wrote:

Installed 3.0.3 from source on OpenBSD 4.3 (sparc64).  Everything works,
but every so often the CGI's will fail.

e.g. If I refresh, say, status.cgi?host=all 10 times in a row, it'll fail
at least once or twice.

I can reproduce using both Apache and nginx.

Here's Apache error log snippet: Premature end of script headers:
/var/www/nagios/cgi-bin/status.cgi

Here's nginx error log snippet: upstream closed prematurely FastCGI
stdout while reading response header from upstream

Familiar issue to anyone?  Next steps to debug?

I keep noticing this every now and again.  I usually have the tactical
overview page open on a third monitor, and Firefox often throws a
cannot connect message.  I'm assuming this is probably the same
thing, just on a different page.  I've yet to go through my logs to
confirm though.

-- 
Jon Angliss

I'm running into the same issue:
[Tue Sep 23 04:49:28 2008] [error] [client xxx.xxx.xxx.xxx] Premature
end of script headers: status.cgi

My server is FreeBSD 6.3 running Nagios 2.12 and Apache 2.2.8. The error
is very intermittent, maybe two or three times a day on a 24/7
monitoring screen. Sometimes it is a 'cannot connect'; sometimes the
screen is just blank.
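To see whether the failures cluster around particular times, the error log can be summarized per hour. This is a rough sketch that assumes the Apache 2.2 error log layout shown above ("[Tue Sep 23 04:49:28 2008] [error] ..."); the function name count_cgi_failures is made up, and the log path must be supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helper: count "Premature end of script headers" errors per hour.
#  $1 = path to the Apache error log
count_cgi_failures() {
    grep 'Premature end of script headers' "$1" |
        # Fields: $2 = month, $3 = day, $4 = HH:MM:SS; keep only the hour.
        awk '{ split($4, t, ":"); print $2, $3, t[1] ":00" }' |
        sort | uniq -c
}

# Usage:
#   count_cgi_failures /path/to/httpd-error.log
```

Each output line is a count followed by the month, day and hour bucket.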

Jonathan


[Nagios-users] Nagios 3 distributed monitoring and NSCA

2008-09-10 Thread Jonathan Call

In Nagios 2.x the Obsessive Compulsive Service Processor (OCSP) is
not very robust. Even with a few hundred service checks the OCSP stuff
on the distributed servers bogs down and does not send anything out.
This forced people like me to use tools like OCP_daemon.

Has the OCSP infrastructure improved in Nagios 3? I need it to be robust
enough to handle ~2500 service checks.

Jonathan




Re: [Nagios-users] Anyone tried Nagios 3.0.3 on FreeBSD yet?

2008-09-09 Thread Jonathan Call

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Dave
Horsfall
Sent: Monday, September 08, 2008 4:46 PM
To: Nagios Users
Subject: Re: [Nagios-users] Anyone tried Nagios 3.0.3 on FreeBSD yet?

On Mon, 8 Sep 2008, Sean McAfee wrote:

 This has long been in the ports tree as nagios-devel - it was just
 pending the repocopy of 2.x to nagios2.  I've been running it since b2
 with no issues outside of the libthr one
 (http://www.freebsd.org/cgi/getmsg.cgi?fetch=662046+0+/usr/local/www/db/text/2008/cvs-ports/20080120.cvs-ports).

Interesting; on FreeBSD 7 (and I think 6) libpthread is symlinked to 
libthr:

lrwxr-xr-x  1 root  wheel   8 Jul 20 16:07 libpthread.a -> libthr.a
lrwxr-xr-x  1 root  wheel   9 Jul 20 16:07 libpthread.so -> libthr.so
lrwxr-xr-x  1 root  wheel  10 Jul 20 16:07 libpthread_p.a -> libthr_p.a

That is the case in FreeBSD 7. But not FreeBSD 6.

 I honestly haven't noticed many changes (outside of cfg_dir recursion
 working correctly).  On the completely anecdotal side, it does seem to
 be more efficient overall but I think that's related to general
 improvements on the 3.x branch.

Well, it hasn't hung yet...

Did it ever hang while running Nagios 2? My current FreeBSD 7.0 (amd64)
box has not been able to run Nagios 2.12_1 as smoothly as my FreeBSD 6.3
(i386) box can, and the FreeBSD 7.0 server has significantly fewer
services too. I'm trying to figure out if upgrading might help.

Jonathan


[Nagios-users] Anyone tried Nagios 3.0.3 on FreeBSD yet?

2008-09-08 Thread Jonathan Call

I noticed the port change a few days ago. Anyone tried it?

Does it behave better than Nagios 2 on FreeBSD 7?

Jonathan Call
Network Engineer - NTT/Verio
(801) 437-7476




Re: [Nagios-users] Nagios gets stuck

2008-09-04 Thread Jonathan Call

Yes I have. And it is very annoying. A service check goes defunct and
the thread hangs, which makes Nagios hang. The defunct service check and
its parent thread remain as unkillable zombies until the server is
rebooted.

No one has offered any sort of solution other than "Have you tried
Nagios 3?" (which I have not).
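For anyone trying to confirm the same symptom, a small filter over ps output lists defunct processes together with their parents. This is a sketch only: list_defunct is a made-up name, and it assumes the input comes from "ps -axo pid,ppid,stat,command", where zombie processes show a STAT value starting with Z.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -axo pid,ppid,stat,command` output on stdin
# and print "PID PPID" for every process whose state starts with Z (zombie).
list_defunct() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2 }'
}

# Usage:
#   ps -axo pid,ppid,stat,command | list_defunct
```

Knowing the parent PID shows whether the stuck children all belong to the main Nagios process or to one of its forked workers.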

Jonathan

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Dave
Horsfall
Sent: Wednesday, September 03, 2008 5:40 PM
To: Nagios Users
Subject: [Nagios-users] Nagios gets stuck

Nagios 2.12 with 1.4.11 plugins, on FreeBSD 7.0.

Sometimes Nagios hangs, and does not accept external commands etc; it's
necessary to kill -9 the process.  Has anyone else seen this?  Next time
I'll try and get a coredump of the process.

-- 
Dave Horsfall DTM VK2KFU  Ph: +61 2 9552-5509 (direct) +61 2 9552-5500
(switch)
Corinthian Eng'ng P/L, Ste 54 Jones Bay Whf, 26-32 Pirrama Rd, Pyrmont
2009, AU



Re: [Nagios-users] Nagios gets stuck

2008-09-04 Thread Jonathan Call

I am running the default scheduler (SCHED_4BSD) with SMP. 

I have one box running FreeBSD 7.0-amd64 and three others running
FreeBSD 6.3-i386 in a distributed model. The FreeBSD 6.3 boxes had
issues in the past with service checks hanging but once Nagios was
libmapped to libthr instead of libpthread those issues went away.
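For reference, that kind of remapping is done through /etc/libmap.conf. The helper below only prints a candidate stanza for review; the function name libmap_stanza is made up, and both the binary path and the library version suffixes are assumptions that should be checked with ldd on the actual system before anything is added to libmap.conf.

```shell
#!/bin/sh
# Hypothetical helper: print a libmap.conf stanza that remaps libpthread to
# libthr for a single binary.
#  $1 = full path to the Nagios binary (assumption; check the local install)
libmap_stanza() {
    printf '[%s]\n' "$1"
    # Version suffixes are assumptions; verify them with ldd on the binary.
    printf 'libpthread.so.2\tlibthr.so.2\n'
    printf 'libpthread.so\tlibthr.so\n'
}

# Usage (review the output, then add it to /etc/libmap.conf by hand):
#   libmap_stanza /usr/local/bin/nagios
```

Scoping the mapping to one binary leaves every other threaded program on the system untouched.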

I've been tempted to try that on the amd64 system but I'm waiting for
Nagios to hang/fail again.

Jonathan

-Original Message-
From: Sean McAfee [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 04, 2008 8:16 AM
To: Jonathan Call
Cc: Dave Horsfall; Nagios Users
Subject: Re: [Nagios-users] Nagios gets stuck

Jonathan Call wrote:
 Yes I have. And it is very annoying. A service check goes defunct and
 the thread hangs, which makes Nagios hang. The defunct service check,
 its thread parent remain as unkillable zombies until the server is
 rebooted.

 No one has offered any sort of solution other than Have you tried
 Nagios 3? (Which I have not)

 Jonathan

Jonathan, are you running BSD as well? 

If so, what scheduler are you using? How about you Dave?

Sean McAfee
System Engineer

Collaborative Fusion, Inc.
 [EMAIL PROTECTED]
 412-422-3463 x 4025

5849 Forbes Avenue
Pittsburgh, PA 15217




[Nagios-users] FreeBSD 7 and Nagios 2.12

2008-08-15 Thread Jonathan Call

I'm running FreeBSD 7 (amd64 at that) and Nagios 2.12.

It ran great for about a month. And then today I found that Nagios had
stopped processing checks and there are a few unkillable processes
lingering.

I remember at least one other person posting something similar to this.
Has anyone found a solution?

Jonathan



Re: [Nagios-users] FreeBSD Nagios 2.12

2008-06-23 Thread Jonathan Call

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Marc Powell
 Sent: Sunday, June 22, 2008 10:31 AM
 To: nagios-user Mailinglist
 Subject: Re: [Nagios-users] FreeBSD Nagios 2.12

 On Jun 20, 2008, at 12:16 PM, Andrew D wrote:

  get_raw_command_line() start
  Input: check_dns
  clear_argv_macros() start
  clear_argv_macros() end
  find_command() start
  Output: $USER1$/check_dns -H www.yahoo.com -s $HOSTADDRESS$
  get_raw_command_line() end
  process_macros() start
  process_macros() end

  And that's where it stops and locks.

 Thanks. After this, nagios makes a call to the event broker,
 determines if the plugin is a perl plugin (to use ePN if enabled),
 then forks the check_command.

 Did you compile with the event broker? Is it enabled? Maybe try a
 compilation without the event broker/embedded perl interpreter.

 There is (was?) a known issue with FreeBSD related to pthreads that
 may be in play (Known Issues -
 http://nagios.sourceforge.net/docs/2_0/whatsnew.html
 , Google for 'nagios freebsd pthreads'). It does specifically relate
 to the forking of check processes and a hang. I do not recall what the
 current status of that issue is but remember chatter about it either
 here or on nagios-devel. Perhaps one of the other FreeBSD users can
 chime in on that. My feeling is that it was fixed or there's a
 workaround but I don't remember specifics.

 --
 Marc

There was a change made to the Nagios port in February where it opted
for libthr instead of libpthread when available. That was supposed to
make permanent/official a workaround that used /etc/libmap.conf to link
them instead:

[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

You could try using this libmap config or possibly reversing it. It may
be that FreeBSD 5 behaves better with libpthread? I've never used
FreeBSD 5 in any production environment so I'm just guessing.

Jonathan


[Nagios-users] FreeBSD 7.0-RELEASE and nagios 2.10

2008-03-14 Thread Jonathan Call

I did an upgrade of my two test Nagios boxes to FreeBSD 7.0. I rebuilt
all of the ports. I'm also still* using the libmap.conf options:

[nagios]        # Resolve fork/vfork issues with Nagios
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

Both of my test boxes now have zombie processes that do not respond to
any kill command:
nagios 46134  0.0  0.0     0     0  ??  Z    11:43PM   0:00.06 <defunct>
nagios 46133  0.0  0.0     0     8  ??  DE   11:43PM   0:00.02 /usr/local/bin/nagios -d /usr/local/etc/nagios/nagios.cfg

Has anyone else run into this issue? 

Jonathan Call

* The port Makefile was updated with the following:

USE_AUTOTOOLS=  autoconf:261 libltdl:15 -- Link with libthr when
available.  This should fix the CPU consumption problem.

So I don't know whether the libmap.conf entry is still necessary.



[Nagios-users] Is Embedded Perl + Nagios::Plugin worth it?

2008-02-05 Thread Jonathan Call

I've got one server that is getting tanked right now (load average in
the 50s). Is it worth rewriting the many Perl scripts I have to use
Embedded Perl and the Nagios::Plugin CPAN module?

I'm speaking in terms of performance and also in terms of future Nagios
releases/compatibility.

Jonathan



[Nagios-users] FreeBSD 6.3 and Nagios

2008-01-29 Thread Jonathan Call

I can't find it right now but someone sent out an email saying they
could not get Nagios 2.10 to run under FreeBSD 6.3. I've upgraded two
systems to FreeBSD 6.3, both are running Nagios 2.10 without any
problems. I didn't even have to recompile the port. I do have the
/etc/libmap.conf entries for libthr though. 

Jonathan




Re: [Nagios-users] nagios-process with 100% CPU after update to Nagios-2.10

2008-01-21 Thread Jonathan Call

I just installed Nagios 2.10 on a FreeBSD 6.3 server. It just has the
localhost config on it right now but it runs without any problems. 

 -Original Message-
 From: Bernd Kuhlen [mailto:[EMAIL PROTECTED]
 Sent: Friday, January 18, 2008 9:24 AM
 To: Jonathan Call
 Cc: Michael W. Lucas; nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] nagios-process with 100% CPU after
 update to Nagios-2.10

 Sorry, I didn't. I just wanted to make sure it has something to do with
 6.3.

 Bernd

 Jonathan Call wrote:
  Did you try my libmap suggestion? I'd be surprised to learn that
  something in FreeBSD 6.3 breaks Nagios. There just aren't that many
  changes. I'm building 6.3 right now to find out though.

  Jonathan

  -Original Message-
  From: [EMAIL PROTECTED] [mailto:nagios-users-
  [EMAIL PROTECTED] On Behalf Of Michael W. Lucas
  Sent: Thursday, January 17, 2008 3:01 PM
  To: Bernd Kuhlen
  Cc: nagios-users@lists.sourceforge.net
  Subject: Re: [Nagios-users] nagios-process with 100% CPU after
  update to Nagios-2.10

  On Thu, Jan 17, 2008 at 10:52:19PM +0100, Bernd Kuhlen wrote:

  Hi Jonathan

  I fixed it by rolling back to FreeBSD6.2, now Nagios is stable again.

  HELLO OUT THERE, PLEASE DO NOT TRY TO UPGRADE TO FREEBSD6.3 IF YOU'RE
  RUNNING NAGIOS! AT LEAST NOT AT THE MOMENT.

  Seems to be a serious bug.

  I'd definitely bring this up on the freebsd-stable mailing list, then.

  I'm running 2.10 on 6-stable and 8-current, no troubles.

  ==ml

  --
  Michael W. Lucas   [EMAIL PROTECTED], [EMAIL PROTECTED]
  http://www.BlackHelicopters.org/~mwlucas/
  Now Shipping: Absolute FreeBSD -- http://www.AbsoluteFreeBSD.com

  On 5/4/2007, the TSA kept 3 pairs of my soiled undies for security
  reasons.


Re: [Nagios-users] nagios-process with 100% CPU after update to Nagios-2.10

2008-01-18 Thread Jonathan Call

Did you try my libmap suggestion? I'd be surprised to learn that
something in FreeBSD 6.3 breaks Nagios. There just aren't that many
changes. I'm building 6.3 right now to find out though.

Jonathan


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Michael W. Lucas
 Sent: Thursday, January 17, 2008 3:01 PM
 To: Bernd Kuhlen
 Cc: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] nagios-process with 100% CPU after
 update to Nagios-2.10
 
 On Thu, Jan 17, 2008 at 10:52:19PM +0100, Bernd Kuhlen wrote:
  Hi Jonathan

  I fixed it by rolling back to FreeBSD6.2, now Nagios is stable again.

  HELLO OUT THERE, PLEASE DO NOT TRY TO UPGRADE TO FREEBSD6.3 IF YOU'RE
  RUNNING NAGIOS! AT LEAST NOT AT THE MOMENT.

  Seems to be a serious bug.

 I'd definitely bring this up on the freebsd-stable mailing list, then.

 I'm running 2.10 on 6-stable and 8-current, no troubles.

 ==ml

 --
 Michael W. Lucas  [EMAIL PROTECTED], [EMAIL PROTECTED]
 http://www.BlackHelicopters.org/~mwlucas/
 Now Shipping: Absolute FreeBSD -- http://www.AbsoluteFreeBSD.com

 On 5/4/2007, the TSA kept 3 pairs of my soiled undies for security
 reasons.
 



Re: [Nagios-users] nagios-process with 100% CPU after update to Nagios-2.10

2008-01-17 Thread Jonathan Call

Sounds like the fork/vfork issue with FreeBSD's libpthread and Nagios.

The only solution I know of is to add the following to /etc/libmap.conf
and then do a stop/start of Nagios:

[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

This forces Nagios to use an alternative POSIX threads library instead
of FreeBSD's default thread library.
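
A quick way to confirm the mapping is in place is to parse the file and look at the relevant section. The following is only a minimal sketch: the function names are made up for illustration, the parser handles just the simple "section header plus two-column entries" layout shown above, and the section name is a parameter because some posters use [nagios] while others use the full binary path.

```python
"""Check whether libmap.conf text maps libpthread to libthr for Nagios."""


def parse_libmap(text):
    """Return {section: {from_lib: to_lib}}; global entries use section ''."""
    sections = {"": {}}
    current = ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            sections.setdefault(current, {})
            continue
        parts = line.split()
        if len(parts) == 2:  # "origin-library  target-library"
            sections[current][parts[0]] = parts[1]
    return sections


def nagios_uses_libthr(text, section="nagios"):
    """True if the section has libpthread entries and all point at libthr."""
    mapping = parse_libmap(text).get(section, {})
    pthread = {k: v for k, v in mapping.items() if k.startswith("libpthread.so")}
    return bool(pthread) and all(v.startswith("libthr.so") for v in pthread.values())


# Sample input copied from the stanza above.
SAMPLE = """\
[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so
"""

if __name__ == "__main__":
    print(nagios_uses_libthr(SAMPLE))
```

On a real system the text would come from reading /etc/libmap.conf. Remember that the mapping only takes effect after Nagios has been stopped and started again.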

Jonathan

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Bernd Kuhlen
 Sent: Thursday, January 17, 2008 5:34 AM
 To: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] nagios-process with 100% CPU after update
 to Nagios-2.10
 
 Hi Bernd
 
 oops, I sort of forgot the main thing. The actual problem:
 
 Whenever this process occurs it's like a denial-of-service attack. No
 checks are performed whatsoever. The only workaround (that I know) is
 to have a cronjob running once per minute finding and killing these
 jobs to make Nagios run properly again.
 
 
 - Bernd Kuhlen (bkuhlen)
 
 ---
 This thread is located in the archive at this URL:
 http://www.nagiosexchange.org/nagios-
 users.34.0.html?tx_maillisttofaq_pi1[showUid]=8408
 
 



Re: [Nagios-users] Processes hanging - Nagios 3rc1 on FreeBSD

2007-12-28 Thread Jonathan Call

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Alex French
 Sent: Friday, December 28, 2007 9:00 AM
 To: Chris Haulmark; nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Processes hanging - Nagios 3rc1 on FreeBSD

 On 28/12/2007, Chris Haulmark [EMAIL PROTECTED] wrote:

  I ran into this same problem with the beta versions.  I created a file
  as the workaround.  This bug has been reported but apparently is lost
  in the email archives with no official bug support.

  try this:

  add these lines to /etc/libmap.conf
  [/usr/local/bin/nagios]
  libpthread.so.2   libthr.so.2
  libpthread.so     libthr.so

 Yes, this fixed the problem for me (although in FreeBSD 5.4 I needed
 to use .so.1 rather than .so.2).

 Thanks very much for your assistance, I probably wouldn't have figured
 that one out :-)

 Is there any mechanism to report this as a bug?

 Alex

The FreeBSD port maintainer is already aware of the problems of Nagios2
vs. libpthread. The last time I talked with him (which was a long time
ago) he was debating if he wanted to add an option to link the Nagios
build directly to libthr or just put in a warning in the post install
text. I think the pending FreeBSD 7.0 release sort of left this decision
in limbo.

Jonathan


Re: [Nagios-users] How to monitor temp on Cisco 7200's?

2007-10-31 Thread Jonathan Call

Hello Patrick 

The Cisco 3560 and 3550 are very different from the Cisco 7200. You
cannot get an actual temperature value from them, just a temperature
state.

You'll need to use this SNMP Table:
ciscoEnvMonTemperatureState   .1.3.6.1.4.1.9.9.13.1.3.1.6
Possible states are: 
1:normal 2:warning 3:critical 4:shutdown 5:notPresent 6:notFunctioning

The 3560 and 3550 only have one thermal sensor, so the full OID is
.1.3.6.1.4.1.9.9.13.1.3.1.6.1
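
To turn that state value into a Nagios result, a plugin only needs a small lookup. The sketch below is illustrative, not an existing plugin: it assumes the integer has already been fetched over SNMP by some other means, and the mapping of states to exit codes (for example treating notPresent as UNKNOWN) is my own assumption that you may want to change.

```python
"""Translate a ciscoEnvMonTemperatureState value into a Nagios plugin result."""

# State names as listed above for ciscoEnvMonTemperatureState.
STATE_NAMES = {
    1: "normal",
    2: "warning",
    3: "critical",
    4: "shutdown",
    5: "notPresent",
    6: "notFunctioning",
}

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed policy (not from the MIB or this thread): adjust to taste.
STATE_TO_EXIT = {1: OK, 2: WARNING, 3: CRITICAL, 4: CRITICAL, 5: UNKNOWN, 6: UNKNOWN}


def temperature_result(state):
    """Return (exit_code, message) for an integer state value."""
    name = STATE_NAMES.get(state)
    if name is None:
        return UNKNOWN, "TEMPERATURE UNKNOWN - unexpected state %r" % (state,)
    code = STATE_TO_EXIT[state]
    label = ("OK", "WARNING", "CRITICAL", "UNKNOWN")[code]
    return code, "TEMPERATURE %s - sensor state is %s" % (label, name)


if __name__ == "__main__":
    # Hypothetical value, standing in for the result of an SNMP GET.
    code, message = temperature_result(1)
    print(message)
    raise SystemExit(code)
```

A real plugin would print the message and exit with the returned code, as the last lines do.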

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Patrick M.
 Sent: Wednesday, October 31, 2007 10:21 AM
 To: nagios-users@lists.sourceforge.net
 Subject: [Nagios-users] How to monitor temp on Cisco 7200's?
 
 Hi all,
 
 I was wondering if anyone had any recommendations on what plugins to
 use to check temperature for a Cisco 7200 router.  If possible I'd like
 to monitor temp on our other devices like our 3560 Catalyst switches as
 well as our 3550's.

 I tried looking on NagiosExchange but the plugins I found didn't
 monitor temp for these devices.  Has anyone else had any luck?
 
 Thanks in advance.
 
 Patrick
 



Re: [Nagios-users] Another Nagios Problem.

2007-10-17 Thread Jonathan Call

Are you aware of the fork/vfork issue between Nagios and the FreeBSD
pthread library? This may be causing your problem.

Try using these /etc/libmap.conf entries:
[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

You will need to restart Nagios for the settings to take effect.


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Fulton, David
 Sent: Wednesday, October 17, 2007 12:43 PM
 To: Hugo van der Kooij
 Cc: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Another Nagios Problem.
 
 I never said I wouldn't supply the coders with what they need. I would
 expect that those who coded it could point me in the right direction. I
 have thus far tried changing how I get my data into perfparse and
 changing my timeperiods so that there are no overlapping times (i.e.
 from 00:00 - 24:00 to 00:00-23:59). Since it always happens overnight
 and the problem crops up after midnight, I have to wait until then to
 get more data. Other than that I am using FreeBSD 6.2 with the latest
 plugins (1.4.10), nrpe, nsca and perfparse (the performance data is
 sent to files via a command definition that calls a perl script that
 writes it to a file. Perfparse picks it up via a cron job that runs
 every 5 minutes).

 The purpose of the nagios-users list is to obtain help when one gets
 stuck, not to have someone tell them how they should be able to do
 their own support. I have set up a complex piece of software and have
 been running it since version 3.0b1. To my knowledge, there are only so
 many sources of information that I could provide. Nagios doesn't stop,
 doesn't run a particular command. It simply starts orphaning check
 results after midnight every day. Turning on debugging does not give
 any indication as to why. If I truss (strace) the process it
 immediately spawns a new copy of itself that consumes all CPU time on
 whatever CPU it is running on without returning anything.
 

(snipped for brevity)


Re: [Nagios-users] Distributed monitoring Freshness checking failing then recovering

2007-10-16 Thread Jonathan Call

Sean;

I have a very large deployment so I use this tool:

http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

This daemon runs on each of the distributed servers while a normal nsca
daemon listens on the central server.
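
For anyone batching results by hand instead, it helps to know what send_nsca expects on standard input: one tab-delimited line per service check result (host name, service description, return code, plugin output). The sketch below only illustrates that input format. It is not how OCP_Daemon is implemented, and the host and service names are hypothetical.

```python
"""Format service check results in the tab-delimited form send_nsca reads."""


def nsca_line(host, service, return_code, output):
    """Build one service check result line for send_nsca's standard input."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0-3, got %r" % (return_code,))
    # Tabs and newlines separate fields and records, so collapse every run
    # of whitespace in the plugin output to a single space.
    clean = " ".join(output.split())
    return "%s\t%s\t%d\t%s\n" % (host, service, return_code, clean)


def nsca_batch(results):
    """Join many (host, service, return_code, output) tuples into one payload."""
    return "".join(nsca_line(*r) for r in results)


if __name__ == "__main__":
    # Hypothetical results collected on a distributed server.
    batch = nsca_batch([
        ("web01", "HTTP", 0, "HTTP OK - 0.12s response"),
        ("db01", "Disk /var", 1, "DISK WARNING - 12% free"),
    ])
    print(batch, end="")
```

Sending many results through a single send_nsca invocation, rather than one process per result, is the main point of the bulk approach.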
 
Jonathan

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Sean McAvoy
 Sent: Monday, October 15, 2007 12:09 PM
 To: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Distributed monitoring Freshness
 checking failing then recovering
 
 On further investigations it looks as though the problem is with the
 time taken to submit the results back to nagios via send_nsca.
 I have read about a couple different options for getting results back
 quickly. One being a bulk system of transfer, a file containing the
 results is sent via a send_nsca bulk transfer executed via cron. The
 other being a system that makes use of the performance data output
 option on the remote nagios systems and submits the results using a
 custom daemon on both ends.
 Does anybody know of any other options? Also, are there any guides to
 setting up either of these options? Most of what I have read is email
 threads.
 Thanks.
 
 On 12-Oct-07, at 12:40 PM, Sean McAvoy wrote:
 
  Hello,
  I have 1 central nagios system with 5 distributed servers. I have
  enabled freshness checking on both central and remote systems. I am
  constantly seeing services go to unknown status for 1-3 minutes and
  then recover.
  on the remotes I have:
  check_service_freshness=1
  service_freshness_check_interval=10
  check_host_freshness=1
  host_freshness_check_interval=60
  service_inter_check_delay_method=s
  max_service_check_spread=10
  service_interleave_factor=1
  host_inter_check_delay_method=s
  max_host_check_spread=30
  max_concurrent_checks=0
 
  It does appear as though checks are being run in parallel. I'm
  wondering how I can best determine where the problem is: with the
  execution of checks, submittal to the central system, or other.
  Thanks.
 
 
  _sean
 
 
 
 Sean McAvoy
 NOC Acting Team Lead
 Afilias Canada
 
 P. 416.673.4194
 
 
 
 



[Nagios-users] Config management - Reinventing the wheel

2007-06-22 Thread Jonathan Call

I currently use nagiosweb (http://sourceforge.net/projects/nagiosweb/)
to maintain the Nagios configuration for a central server in MySQL.
Based on certain host groups, I want to generate configuration files
for the distributed Nagios servers that feed that central server.

Has anyone written code (for example, in Perl) to generate distributed
Nagios 2.x configuration files based on a central Nagios server's
configuration that is stored in a MySQL database? It doesn't have to be
nagiosweb. I believe that any db schema would be easy enough to adapt.

I thought I would ask to see if I could avoid reinventing the wheel.
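
To make the idea concrete, here is a rough sketch of the kind of generator I have in mind. Everything in it is an assumption for illustration: the row layout does not reflect the nagiosweb schema, the database query is replaced by an in-memory list, and "generic-host" is simply a template name I picked.

```python
"""Render Nagios host definitions for the hosts in selected host groups."""

HOST_TEMPLATE = """define host{
    use         %(use)s
    host_name   %(host_name)s
    alias       %(alias)s
    address     %(address)s
    }
"""


def hosts_for_groups(rows, groups):
    """Keep rows whose 'hostgroup' is one of the wanted groups."""
    wanted = set(groups)
    return [row for row in rows if row["hostgroup"] in wanted]


def render_hosts(rows, use="generic-host"):
    """Return config text with one define host{} block per row, sorted by name."""
    blocks = []
    for row in sorted(rows, key=lambda r: r["host_name"]):
        fields = dict(row, use=use)  # copy the row and add the template name
        blocks.append(HOST_TEMPLATE % fields)
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical rows, standing in for a query against the config database.
    rows = [
        {"host_name": "web01", "alias": "Web 1", "address": "192.0.2.10", "hostgroup": "dc-east"},
        {"host_name": "db01", "alias": "DB 1", "address": "192.0.2.20", "hostgroup": "dc-west"},
    ]
    print(render_hosts(hosts_for_groups(rows, ["dc-east"])))
```

Each distributed server would get the output for its own host groups, with service definitions generated the same way.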

Jonathan


Re: [Nagios-users] Problems with FreeBSD and Nagios

2007-06-20 Thread Jonathan Call

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Douglas K. Rand
 Sent: Tuesday, June 19, 2007 3:16 PM
 To: Kyle Sexton
 Cc: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Problems with FreeBSD and Nagios

 Doug The following entry in /etc/libmap.conf has, for us, solved the
 Doug issue of runaway Nagios processes.

 Doug [nagios]
 Doug libpthread.so.2 libthr.so.2
 Doug libpthread.so   libthr.so

 Doug This is on FreeBSD 6.2.

 Kyle Was there a recompile or anything necessary?

 No. You do have to stop and restart the nagios process after the
 edit. A restart via the web interface is not sufficient. libmap.conf
 is a runtime configuration.

It's been about 24 hours since I implemented this dependency mapping on
one of my more heavily used FreeBSD 6.2/Nagios 2.9 servers. I have not
had any problems with child processes and my load average actually
dropped from around 7.5 to 4.

I'll give it a week or two before I declare it a complete success, but
it has been great so far!

Jonathan


Re: [Nagios-users] Problems with FreeBSD and Nagios

2007-06-19 Thread Jonathan Call

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:nagios-users-
 [EMAIL PROTECTED] On Behalf Of Michael W. Lucas
 Sent: Tuesday, June 19, 2007 5:16 AM
 To: Kyle Sexton
 Cc: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Problems with FreeBSD and Nagios

 On Mon, Jun 18, 2007 at 06:42:18PM -0500, Kyle Sexton wrote:
  On 12/14/06, Andreas Ericsson [EMAIL PROTECTED] wrote:
   Jonathan Call wrote:

Given your ideas and some google work I seem to have found my problem:

http://lists.freebsd.org/pipermail/freebsd-hackers/2005-August/013247.html

Not a pretty discussion. :(

   Nope. Definitely not.

   The problem for Nagios is that threading was added after the fact so
   nagios actually breaks some of the *strong* recommendations on what to
   do and what not to do in a threaded application after a fork().

   The problem for *BSD and their thread implementation of the thread
   library is that Nagios actually works everywhere but on *BSD, and it
   *often* works there too, but not always. This often-but-not-always is
   usually a sign of a broken implementation, although exactly
   often-but-not-always is a sign of the errors you'll run into when you
   do what Nagios does post-fork().

   I don't know of any other program that has the same problem on *BSD, but
   it would be interesting to see if there's a common pattern so one can
   pinpoint the exact pattern that causes the lock contention and races. It
   would, from a practical point of view, be best to patch it in the
   library, as that is a fix that would work for all possible future
   problems as well, although it's technically more correct to fix it in
   Nagios.

   Ugly discussion indeed.

I'll try using a non SMP kernel to see it might help. If it doesn't this
pretty much renders Nagios useless on FreeBSD. (Which makes me wonder
why they even bother maintaining it in ports?)

   Out of curiousity, do you use passive checks, active checks or a mix of
   both in your setup?

  Was there ever a solution found to this problem?

No.
I was forced to implement a distributed model and limit the service
checks to fewer than 1000 on a server. Even then I still have to run a
cron job that checks for nagios children that are spinning on the CPU as
a result of this fork issue.

I've found that somewhere after 1500+ service checks there will be a
random weekly event that causes almost a hundred nagios checks to hit
this fork issue all at the same time and promptly tank the FreeBSD
server.
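
For anyone wanting to reproduce the cron job mentioned above, one rough way to flag spinning children is sketched below. This is not the script actually used on these servers. It is a minimal Python sketch that assumes `ps` accepts the `pid,pcpu,time,command` output keywords, that CPU times look like `M:SS.hh` or `H:MM:SS`, and that any non-main nagios process above the (arbitrary) thresholds is stuck. The function names and thresholds are hypothetical.

```python
import subprocess


def parse_cpu_minutes(time_field):
    """Convert a ps CPU time such as '727:37.12' or '1:02:03' to minutes."""
    seconds = 0.0
    for part in time_field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds / 60.0


def find_spinning(ps_rows, main_pid, min_cpu_pct=25.0, min_cpu_minutes=10.0):
    """Return PIDs of nagios processes that look stuck on the CPU.

    ps_rows: lines of 'PID %CPU TIME COMMAND' with the header removed.
    A healthy check child lives for seconds, so a nagios process other
    than the main daemon with minutes of CPU time is treated as suspect.
    """
    suspects = []
    for row in ps_rows:
        fields = row.split(None, 3)
        if len(fields) < 4:
            continue
        pid, pcpu, cpu_time, command = fields
        if "nagios" not in command or int(pid) == main_pid:
            continue
        if (float(pcpu) >= min_cpu_pct
                and parse_cpu_minutes(cpu_time) >= min_cpu_minutes):
            suspects.append(int(pid))
    return suspects


def list_processes():
    """Run ps and return its rows without the header line."""
    out = subprocess.run(["ps", "axo", "pid,pcpu,time,command"],
                         capture_output=True, text=True, check=True).stdout
    return out.splitlines()[1:]
```

A cron entry could call `find_spinning(list_processes(), main_pid)` every few minutes and log or kill whatever it returns; where the main PID comes from (for example the daemon's lock file) is left to the local setup.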

 Skimming the (long) discussion thread, my first thought is to try
 libthr instead of libkse.  The discussion seems to be on 5.x, I'd
 definitely try libthr on 6.x.  Check libmap.conf for details.

Are you referring to this type of mapping within /etc/libmap.conf?

[/usr/local/bin/nagios]
 libpthread.so.2 libthr.so.2
 libpthread.so   libthr.so

If so I'd be willing to try it on my FreeBSD 6.2 server.
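
Before restarting Nagios it can help to confirm that the entry is well formed. The following is a minimal Python sketch of a checker, assuming only the basic libmap.conf layout used above: an optional `[constraint]` header followed by `origin target` pairs, with `#` starting a comment. The function name `parse_libmap` is hypothetical, and the sketch does not cover every feature described in libmap.conf(5).

```python
def parse_libmap(text):
    """Parse libmap.conf-style text into {constraint: {origin: target}}.

    Mappings listed before any [constraint] header apply to all
    binaries and are stored under the key None.
    """
    mappings = {}
    section = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"malformed libmap line: {raw!r}")
        origin, target = parts
        mappings.setdefault(section, {})[origin] = target
    return mappings
```

Feeding it the three lines above should yield a single constraint, `/usr/local/bin/nagios`, with both `libpthread` names mapped to their `libthr` counterparts.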

Jonathan


Re: [Nagios-users] Passive monitoring is running slow?

2007-05-02 Thread Jonathan Call

 -Original Message-
 From: Thomas Guyot-Sionnest [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, May 01, 2007 4:29 PM
 To: Jonathan Call
 Cc: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Passive monitoring is running slow?

 On 01/05/07 05:15 PM, Jonathan Call wrote:
  I have set up a distributed monitoring system per the Nagios
  documentation.

  I initially tested it out by having the distributed server monitor only
  24 or so services on about 8 hosts. There didn't seem to be any problems.

  I then cranked it up to 427 services on 81 hosts. I'm watching the
  distributed server right now and there is hardly any system load but the
  Service Check Latency seems extremely high:

  Metric                  Min.        Max.         Average
  Check Execution Time:   0.05 sec    1.67 sec     0.701 sec
  Check Latency:          60.40 sec   287.36 sec   184.514 sec
  Percent State Change:   0.00%       0.00%        0.00%

  This is resulting in 50% or less of the service checks completing in the
  5 minutes or less timeframe.

  The Central server has had no significant change in performance at all
  and seems to be receiving and processing everything without difficulty.

  The nsca server on the central server is running with the following
  arguments:
  /usr/local/sbin/nsca --daemon -c /usr/local/etc/nsca.cfg

  The submit_check_result script on the distributed server is right out of
  the documentation.

 There are many ways to do that; my favorite (obviously since I wrote it
 :) ) is using the host and service performance data files as named
 pipes, and having a daemon reaping them and batch-sending data to
 send_nsca.

 The howto is here (and I'll be more than happy to answer your questions
 or get your feedback):

 http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

 It will require Libevent and the Perl module Event::Lib.

 Thomas

So this is a known design failure in Nagios then? I'm fairly new to
Nagios and I am completely dumbfounded at this. If you can't service
even a quarter (and probably not even a tenth) of the number of hosts and
services on a distributed server that you can on a regular active server,
then what is the point of having a distributed model at all?

I will take a look at your batch sending method.
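
The batching idea can be illustrated with a short sketch. This is not the OCP_Daemon itself (which is written in Perl); it is a minimal Python illustration that groups results and hands each group to a single send_nsca invocation instead of spawning one per check. It relies on send_nsca's tab-delimited stdin format and its `-H` and `-c` options; the binary and config paths, the batch size of 50, and the function names are assumptions.

```python
import subprocess


def format_result(host, service, return_code, output):
    """One service result as a tab-delimited send_nsca input line."""
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


def batches(results, size=50):
    """Yield payload strings holding up to `size` results each."""
    chunk = []
    for result in results:
        chunk.append(format_result(*result))
        if len(chunk) == size:
            yield "".join(chunk)
            chunk = []
    if chunk:
        yield "".join(chunk)


def send_batch(payload, central_host,
               binary="/usr/local/sbin/send_nsca",       # assumed path
               config="/usr/local/etc/send_nsca.cfg"):   # assumed path
    """Pipe one batch of results into a single send_nsca process."""
    subprocess.run([binary, "-H", central_host, "-c", config],
                   input=payload, text=True, check=True)
```

With this shape the cost of starting send_nsca and opening a connection is paid once per batch rather than once per service check.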


[Nagios-users] Passive monitoring is running slow?

2007-05-01 Thread Jonathan Call

I have set up a distributed monitoring system per the Nagios documentation.

I initially tested it out by having the distributed server monitor only 24 or 
so services on about 8 hosts. There didn't seem to be any problems.

I then cranked it up to 427 services on 81 hosts. I'm watching the distributed 
server right now and there is hardly any system load but the Service Check 
Latency seems extremely high:

Metric                  Min.        Max.         Average
Check Execution Time:   0.05 sec    1.67 sec     0.701 sec
Check Latency:          60.40 sec   287.36 sec   184.514 sec
Percent State Change:   0.00%       0.00%        0.00%

This is resulting in 50% or less of the service checks completing in the 5 
minutes or less timeframe.

The Central server has had no significant change in performance at all and 
seems to be receiving and processing everything without difficulty.

The nsca server on the central server is running with the following arguments:
/usr/local/sbin/nsca --daemon -c /usr/local/etc/nsca.cfg

The submit_check_result script on the distributed server is right out of the 
documentation.

Encryption within nsca has been reduced to simple XOR with a password.

Is there any way to optimize the send_nsca features or is that high of a 
Service Check Latency not a big deal? 
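
The figures above allow a quick back-of-envelope check. The Python sketch below assumes a 5-minute check interval (taken from the timeframe mentioned above) and, purely as a hypothesis not confirmed in this message, that some per-result step such as submitting the result is handled one at a time. The per-result cost values passed in are illustrative, not measured.

```python
def per_check_budget(services, interval_s=300.0):
    """Seconds available per result if results are handled one at a time."""
    return interval_s / services


def max_services(per_result_cost_s, interval_s=300.0):
    """Services that fit in one interval at a given serialized cost each."""
    return int(interval_s // per_result_cost_s)


budget = per_check_budget(427)   # roughly 0.70 seconds per result
ceiling = max_services(0.5)      # 600 services at a hypothetical 0.5 s each
```

If the serialized cost per result approaches the budget of about 0.70 seconds, checks would start to queue and latency would climb even while system load stays low.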

Jonathan


[Nagios-users] Breakdown of status.cgi?

2007-03-23 Thread Jonathan Call

Is there some documentation somewhere that breaks down the possible
variables and options available to status.cgi?

For example, what are all the possible binary operands for
servicestatustypes or style?

I'm trying to create a Current Network Status view that will be more
appropriate for NOC people than the Tactical Overview page.
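
As a starting point, the `servicestatustypes` argument is a bitmask, so a view is selected by adding state values together. The sketch below is a minimal Python helper under these assumptions: the bit values are the ones recalled from the Nagios CGI headers (pending 1, OK 2, warning 4, unknown 8, critical 16) and should be verified against the installed version; the base URL and the `status_url` name are hypothetical.

```python
from urllib.parse import urlencode

# Service state bits as recalled from the Nagios CGI headers;
# verify against your installed version before relying on them.
SERVICE_PENDING = 1
SERVICE_OK = 2
SERVICE_WARNING = 4
SERVICE_UNKNOWN = 8
SERVICE_CRITICAL = 16


def status_url(base, service_states, host="all", style="detail"):
    """Build a status.cgi URL showing only the given service states."""
    mask = 0
    for state in service_states:
        mask |= state
    query = urlencode({"host": host, "style": style,
                       "servicestatustypes": mask})
    return f"{base}?{query}"


# Example: all problem states (warning + unknown + critical).
problems = status_url(
    "http://nagios.example.com/nagios/cgi-bin/status.cgi",  # hypothetical
    [SERVICE_WARNING, SERVICE_UNKNOWN, SERVICE_CRITICAL],
)
```

Combining warning, unknown, and critical gives a mask of 28, which matches the query string used by the stock "Service Problems" link.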


Jonathan

-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

[Nagios-users] Preferred configuration utility?

2007-01-19 Thread Jonathan Call

Looking over nagiosexchange I see several web based configuration
management utilities. 

I've been using NagiosWeb; it's simple, somewhat immature, but
effective. It lacks the ability to import configurations, which has
become more important now that I'm moving my Nagios deployment to a
distributed model.

I looked at Fruity, but its forums point out a showstopper problem with
importing and exporting configurations (especially contacts).

Monarch has also been mentioned but their website appears to be
non-functional on sourceforge right now.

Has anyone found one that stands out among the others?

Jonathan Call
Network Engineer - NTT/Verio



Re: [Nagios-users] Problems with FreeBSD and Nagios

2006-12-14 Thread Jonathan Call

nagios# gdb --pid=$74056
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
/var/spool/nagios/rw/ is not a core dump: File format not recognized
(gdb) bt
No stack.
(gdb)

Given your ideas and some google work I seem to have found my problem: 

http://lists.freebsd.org/pipermail/freebsd-hackers/2005-August/013247.html

Not a pretty discussion. :(

I'll try using a non-SMP kernel to see if it might help. If it doesn't,
this pretty much renders Nagios useless on FreeBSD. (Which makes me wonder
why they even bother maintaining it in ports?)


 -Original Message-
 From: Andreas Ericsson [mailto:[EMAIL PROTECTED]
 Sent: Thursday, December 14, 2006 2:26 AM
 To: Jonathan Call
 Cc: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] Problems with FreeBSD and Nagios
 
 Jonathan Call wrote:
  I scanned the mailing list trying to find a solution for this. I found a
  brief discussion where someone had the same problem but there was
  nothing really discussed what was potentially wrong.
 
  My system:
  Dual 2.8GHz P4 processors
  4GB of RAM
  FreeBSD 6.1-RELEASE-p10
 
  Running processes:
  Nagios 2.6 (installed from ports without embedded perl or nanosleep)
  One mysqld process for the nagiosweb utility
  A few NSCA daemon processes for passive checking
  A backup tool daemon
  Apache+modssl (latest from ports)
  Basic FreeBSD services (sshd, sendmail, etc.)
 
  Problem:
  Random service and host check control processes will lock up and 'spin'
  on the CPU. This is really bad when a host check does it because it
  brings all checks to a halt. It doesn't seem to even notice that all
  checks have gone stale.

  It will look like this in top:

    PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
  94068 nagios      1 116    0  7500K  6748K CPU2   0 727:37 30.15% nagios
  94082 nagios      1 116    0  7500K  6748K CPU2   0 734:28 32.55% nagios
  94104 nagios      1 116    0  7500K  6748K CPU2   0 845:21 37.42% nagios
  75338 nagios      5  20    0  7500K  6776K kserel 0  91:33  0.00% nagios

  In this example the main nagios pid is 75338. The hung service and/or
  host processes are the other ones.

  The service checks are almost entirely custom scripts, but the host
  check is a standard check_ping that comes with the nagios program.

  Any ideas on how to figure out which service or host check is hung? Or
  how to deal with this problem at all?
 
 
 Host and service checks going into infinite loops wouldn't show up as
 Nagios processes in CPU spinlock, as the nagios check execution children
 just sit around and wait for the child to finish (or 60 seconds to pass
 in default config, before it kills it off).

 You've found a bug in Nagios which most likely was either introduced in
 the port of it, or is a result of library differences between FreeBSD
 and Linux.

 I wouldn't be all too surprised if it turns out that the FreeBSD pthread
 implementation disallows something that the Linux version allows. Note
 that this doesn't necessarily have to be a bug; Nagios doesn't use the
 pthread ABI in a way that is explicitly stated as safe, but the pthread
 implementation on Linux and most other unices are forgiving enough to
 make it work anyway.

 It's also possible that this bug only triggers on dual-CPU systems with
 a particular library installed, as some kinds of timing and
 race-conditions just doesn't happen on single-CPU systems.
 
 What happens if you do
 
 $ gdb --pid=$(pidof spinning-nagios-process)
 (gdb) bt
 
 ?
 
 --
 Andreas Ericsson   [EMAIL PROTECTED]
 OP5 AB www.op5.se
 Tel: +46 8-230225  Fax: +46 8-230231


[Nagios-users] Problems with FreeBSD and Nagios

2006-12-13 Thread Jonathan Call

I scanned the mailing list trying to find a solution for this. I found a
brief discussion where someone had the same problem, but nothing was
really said about what was potentially wrong.

My system: 
Dual 2.8GHz P4 processors
4GB of RAM
FreeBSD 6.1-RELEASE-p10

Running processes:
Nagios 2.6 (installed from ports without embedded perl or nanosleep)
One mysqld process for the nagiosweb utility
A few NSCA daemon processes for passive checking
A backup tool daemon
Apache+modssl (latest from ports)
Basic FreeBSD services (sshd, sendmail, etc.)

Problem:
Random service and host check control processes will lock up and 'spin'
on the CPU. This is really bad when a host check does it because it
brings all checks to a halt. It doesn't seem to even notice that all
checks have gone stale.

It will look like this in top:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
94068 nagios      1 116    0  7500K  6748K CPU2   0 727:37 30.15% nagios
94082 nagios      1 116    0  7500K  6748K CPU2   0 734:28 32.55% nagios
94104 nagios      1 116    0  7500K  6748K CPU2   0 845:21 37.42% nagios
75338 nagios      5  20    0  7500K  6776K kserel 0  91:33  0.00% nagios

In this example the main nagios pid is 75338. The hung service and/or
host processes are the other ones.

The service checks are almost entirely custom scripts, but the host
check is a standard check_ping that comes with the nagios program.

Any ideas on how to figure out which service or host check is hung? Or
how to deal with this problem at all?

Jonathan Call
Network Engineer - NTT/Verio
801.437.7476



