Re: [Nagios-users] Large Installation

2010-06-11 Thread Andreas Ericsson
On 06/10/2010 07:51 PM, Scott Ward wrote:
 We are looking to do a large installation of Nagios. Is it possible to
 monitor over 800 machines and over 14000 services?
 
 Has anyone tried doing anything like this? If you have, how successful was it
 and how did you configure it?
 

We have plenty of customers with far more than 1000 hosts. 800 should just be
a matter of running Nagios on decently beefy hardware. Don't attempt it
with a virtual system though. They have notoriously crappy performance with
multi-fork()'ing applications, and if you ever hit swap, they'll degrade
even further.

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

--
ThinkGeek and WIRED's GeekDad team up for the Ultimate 
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the 
lucky parental unit.  See the prize list and enter to win: 
http://p.sf.net/sfu/thinkgeek-promo
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Large Installation

2010-06-11 Thread Scott Ward
We are going to be using distributed monitoring for sure.  We just cannot
decide whether we should use NDO to write directly to the database or use
NSCA to send back to the master server.  Any suggestions?

Is there a frontend that actually uses the information in an NDO db? From
what I've read it looks like the default Nagios front end uses text files.

~Scott Ward


On Fri, Jun 11, 2010 at 4:48 AM, Martin Melin nag...@martinmelin.com wrote:

 On Thu, Jun 10, 2010 at 21:55, Kevin Keane subscript...@kkeane.com wrote:

 Config file maintenance can be improved to some extent with careful design
 of the config files, as well as tools. It is an issue that I am running into
 with a relatively small installation with 80+ hosts and 400+ services. My
 installation is highly heterogeneous and very dynamic, which makes config
 file maintenance a nightmare. Having to restart Nagios after a configuration
 change doesn’t help either. On the other hand, a network with 2000 identical
 machines is probably going to be much easier to manage than my type of
 network.

 Nitpicking or helpful tip, you decide: Nagios reloads config changes on
 SIGHUP, you don't have to do a restart. A full restart can take a while on a
 sufficiently sized installation so having to do one for every change would
 indeed be a PITA, but I've never seen a reload take more than a few seconds.

 Cheers
 Martin
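
For anyone who wants to script the reload Martin describes, here is a minimal sketch. It assumes a Unix host and that the daemon's PID is stored in the file named by the lock_file directive in nagios.cfg; the path you pass in depends on your install, and the function names are made up for illustration. It is probably wise to verify the configuration (nagios -v) before signalling a production daemon.

```python
import os
import signal


def read_pid(lock_file_path):
    """Return the PID stored in a Nagios-style lock file (a single integer)."""
    with open(lock_file_path) as handle:
        return int(handle.read().strip())


def reload_nagios(lock_file_path, send_signal=os.kill):
    """Send SIGHUP to the running daemon so it re-reads its configuration.

    send_signal is injectable only so the function can be exercised
    without a real Nagios process; by default it is os.kill.
    """
    pid = read_pid(lock_file_path)
    send_signal(pid, signal.SIGHUP)
    return pid
```

Usage would look like reload_nagios("/path/to/nagios.lock"), run as a user allowed to signal the Nagios process.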




Re: [Nagios-users] Large Installation

2010-06-11 Thread Andreas Ericsson
On 06/11/2010 03:04 PM, Scott Ward wrote:
 We are going to be using distributed monitoring for sure.  We just cannot
 decide whether we should use NDO to write directly to the database or use
 NSCA to send back to the master server.  Any suggestions?
 
 Is there a frontend that actually uses the information in an NDO db? From
 what I've read it looks like the default Nagios front end uses text files.
 

Unless you desperately need performance data from satellite systems
handled properly, I'd invite you to give Merlin and Ninja a try.




Re: [Nagios-users] NDOUTILS - Duplicate lines for each service check in servicechecks table (ndoutils 1.4b9)

2010-06-11 Thread shadih rahman
Can someone point me to the location of the patch?  I have searched the
nagios-devel mailing list but could not find it.  Thanks in advance.

On Tue, Nov 10, 2009 at 6:09 PM, Michael Friedrich 
michael.friedr...@univie.ac.at wrote:



 Øyvind Nordang wrote:
  Duplicate lines for each service check in servicechecks table
  (ndoutils 1.4b9)
 
  I have:
  Nagios 3.2.0
  NDOutils 1.4b9
 
  Is this a bug or a feature?
 I've attached the patch to nagios-devel, dunno when it will be fixed
 then. But I have analyzed more queries, and there are several other
 tables missing unique constraints and therefore causing duplicate rows -
 e.g. systemcommands while testing perfdata output ...
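
To illustrate why a missing unique constraint leads to duplicate rows, here is a small sketch. It is not the NDOUtils schema or the patch mentioned above: the table is cut down to a few illustrative columns, and it uses SQLite from the Python standard library in place of the MySQL database NDOUtils normally writes to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE servicechecks (
        instance_id        INTEGER NOT NULL,
        service_object_id  INTEGER NOT NULL,
        start_time         TEXT    NOT NULL,
        start_time_usec    INTEGER NOT NULL,
        output             TEXT,
        -- Without this constraint every repeated write adds another row.
        UNIQUE (instance_id, service_object_id, start_time, start_time_usec)
    )
    """
)


def record_check(row):
    """Store a check result; a row with the same key is replaced, not duplicated."""
    conn.execute("INSERT OR REPLACE INTO servicechecks VALUES (?, ?, ?, ?, ?)", row)


def count_rows():
    """Return how many rows the table currently holds."""
    return conn.execute("SELECT COUNT(*) FROM servicechecks").fetchone()[0]
```

Calling record_check twice with the same key leaves a single row; dropping the UNIQUE clause would leave two.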

 Maybe another patch will follow, but first I will focus on Icinga :-)

 Kind regards,
 Michael








-- 
Cordially,
Shadhin Rahman

Re: [Nagios-users] Merlin/Ninja perfdata status?

2010-06-11 Thread Frost, Mark {PBC}
 -Original Message-
 From: Andreas Ericsson [mailto:a...@op5.se] 
 Sent: Friday, June 11, 2010 9:29 AM
 To: Nagios Users List
 Subject: Re: [Nagios-users] Large Installation
 
 
 Unless you desperately need performance data from satellite systems
 handled properly, I'd invite you to give Merlin and Ninja a try.

Andreas,

We're planning on a Nagios refresh/rearchitecture near the end of this year
and I'm really hopeful that we might be able to move to Ninja/Merlin as they
do a lot of things we'd really like to have.  They also solve some issues we
have with our current distributed system.

I've been trying to pay attention to the latest developments in this area, but
I may have missed something as changes are happening quickly.

We do, however, rely pretty heavily on performance data.  I think I saw someone
had a hack to do it with Merlin, but it's not really part of Merlin right now,
which makes me not want to adopt it for a production Nagios installation.

I recall a sort of Merlin roadmap for the rest of the year indicating that
upcoming work was to better support distributed setups, if I remember
correctly.  Is there also work afoot to get perfdata into Merlin, perhaps with
the next release?

I'm trying to build some test systems to try the current version of
Merlin/Ninja to assess how production ready it might be for us by the end of
the year when we need to make a decision.

Thanks very much for all the hard work you and others at Op5 have put in to
these tools.

Mark



Re: [Nagios-users] Large Installation

2010-06-11 Thread Romain Le Merlus
Hi Scott,

You can also try Centreon software to manage your different pollers and
configuration:
http://www.centreon.com

Here is an overview of how it works:
http://en.doc.centreon.com/CentreonArchitecture

To see what it looks like, here is a web demo:
http://demo.centreon.com

Best regards.
-- 
Romain LE MERLUS

rlemer...@merethis.com
Tel. +33 (0)1 49 69 97 12
Mob. +33(0)6 85 05 02 82

[Nagios-users] Nagios Postemsg

2010-06-11 Thread steve f

Hello All,

I am currently looking for an alternative to using Tivoli, TEC & postemsg for 
a rather large (6000+) remote environment.

I have had great success with Nagios in my small local/remote test environment 
and the obvious cost savings without having TEC anymore is huge.

Can I use the existing postemsg tests that are running on the boxes and via I 
guess External Commands have Nagios process the messages?
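
One way this could work is to turn each event into a passive service check and write it to the Nagios external command file. The sketch below shows only the Nagios side, using the PROCESS_SERVICE_CHECK_RESULT external command. How postemsg arguments map onto host, service and return code is not covered, and the function names are hypothetical. It assumes external commands and passive service checks are enabled in nagios.cfg and that a matching host and service are defined.

```python
import time


def passive_service_result(host, service, return_code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows plugin conventions: 0=OK, 1=WARNING,
    2=CRITICAL, 3=UNKNOWN.
    """
    timestamp = int(time.time() if now is None else now)
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}\n".format(
        timestamp, host, service, return_code, output
    )


def submit(command_file_path, line):
    """Write the command line to the external command file (a named pipe)."""
    with open(command_file_path, "w") as pipe:
        pipe.write(line)
```

The path passed to submit is whatever the command_file directive in nagios.cfg points to on your install.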

For those familiar with both Tivoli & Nagios, is there anything that Tivoli 
gives me that I can't do with Nagios?  I don't see it if there is.


Thanks for the help,

Steve  
  

Re: [Nagios-users] Large Installation

2010-06-11 Thread Andreas Ericsson
On 06/11/2010 04:08 PM, Scott Ward wrote:
 How does Merlin compare with NDO in terms of resource usage?
 

merlin is fairly lightweight. What little memory it uses resides
primarily on the stack and fits well inside the stack of 1MiB.

Here's the output of ps wwaux | grep merlin on a master system
with two connected pollers. As you can see, grep consumes more
memory than the merlin daemon does. This is with debug symbols
compiled in btw, so it will be roughly half that when it's built
for production.
root 12286  0.0  0.2  61116   660 pts/0    R+   17:29   0:00 grep -i merlin
root 23236  0.0  0.7  50572  1856 ?        S    13:56   0:01 /opt/monitor/op5/merlin/merlind -c /opt/monitor/op5/merlin/merlin.conf


As for CPU usage, it's definitely more lightweight than NDO. A
typical merlin daemon will basically idle away most of its time.
It's the database that does the heavy lifting after all, so it's
not that hard to make merlin itself lean and extremely quick.

As for storage-space, it doesn't use nearly as much as ndoutils
does, since we don't store the entire log and all status updates
in the database, but only the current status and statechanges,
where a statechange is defined as either the state has changed,
or the object went from soft to hard state, which is basically
all we need to make reports look good. Since the logfiles are
already partitioned by date, it was deemed a lot easier to write
a super-fast parser for those instead and make that parser able
to display html output. This is the helper we use in ninja, and
it's working extremely well, showing interesting logdata in a
matter of seconds.

It will grow over time of course, but while NDOUtils' database
can grow to tens of gigabytes in a matter of months for a large
network, merlin stores about 500MiB for a whole year for the same
size network.
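
The statechange rule described above can be sketched in a few lines. This is not Merlin's code, only an illustration of the filter as stated: keep an update if the state differs from the previous one, or if the object went from soft to hard state. The Status type and the function names are made up for the example.

```python
from collections import namedtuple

# state: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN; state_type: "SOFT" or "HARD"
Status = namedtuple("Status", ["state", "state_type"])


def is_statechange(previous, current):
    """True if the state changed or the object went from soft to hard state."""
    if previous.state != current.state:
        return True
    return previous.state_type == "SOFT" and current.state_type == "HARD"


def statechanges(history):
    """Keep only the updates a statechange-based store would record."""
    kept = []
    for previous, current in zip(history, history[1:]):
        if is_statechange(previous, current):
            kept.append(current)
    return kept
```

Repeated identical check results drop out entirely, which is where most of the storage saving over logging every status update would come from.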




Re: [Nagios-users] Nagios Postemsg

2010-06-11 Thread Max
The main things you will not get from Nagios that you almost always get with
Tivoli:
* High recurring licensing fees
* On-site Tivoli consultants

:)

Nagios does not give you out of the box the visualization dashboards that
Tivoli has but with Nagviz you can make very nice graphical dashboards
at a much, much lower cost to your company.

Nagios also does not do auto-discovery out of the box but there are projects
that give you that capability - again at a much lower cost.

Distributed Nagios - there are a few choices, you will need to take the time
to evaluate them all and choose the right one for you, but again, cost will
be lower than Tivoli.

The team I am on is building out a distributed architecture for Nagios based
on our unique requirements - self service model where many SAs can all
change configs on their schedule without our intervention, clustering, fast
redistribution of hosts/services across pollers, centralized transparent (to
the end user) command and control across all pollers.

We are using some existing tools (Nagios and Merlin) and 4 developers and
even then the TCO and maintenance cost will be orders of magnitude cheaper
than Tivoli with much more functionality than most Tivoli shops offer.

A polling model always has some challenges when it comes to scaling big but
compared to Tivoli I think you will find Nagios to be a lot more fun, a
lot more flexible, and a much better fit, and, if politics don't interfere, your
management should be much happier with a fixed cost development price tag
than the high $$ open ended maintenance costs of a commercial product like
Tivoli.

- Max




Re: [Nagios-users] Large Installation

2010-06-11 Thread Max
I can attest  / confirm what Andreas states about the merlin daemon.

BTW, Andreas, I just patched our code base to contain your 0.6.7 changes and
I will be posting that on Github for you and anyone else interested to check
out over the weekend.

Our tests so far are showing that with the Merlin NEB and daemon on a poller
we lose less than 10% capacity on the poller compared to the poller without
the NEB module and Merlind - our test poller is running 10k active service
checks and 1k active host checks in less than 5 minutes with polling
headroom to spare.
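
As a rough back-of-the-envelope check on those figures, the sketch below converts them into a sustained check rate. Since the run finished in less than 5 minutes, the result is a lower bound. The second helper estimates how long one full pass over a given number of checks would take at that rate; it assumes throughput scales linearly, which real pollers may not do.

```python
def checks_per_second(service_checks, host_checks, window_seconds):
    """Average check rate needed to finish all checks inside the window."""
    return (service_checks + host_checks) / float(window_seconds)


def seconds_for_one_pass(total_checks, rate):
    """Time for one full pass at a given rate (assumes linear scaling)."""
    return total_checks / rate


# 10k service checks + 1k host checks in (at most) 5 minutes
rate = checks_per_second(10000, 1000, 5 * 60)
print("at least %.1f checks/second" % rate)  # about 36.7
```

For example, seconds_for_one_pass(14000 + 800, rate) gives a ballpark for the installation size asked about at the start of this thread, under the same linear assumption.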

- Max

Re: [Nagios-users] Large Installation

2010-06-11 Thread Max
Our changes to Merlin allow N pollers to all write to the same database
without conflicts.

[Nagios-users] Merlin/Ninja

2010-06-11 Thread Giorgio Zarrelli
Well,

Talking about Ninja, I installed it on a Debian Lenny box. The
installation process seemed a bit buggy and I see some problems, such as
with scheduling scripts, but I find Ninja a useful tool.



Re: [Nagios-users] Large Installation

2010-06-11 Thread Kevin Keane
If you aren't actually using the data from NDO, there is little point in
creating the DB.

I would probably not use NDO to write directly from the satellites. Here is why:

- Double the network traffic. The satellites have to send check results AND
  database writes.

- Less reliable. How would you keep the master server from writing the same
  information to the DB that a satellite has just written, and messing up the
  data?

- NDO can be a serious performance bottleneck; you wouldn't want your
  satellites to be a potential point of failure in terms of performance.

- If the satellites are behind a firewall, it may not even be possible to
  write directly to the DB.
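
For the NSCA route, what a satellite sends back is small. The sketch below builds the tab-delimited lines that send_nsca reads on standard input for passive service checks (host, service description, return code, plugin output). The function names are made up for illustration, and actually piping the result into send_nsca with your master's address and config file is left out.

```python
def nsca_service_line(host, service, return_code, output):
    """One send_nsca input line for a passive service check (tab-delimited)."""
    return "\t".join([host, service, str(return_code), output]) + "\n"


def nsca_batch(results):
    """Join several (host, service, return_code, output) tuples into one payload."""
    return "".join(nsca_service_line(*result) for result in results)
```

The master then only has to accept passive results, so nothing on the satellite needs direct access to the database.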

From: Scott Ward [mailto:13.sward...@gmail.com]
Sent: Friday, June 11, 2010 6:05 AM
To: Nagios Users List
Subject: Re: [Nagios-users] Large Installation

We are going to be using distributed monitoring for sure.  We just cannot 
decide whether we should use NDO to write directly to the database or use NSCA 
to send back to the master server.  Any suggestions?

Is there a frontend that actually uses the information in an NDO db? From what 
I've read it looks like the default Nagios front end uses text files.

~Scott Ward


