2009/6/22 Ben <[email protected]>
>
> Hi Lindsay,
>
> Thanks for that comprehensive answer.
>
> So collectd runs on each system itself, but I assume Nagios is centralised
> at some point, so where would be the most sensible place to do that? Is
> there ultra reliable hosting built for just that purpose?

I second Lindsay's recommendations for both - we played around with a
few "data collection" tools (mostly MRTG, but we also tried Cacti and
others) and settled on collectd (which I first saw in action at a SLUG
meeting). We have always used Nagios, and version 3 brought some
sanity to its configuration.

Collectd runs on each system and sends the data to a concentrator for
easy graphing and searching.
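As a sketch of that setup (hostnames and the address are made up; the
port is collectd's default), the network plugin forwards values from
each host to the concentrator:

```
# /etc/collectd.conf on each monitored host
LoadPlugin network
<Plugin network>
  Server "192.0.2.10" "25826"   # the concentrator
</Plugin>

# /etc/collectd.conf on the concentrator
LoadPlugin network
LoadPlugin rrdtool
<Plugin network>
  Listen "0.0.0.0" "25826"      # accept values from the clients
</Plugin>
```

The concentrator then writes everything into RRD files, which is what
you point your graphing front-end at.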

Nagios, by contrast, generally probes the monitored systems actively
from the central server. There is a Nagios NRPE agent
(http://nagios.sourceforge.net/docs/3_0/addons.html#nrpe) which can be
used to add local checks.
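For example (hostnames, paths and thresholds here are illustrative,
not from our setup), NRPE exposes a local check on the monitored host
and the central Nagios server calls it via check_nrpe:

```
# /etc/nagios/nrpe.cfg on the monitored host
allowed_hosts=192.0.2.10
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /

# on the Nagios server
define command {
    command_name  check_nrpe
    command_line  $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
define service {
    use                  generic-service
    host_name            web01
    service_description  Disk /
    check_command        check_nrpe!check_disk
}
```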

We currently monitor over 100 hosts with over 200 services on them.
About half of them are virtual servers, and that share is growing.

I'd also suggest that you consider introducing a configuration
management system at an early stage. We use Puppet not just to set up
servers but also to manage the monitoring systems and the
infrastructure that kickstarts the server setup (we use CentOS 5,
which comes with Xen 3.0 with backports). It's an iterative process
and sometimes frustrating when you have legacy systems to take care
of, but the rewards are enormous.
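To give a feel for it (this is a made-up minimal manifest, not our
actual module), a Puppet class can install the NRPE agent, keep its
config in sync, and restart the service when the config changes:

```
# hypothetical modules/nagios/manifests/nrpe.pp
class nagios::nrpe {
  package { "nrpe":
    ensure => installed,
  }
  file { "/etc/nagios/nrpe.cfg":
    source  => "puppet:///modules/nagios/nrpe.cfg",
    require => Package["nrpe"],
    notify  => Service["nrpe"],
  }
  service { "nrpe":
    ensure  => running,
    enable  => true,
    require => Package["nrpe"],
  }
}
```

Once the monitoring agents themselves are puppetised like this, adding
a new host to both collectd and Nagios is one class include rather
than a pile of hand edits.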

Cheers,

--Amos
-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html