On Tue, 24 Sep 2002, Federico Sacerdoti wrote:
> Just by checking the number of disk interrupts we knew disk I/O was a 
> problem, not to mention the inconsistent-looking graphs. When we put the 

At the moment disk I/O isn't the problem, though I can see how it could
be once the rest of the systems get added.  What method are you using
for backing up the RRDs from tmpfs?  rsync?

> > I really hope you aren't mixing Linux, FreeBSD and IRIX nodes *WITHIN* 
> > the same cluster.  

That's _precisely_ what I'm doing.  

The compute clusters are Linux nodes with FreeBSD gateways.  The network
where I'm having trouble is the engineers' workstation network, which is
a grab bag of 32-bit, 64-bit, IRIX/Linux/*BSD machines (one of the
things I want to help with asap is getting OpenBSD to build).

I don't quite understand why this is (or needs to be) a problem.
Shouldn't the gmonds simply hash and multicast all the metrics they
receive, regardless of whether their own host is capable of storing a
given metric?  It seemed to work this way in principle with 2.4.1.  I
have a set of custom metrics (see my topusers.pl in the gmetric scripts)
that are per-user, and thus by nature not present on every machine.
For the most part these work great... it is a really good way to see
usage patterns across the network and to pin resource usage on the
users responsible, in a graph that the managers can understand.  I used
to use nasty, hackish perl scripts to create graphs from sar reports,
which were never as accurate anyway.  I much prefer ganglia in this
regard.
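To give a rough idea of the per-user approach: aggregate CPU usage per
user with ps/awk, then hand each total to gmetric so every user becomes
its own cluster metric.  This is only a sketch, not what topusers.pl
actually does; the cpu_user_<name> metric naming is invented for
illustration:

```shell
# Hedged sketch of publishing per-user CPU totals via gmetric.  The
# metric naming (cpu_user_<name>) is invented; with DRY_RUN=1 (the
# default here) the gmetric commands are printed rather than multicast.
DRY_RUN="${DRY_RUN:-1}"
cmds=$(
    ps -eo user=,pcpu= |
    awk '{cpu[$1] += $2} END {for (u in cpu) printf "%s %.1f\n", u, cpu[u]}' |
    while read -r user cpu; do
        echo "gmetric --name=cpu_user_$user --value=$cpu --type=float --units=%"
    done
)
if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "$cmds"      # inspect what would be published
else
    printf '%s\n' "$cmds" | sh # actually multicast via gmetric
fi
```

Dropped into cron, something like this gives you a rolling per-user
usage picture without any of the sar-report scraping.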

-ryan

-- 
Ryan Sweet <[EMAIL PROTECTED]>
Atos Origin Engineering Services
http://www.aoes.nl

