On Tue, 24 Sep 2002, Federico Sacerdoti wrote:
> Just by checking the number of disk interrupts we knew disk I/O was a
> problem, not to mention the inconsistent-looking graphs. When we put the

At the moment disk I/O isn't the problem, though I see how it could
become one once the rest of the systems get added. What method are you
using for backing up the rrds from tmpfs? rsync?
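For the sake of discussion, this is the sort of thing I had in mind; it
is only a sketch, and the paths and schedule are my guesses, not
anything from your actual setup:

    #!/usr/bin/perl -w
    # sketch: copy the rrds out of tmpfs onto persistent disk (run from cron)
    use strict;
    my $src = '/mnt/tmpfs/rrds/';        # assumed tmpfs mount point
    my $dst = '/var/lib/ganglia/rrds/';  # assumed on-disk backup location
    # -a preserves perms/times; the trailing slashes sync directory contents
    system('rsync', '-a', $src, $dst) == 0
        or warn "rsync exited with status $?\n";

Run out of cron every few minutes, something like that would at least
bound how much history gets lost when the box reboots and tmpfs is
wiped.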
> > I really hope you aren't mixing Linux, FreeBSD and IRIX nodes *WITHIN*
> > the same cluster.

That's _precisely_ what I'm doing. The compute clusters are Linux nodes
with FreeBSD gateways. Then the network where I'm having trouble is the
workstation network for the engineers, which is a grab bag of 32-bit and
64-bit IRIX/Linux/*BSD machines (one of the things I want to help with
asap is getting OpenBSD to build).

I don't quite understand why this is (or needs to be) a problem.
Shouldn't the gmonds just hash and multicast all the metrics they
receive, regardless of whether it is a metric that its own host is
capable of storing? It seemed to work this way, in principle, with
2.4.1.

I have a set of custom metrics (see my topusers.pl in the gmetric
scripts) that are per-user, and thus by nature not present on every
machine. These for the most part work great... it is a really good way
to see usage patterns across the network and to pin resource usage on
the users responsible, in a graph that the managers can understand.
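For anyone following along, here is a stripped-down sketch of the same
idea; it is not the real topusers.pl, and the gmetric flags
(-n/-v/-t/-u) are from memory, so check them against your version:

    #!/usr/bin/perl -w
    # sketch: sum %CPU per user from ps, push one gmetric per user
    use strict;
    my %cpu;
    # ps output flags vary a bit by OS; this is the procps/SysV form
    open(PS, 'ps -eo user,pcpu |') or die "ps: $!\n";
    my $hdr = <PS>;                # discard the header line
    while (<PS>) {
        my ($user, $pcpu) = split;
        $cpu{$user} += $pcpu;
    }
    close PS;
    for my $user (sort keys %cpu) {
        # one metric per user; gmond carries it like any built-in metric
        system('gmetric', '-n', "cpu_user_$user",
               '-v', sprintf('%.1f', $cpu{$user}),
               '-t', 'float', '-u', 'percent');
    }

As far as I understand it, gmetric just injects the value into the same
channel as the built-in metrics, so nothing here should care what OS or
word size the reporting host has.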
I used to use nasty hackish perl scripts to create graphs from sar
reports, which were never as accurate anyway. I much prefer ganglia in
this regard.

-ryan

--
Ryan Sweet <[EMAIL PROTECTED]>
Atos Origin Engineering Services
http://www.aoes.nl