Hi folks, I don't know if I'm just trying to push Ganglia to more than it can handle or if I'm doing something wrong, but no matter how I design my Ganglia structure, gmetad seems to always crush the machine where it runs. Here's an overview of my environment:

Ganglia 2.5.4
All hosts involved are running RedHat 7.2
RRDtool version 1.0.45

I have 16 subnets, each with 200 machines give or take a few. I estimate around 3000 nodes total. Some of these are dual P3, some are single P4, and a few are random Xeon and Itanium nodes. Every node is running gmond, and that's running fine.

Each subnet has a "master" node that is a dual P3 1.3GHz. This box provides DNS, NIS, and static DHCP for the subnet. Normal load on these machines is very, very minimal.

My first attempt was to set up a single dedicated Ganglia machine running gmetad, Apache, and the web frontend. In this machine's gmetad.conf file, I listed each of the "master" nodes in the subnets as data sources. I thought having one box collect all the data and store the RRD files would be great. Well, this was a bad idea...the box (a P4 with 2GB RAM) was absolutely crushed. Load shot up to 8.5, and all the graphs continually had gaps in them.

So my next attempt was to install gmetad on each of the "master" nodes. Each of these gmetads collects data for its own subnet, and another gmetad on my Ganglia web machine just talks to these 16 other gmetads. I don't really like now having to back up 16 machines, but I've had problems before with trying to store RRD files on an NFS mount, so I decided not to try that.

This isn't working all that great, either. The gmetad on these "master" nodes (collecting data from ~200 hosts each) is also causing a pretty high load: the boxes now stay at a load of around 2-3 all the time, and that sometimes slows down other operations on the box.

Am I doing something wrong, or is gmetad really this much of a resource hog? Is anyone else trying to use Ganglia to monitor 3000 machines? Am I asking too much?

Thanks for any insight.

Steve Gilbert
Unix Systems Administrator
[EMAIL PROTECTED]
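
P.S. To put rough numbers on the two layouts above, here is a back-of-envelope sketch of the RRD write rate each one implies. It assumes gmetad keeps one RRD file per metric per host and updates each file once per poll. The metrics-per-host count and the polling interval are guesses for illustration, not values measured on these machines, so treat the output as an order-of-magnitude estimate only.

```python
# Back-of-envelope estimate of RRD updates per second for gmetad.
# METRICS_PER_HOST and POLL_INTERVAL_S are assumptions, not measured values.

SUBNETS = 16              # from the setup described above
HOSTS_PER_SUBNET = 200    # "give or take a few"
METRICS_PER_HOST = 30     # assumption: roughly a default gmond metric set
POLL_INTERVAL_S = 15      # assumption: seconds between polls of a data source


def rrd_updates_per_second(hosts, metrics_per_host, poll_interval_s):
    """Assumes one RRD file per metric per host, each updated once per poll."""
    return hosts * metrics_per_host / poll_interval_s


# Layout 1: a single gmetad polling all 16 subnets.
central = rrd_updates_per_second(
    SUBNETS * HOSTS_PER_SUBNET, METRICS_PER_HOST, POLL_INTERVAL_S
)

# Layout 2: one gmetad per subnet "master" node.
per_master = rrd_updates_per_second(
    HOSTS_PER_SUBNET, METRICS_PER_HOST, POLL_INTERVAL_S
)

print(f"single central gmetad: ~{central:.0f} RRD updates/s")
print(f"per-subnet gmetad:     ~{per_master:.0f} RRD updates/s")
```

If those assumptions are anywhere near right, the per-subnet layout cuts the write rate on any one box by a factor of 16, but each "master" is still doing a few hundred small file updates every second on top of its normal duties.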

