On Wed, 14 May 2003, Douglas Eadline wrote:

> I just upgraded ganglia to version 2.5.3. on a 300 node cluster I have
> seen gmetad slowly load the monitor node so that it is almost useless.
> i.e. load of about 8-9 or higher, when I shutdown gmetad
> everything returns to normal
>
> Has anyone seen anything like this?

I've seen the same thing when tracking upwards of 400 CPUs. With a "flat" scheme, where all the clusters report to one gmetad/web frontend, it does a bit better and the machine is otherwise still usable. If I arrange a few gmetads hierarchically and have them all report to this single gmetad, the machine starts dropping data and the problems become more severe.

By "problems" I mean a lack of responsiveness to interactive processes. Things frequently stall for a few seconds, and I'm not yet sure whether it's lock contention or something else.

> The nodes and monitoring node are P3-1.26 (dual) with 2GB memory.

Mine is a dual XP2000+ with 4GB RAM. Imagine if I tried using the dual PPro-200 that I have just sitting around instead. :)

Ken

--
Ken MacInnis - kmacinni at umich dot edu

