Thanks Paul! I ran a test today: I commented out all but one of my larger clusters, one that had been totally useless. The result was surprising: everything is fine. Clearly gmetad is the problem. Whether that is the result of some OS performance issue, such as an I/O bottleneck, or something in the code, I don't know. Based on Ganglia's history I tend to think this is likely my own performance issue.
If I get things sorted out I'll report back. Time for DTrace to charge to the rescue yet again. ;)

benr.

Paul Choi wrote:
> We use unicast here and we have over 160 hosts monitored, distributed
> among 17 clusters of different sizes. We haven't seen scalability issues
> having to do with unicast. (but I did move the rrds to tmpfs to lower
> the load on the host because the host does other things)
>
> I've seen some hosts flap up and down sometimes because of invalid
> entries in /etc/hosts.
>
>
> Ben Rockwood wrote:
>> I'm having some trouble with my unicast Ganglia setup. I have a single
>> gmetad watching 8 different clusters, clusters range from 5 nodes to 55
>> nodes. Small clusters operate as expected. Larger clusters of 20 nodes
>> upwards flag a lot of nodes down.
>>
>> The strange thing is that if you just refresh the gmeta cluster page
>> you'll see nodes going up and down in a sine fashion... 15 down 10 up,
>> refresh, 14 down 11 up, refresh, 13 down 12 up, [...].
>>
>> I'm thinking that I've hit a scaling threshold but none of the
>> processes, gmetad or gmond's, seem to struggle much. I've tuned the
>> server thread count on gmetad up. In all the setups there is a single
>> unicast head node to which all the other gmond's in the cluster report
>> to and is poked by gmetad.
>>
>> I saw an issue like this in the archives several years ago (something
>> about TN) but nothing seemed helpful.
>>
>> Can anyone offer suggestions on where to look to better understand this
>> issue or share similar experiences?
>>
>> benr.

_______________________________________________
Ganglia-general mailing list
https://lists.sourceforge.net/lists/listinfo/ganglia-general

