Hi,

I'm wondering what the suggested setup is for a large Grid. I'm having trouble scaling ganglia to work on large clusters.
Consider the following:

- A mostly default gmond.conf distributed to all cluster members.
- 20 clusters of 64 nodes. Gmond runs on each cluster node, plus on the cluster head node.
- gmond and gmetad run on the "admin" node, which has the Grid defined in gmetad and polls the information from each of the cluster head nodes.

The gmetad configuration has been modified to store the RRDs in /dev/shm, but this directory gets very large, so I'd like to move away from that. Switching the RRD directory back to the default breaks things. As soon as gmetad gets through its first round of grabbing metrics from the head nodes, the machine starts writing a lot of small updates to disk, completely consuming the machine:

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy  id wa st
 3  0  17140 641292 453944 2009608    0    0     0     0 5092 20394 19 27  55  0  0
 3  0  17140 632364 453952 2009600    0    0     0     0 6358 17128 24 32  44  0  0
 1  0  17140 631744 453952 2009600    0    0     0     0 2579  7545  7 11  82  0  0
 0  0  17140 629264 453952 2009600    0    0     0     0 2099 11337  7  7  86  0  0
 0  0  17140 629264 453952 2009600    0    0     0     0  351   855  0  0 100  0  0
 0  1  17140 629264 453952 2009600    0    0     0  3456  986   793  0  0  59 41  0
 0  1  17140 629264 453952 2009600    0    0     0  3332 1159   897  0  0  50 50  0
 0  1  17140 629280 453952 2009600    0    0     0  3072 1019   814  0  0  50 50  0
 0  1  17140 629280 453952 2009600    0    0     0  1792  771   886  0  0  50 50  0
 0  1  17140 629280 453952 2009600    0    0     0  1284  588   761  0  0  50 50  0
 0  2  17140 629280 453952 2009600    0    0     0  1536  676   890  0  0  38 61  0
 0  2  17140 629296 453952 2009600    0    0     0  1280  613   763  0  0  50 50  0
 0  2  17140 629296 453952 2009600    0    0     0  2048  825   887  0  0  50 50  0
forever more...

Is there a way I should be architecting the configuration files to make ganglia scale to this cluster? I think I want to run gmetad on each head node and use that RRD data without regenerating it on the admin node. Is that possible?
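For a sense of scale, here is a rough back-of-the-envelope sketch of the RRD write load this layout could put on the admin node. The cluster and node counts come from the setup described above; the metrics-per-host figure and the polling interval are assumptions for illustration only, not measured values, and per-cluster summary RRDs are ignored.

```python
# Rough estimate of how many RRD files a single gmetad would maintain
# and how often it would touch them. Illustrative only.

CLUSTERS = 20            # from the setup described in this post
NODES_PER_CLUSTER = 64   # compute nodes per cluster
HEAD_NODES = 1           # each cluster's head node also runs gmond
METRICS_PER_HOST = 30    # ASSUMPTION: typical count for a near-default gmond.conf
POLL_INTERVAL_S = 15     # ASSUMPTION: seconds between gmetad polls


def rrd_load(clusters, nodes, heads, metrics, interval_s):
    """Return (hosts, rrd_files, updates_per_second).

    Assumes one RRD file per metric per host, each updated once per poll.
    """
    hosts = clusters * (nodes + heads)
    rrd_files = hosts * metrics
    updates_per_second = rrd_files / interval_s
    return hosts, rrd_files, updates_per_second


if __name__ == "__main__":
    hosts, files, rate = rrd_load(
        CLUSTERS, NODES_PER_CLUSTER, HEAD_NODES, METRICS_PER_HOST, POLL_INTERVAL_S
    )
    print(f"hosts={hosts} rrd_files={files} updates_per_second={rate:.0f}")
```

Under these assumptions that is tens of thousands of small files, each receiving a small write on every poll, which would be consistent with the sustained "bo" and "wa" columns in the vmstat output.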
Thanks,
mh

