Your gmetad data_source should not be trying to talk to every gmond in your cluster. In multicast mode, which is the default, every gmond talks to every other gmond and stores metrics for the entire cluster. This means that gmetad only needs to poll one gmond node to get the metrics for the entire cluster. The only reason to include more than one address in the data_source is for failover. If the primary gmond goes down, the secondary address in the data_source will pick up and report for the rest of the cluster. Even in unicast mode, where all of the gmond nodes talk to a single gmond node rather than to every other node, your data_source should only reference the master gmond node.
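As a rough sketch of what that could look like for the setup quoted below: the cluster name and the two addresses are taken from the original message, but the choice of 192.168.129.2 as the failover node is only an assumption for illustration. Any node whose gmond holds the full cluster state would do.

```
# gmetad configuration fragment (sketch, not a tested config)
# First address is polled; the second is tried only if the first fails.
# Picking 192.168.129.2 as the failover node is an assumption.
data_source "Ratashasta" localhost:8649 192.168.129.2:8649
```

With this, gmetad polls the local gmond and only falls back to the second address when the first does not answer.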
Brad

>>> On 11/26/2008 at 6:51 AM, in message <[EMAIL PROTECTED]>, Johann Spies <[EMAIL PROTECTED]> wrote:
> I do not have a lot of success in getting Ganglia to work on cluster
> of 22 computers.
>
> It is an OpenSuse system and I have installed gmond on all the nodes
> and it is running an all the nodes.
>
> On the head-node I have the following in /etc/ganglia/metad.conf:
>
> data_source "Ratashasta" localhost:8649 192.168.129.2:8649
> 192.168.129.3:8649 192.168.129.4:8649 \
> 192.168.129.5:8649 192.168.129.6:8649 192.168.129.7:8649
> 192.168.129.8:8649 192.168.129.9:8649 \
> 192.168.129.10:8649 192.168.129.11:8649 192.168.129.12:8649 \
> 192.168.129.13:8649 192.168.129.14:8649 192.168.129.15:8649 \
> 192.168.129.16:8649 192.168.129.17:8649 192.168.129.18:8649 \
>
> and
>
> trusted_hosts 127.0.0.1 192.168.129.2 192.168.129.3 192.168.129.4 \
> 192.168.129.5 192.168.129.6 192.168.129.7 192.168.129.8 192.168.129.9 \
> 192.168.129.10 192.168.129.11 192.168.129.12 \
> 192.168.129.13 192.168.129.14 192.168.129.15 \
> 192.168.129.16 192.168.129.17 192.168.129.18 \
> 192.168.129.19 192.168.129.20 192.168.129.21 192.168.129.22
>
> When I run sudo gstat -a
> The result is:
>
>
> CLUSTER INFORMATION
> Name: Ratashta
> Hosts: 1
> Gexec Hosts: 0
> Dead Hosts: 0
> Localtime: Wed Nov 26 15:48:15 2008
>
> CLUSTER HOSTS
> Hostname LOAD CPU Gexec
> CPUs (Procs/Total) [ 1, 5, 15min] [ User, Nice, System, Idle, Wio]
>
> head001.sun.ac.za
> 0 ( 0/ 289) [ 0.04, 0.31, 0.52] [ 0.1, 0.0, 0.0, 99.4, 0.4] OFF
>
> I have changed the firewall through yast to allow udp traffic from
> 146.232.128.108/32 on port 8649 as well as from 192.168.129.0/24 (the
> nodes on internal network.
>
> I have gmond running an all the nodes.
>
> I don't see that IPTABLES is blocking communication between the
> head-node and the other and I don't know what to look for next.
>
> Any help will be appreciated.
>
>
> Regards
> Johann

