Your gmetad data_source should not list every gmond in your cluster.  In
multicast mode, which is the default, every gmond talks to every other gmond
and stores metrics for the entire cluster.  This means that gmetad only needs
to poll one gmond node to get the metrics for the entire cluster.  The only
reason to include more than one address in the data_source is for failover.
If the first gmond goes down, gmetad falls back to the next address listed
and still reports for the whole cluster.  Even in unicast mode, where all of
the gmond nodes send to a single gmond node rather than to every other node,
your data_source should reference only the master gmond node.
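
As a rough sketch only: assuming the gmond on the head node and the one on
192.168.129.2 both hold the full cluster state (which nodes you pick is my
assumption, adjust to your setup), a data_source line like this should be
enough:

    data_source "Ratashasta" localhost:8649 192.168.129.2:8649

Here localhost is polled first, and 192.168.129.2 is only there as a
failover address.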

Brad

>>> On 11/26/2008 at 6:51 AM, in message <[EMAIL PROTECTED]>, Johann
Spies <[EMAIL PROTECTED]> wrote:
> I have not had much success in getting Ganglia to work on a cluster
> of 22 computers.
> 
> It is an OpenSuse system and I have installed gmond on all the nodes
> and it is running on all the nodes.
> 
> On the head-node I have the following in /etc/ganglia/metad.conf:
> 
> data_source "Ratashasta" localhost:8649 192.168.129.2:8649 
> 192.168.129.3:8649 192.168.129.4:8649 \
>  192.168.129.5:8649 192.168.129.6:8649 192.168.129.7:8649  
> 192.168.129.8:8649 192.168.129.9:8649 \
>  192.168.129.10:8649 192.168.129.11:8649 192.168.129.12:8649 \
> 192.168.129.13:8649 192.168.129.14:8649 192.168.129.15:8649 \
> 192.168.129.16:8649 192.168.129.17:8649 192.168.129.18:8649 \
> 
> and
> 
> trusted_hosts 127.0.0.1 192.168.129.2 192.168.129.3 192.168.129.4 \
> 192.168.129.5 192.168.129.6 192.168.129.7 192.168.129.8 192.168.129.9 \
>  192.168.129.10 192.168.129.11 192.168.129.12 \
> 192.168.129.13 192.168.129.14 192.168.129.15 \
> 192.168.129.16 192.168.129.17 192.168.129.18 \
> 192.168.129.19 192.168.129.20 192.168.129.21 192.168.129.22
> 
> When I run  sudo gstat -a
> The result is:
> 
> 
> CLUSTER INFORMATION
>        Name: Ratashta
>       Hosts: 1
> Gexec Hosts: 0
>  Dead Hosts: 0
>   Localtime: Wed Nov 26 15:48:15 2008
> 
> CLUSTER HOSTS
> Hostname                     LOAD                       CPU              Gexec
>  CPUs (Procs/Total) [     1,     5, 15min] [  User,  Nice, System, Idle, Wio]
> 
> head001.sun.ac.za
>     0 (    0/  289) [  0.04,  0.31,  0.52] [   0.1,   0.0,   0.0, 99.4,   0.4] OFF
> 
> I have changed the firewall through yast to allow udp traffic from
> 146.232.128.108/32 on port 8649 as well as from 192.168.129.0/24 (the
> nodes on the internal network).
> 
> I have gmond running on all the nodes.
> 
> I don't see that IPTABLES is blocking communication between the
> head-node and the other nodes, and I don't know what to look for next.
> 
> Any help will be appreciated.
> 
> 
> Regards
> Johann





_______________________________________________
Ganglia-general mailing list
https://lists.sourceforge.net/lists/listinfo/ganglia-general
