Hi,

I have installed Ganglia for the first time on our smallish cluster, using the following x86_64 RPMs:
ganglia-gmetad-3.0.1-1, ganglia-gmond-3.0.1-1, ganglia-web-3.0.1-1

The initial site is running at: http://auriga.qut.edu.au/ganglia/

I am trying to configure Ganglia so that the head node forms one 'cluster' on its own (called 'Head of Auriga'). This node also runs gmetad and the web interface. The ten remaining nodes should form a second cluster (called 'Auriga'). These nodes are rebooted and rebuilt often, so I would like some kind of fallback so that the data is not lost when a node is rebooted.

At the moment I am getting an 'unspecified' cluster that contains some of the hosts. This first happened when node001 was rebooted.

I have set up the following on Auriga (the head node):
/etc/gmond.conf (most of this file is unchanged):
cluster {
 name = "Head of Auriga"
}
udp_send_channel {
 mcast_join = 239.2.11.71
 port = 8649
}
udp_recv_channel {
 mcast_join = 239.2.11.71
 port = 8649
 bind = 239.2.11.71
}
tcp_accept_channel {
 port = 8649
}

/etc/gmetad.conf:
data_source "nodes" node001 node002 node003 node004 node005 node006 node007 node008 node009 node010
data_source "head" auriga.qut.edu.au
gridname "Auriga Cluster"
trusted_hosts 127.0.0.1


The following file is on every other node (only running gmond):
cluster {
 name = "Auriga"
}
udp_send_channel {
 mcast_join = 239.2.11.71
 port = 8649
}
udp_recv_channel {
 mcast_join = 239.2.11.71
 port = 8649
 bind = 239.2.11.71
}
tcp_accept_channel {
 port = 8649
}
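
To see which cluster name each gmond is actually reporting, one option is to read the XML that gmond serves on its tcp_accept_channel port (8649 in the configs above) and group the hosts by cluster. The following is only a minimal sketch: it assumes the dump contains CLUSTER elements with a NAME attribute wrapping HOST elements, and the function names fetch_gmond_xml and hosts_by_cluster are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump gmond writes on its tcp_accept_channel.

    gmond sends the whole document and then closes the connection,
    so we simply read until recv() returns no more data.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    # Return bytes so the parser can honour the XML encoding declaration.
    return b"".join(chunks)


def hosts_by_cluster(xml_data):
    """Map each CLUSTER NAME to the sorted list of HOST NAMEs inside it."""
    root = ET.fromstring(xml_data)
    result = {}
    for cluster in root.iter("CLUSTER"):
        name = cluster.get("NAME", "")
        hosts = [h.get("NAME", "") for h in cluster.iter("HOST")]
        result.setdefault(name, []).extend(hosts)
    return {name: sorted(hosts) for name, hosts in result.items()}


if __name__ == "__main__":
    # node001 is one of the compute nodes listed in gmetad.conf above.
    for cluster, hosts in hosts_by_cluster(fetch_gmond_xml("node001")).items():
        print(cluster, hosts)
```

Running this against the head node and against one of the compute nodes should show whether any hosts are listed under a cluster name other than the one configured on them.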


Can someone please help me configure Ganglia so that the cluster of nodes reports properly?

Thanks,
Ashley

--
Ashley Wright
3864 9264
[EMAIL PROTECTED]
HPC and Research Support Group
Queensland University of Technology (QUT)

