Hi,
I have installed Ganglia for the first time on our smallish cluster. I
have installed the following x86_64 RPMs:
ganglia-gmetad-3.0.1-1, ganglia-gmond-3.0.1-1, ganglia-web-3.0.1-1
The initial site is running at: http://auriga.qut.edu.au/ganglia/
I am trying to configure Ganglia to have the head node as one 'cluster'
(called 'Head of Auriga'). This node also has gmetad and the web
interface running on it.
I am then trying to have the ten remaining nodes in another cluster
(called 'Auriga'). These nodes get rebooted and rebuilt often, so I
would like some kind of fallback so that the data is not lost when a
node is rebooted.
At the moment I am getting an 'unspecified' cluster, which contains
some hosts. This first happened when node001 was rebooted.
I have set up the following on Auriga (the head node):
/etc/gmond.conf (most of this file is unchanged):
cluster {
  name = "Head of Auriga"
}
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
}
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
tcp_accept_channel {
  port = 8649
}
/etc/gmetad.conf:
data_source "nodes" node001 node002 node003 node004 node005 node006 node007 node008 node009 node010
data_source "head" auriga.qut.edu.au
gridname "Auriga Cluster"
trusted_hosts 127.0.0.1
The following /etc/gmond.conf is on every other node (these run only gmond):
cluster {
  name = "Auriga"
}
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
}
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
tcp_accept_channel {
  port = 8649
}
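In case it helps with diagnosis, here is a minimal Python sketch (not
part of Ganglia) for checking which cluster name and hosts a given gmond
actually reports. It assumes gmond sends its XML state to any client that
connects to the tcp_accept_channel port (8649 above) and that the XML
uses CLUSTER and HOST elements with a NAME attribute. The function names
fetch_gmond_xml and clusters_in are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Connect to gmond's TCP accept channel and read until it closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the socket after sending the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def clusters_in(xml_bytes):
    """Map each CLUSTER NAME to the sorted HOST NAMEs listed under it."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for cluster in root.iter("CLUSTER"):
        hosts = sorted(h.get("NAME") for h in cluster.iter("HOST"))
        result[cluster.get("NAME")] = hosts
    return result
```

Calling clusters_in(fetch_gmond_xml("node001")) and then the same for
auriga.qut.edu.au should show whether each daemon reports the cluster
name from its own gmond.conf, and which hosts it has heard from on the
multicast channel.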
Can someone please help me configure Ganglia so that the cluster of
nodes reports properly?
Thanks,
Ashley
--
Ashley Wright
3864 9264
[EMAIL PROTECTED]
HPC and Research Support Group
Queensland University of Technology (QUT)