Hi All,
I have Ganglia installed on a cluster [1] with 384 nodes, and I just
upgraded from version 2.5.7 to version 3.0. Two nodes run gmetad
(node247 and node248), and now I cannot see these two nodes in the web
front-end or in the XML output (the rrd files are also not being updated).
The only difference between the gmond configuration on the other nodes
and the one on node247/node248 is the multicast interface:
# diff gmond.conf gmond.conf.node247
22c22
< mcast_if = "eth0"
---
> mcast_if = "eth1"
30c30
< mcast_if = "eth0"
---
> mcast_if = "eth1"
The part about multicast is the following:
/* channel to send multicast on mcast_channel:mcast_port */
udp_send_channel {
mcast_join = "239.2.11.71"
port = "8649"
mcast_if = "eth0"
}
/* channel to receive multicast from mcast_channel:mcast_port */
udp_recv_channel {
mcast_join = "239.2.11.71"
port = "8649"
bind = "239.2.11.71"
mcast_if = "eth0"
}
/* channel to export xml on xml_port */
tcp_accept_channel {
port = "8649"
}
All other nodes are listed, for example:
# telnet localhost 8649 | grep node246
<HOST NAME="node246.clx.cineca.it" IP="10.10.12.246"
REPORTED="1108568138" TN="10" TMAX="20" DMAX="3600"
LOCATION="unspecified" GMOND_STARTED="1108499041">
But there is nothing about node247 and node248. I also checked with
tcpdump, and the other nodes in the cluster do not see any multicast
traffic from node247 and node248:
node001:~ # tcpdump -i eth0| grep node246
tcpdump: listening on eth0
16:51:59.126916 node246.clx.cineca.it.32774 > 239.2.11.71.8649: udp 8 (DF) [ttl 1]
16:51:59.126959 node246.clx.cineca.it.32774 > 239.2.11.71.8649: udp 8 (DF) [ttl 1]
16:51:59.127120 node246.clx.cineca.it.32774 > 239.2.11.71.8649: udp 8 (DF) [ttl 1]
node001:~ # tcpdump -i eth0| grep node247
tcpdump: listening on eth0
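To check the XML output for every node at once, rather than grepping for one
name at a time, a small script along these lines could be used. This is only a
sketch: it assumes the nodes are named node001 through node384 under
clx.cineca.it (extrapolated from the names shown above), and that gmond writes
its XML dump on port 8649 and then closes the connection. The function names
are purely illustrative.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host="localhost", port=8649, timeout=10.0):
    """Read the XML dump from gmond's tcp_accept_channel as raw bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:  # assumption: gmond closes the socket after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def reported_hosts(xml_data):
    """Return the set of NAME attributes of all HOST elements in the dump."""
    root = ET.fromstring(xml_data)
    return {host.get("NAME") for host in root.iter("HOST")}


def missing_hosts(xml_data, expected):
    """Return the expected host names that do not appear in the dump, sorted."""
    return sorted(set(expected) - reported_hosts(xml_data))


if __name__ == "__main__":
    # Assumed naming scheme: node001 ... node384 in the clx.cineca.it domain.
    expected = ["node%03d.clx.cineca.it" % n for n in range(1, 385)]
    for name in missing_hosts(read_gmond_xml(), expected):
        print(name)
```

Run on any node with a working gmond, this should print only the hosts that
are absent from the XML (here, presumably node247 and node248).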
Am I missing something, or is this a bug?
[1] SUSE SLES 8 SP3 - kernel 2.4.21-266-smp - rrdtool 1.0.39-168
Best Regards
--
Andrea Capriotti
System Management Group - Cineca - www.cineca.it
[EMAIL PROTECTED] - Tel +39 051 6171890