Hello All,
Once upon a time, I had a happy Ganglia monitor that was giving me valuable
data on all nodes of my 48-node cluster. Then I got a request from a user
to upgrade the kernel. After I upgraded the kernels across the cluster,
Ganglia could only see the data from the gmond running on the head node
(which also runs gmetad and httpd).
The cluster is running Red Hat 7.3 with kernel 2.4.9-34smp #1 SMP Sat Jun 1
05:54:57 EDT 2002 i686 unknown
My cluster has 46 compute nodes with one interface (eth0) and two head
nodes with two interfaces (eth0 and eth1): one for the private LAN and one
for the campus network. The head node that runs gmetad has
"mcast_if eth1" set in its gmond.conf file. Here's the /sbin/ifconfig
output for eth1 on the head node:
eth1      Link encap:Ethernet  HWaddr 00:40:F4:2A:6E:26
          inet addr:192.168.5.200  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:176581970 errors:0 dropped:0 overruns:0 frame:0
          TX packets:160905314 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0
          RX bytes:1187468116 (1132.4 Mb)  TX bytes:2350492219 (2241.6 Mb)
Can I trust the output of /sbin/ifconfig? That is, if /sbin/ifconfig shows
the MULTICAST flag on an interface, is that the real truth, or can the
kernel still suppress multicast transmissions?
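
One way to look past the interface flag, offered as a sketch rather than a definitive answer: on Linux, /proc/net/igmp lists the multicast groups the kernel has actually joined on each interface. The helper below (the name parse_igmp is hypothetical) parses that listing, assuming the usual layout in which each group address is printed as eight hex digits in the host's byte order. If gmond is still on Ganglia's default channel, 239.2.11.71, that address should show up under the interface gmond is using.

```python
import sys


def parse_igmp(text, byteorder=sys.byteorder):
    """Return {device: [group addresses]} from /proc/net/igmp-style text.

    Assumes device lines start in column 0 with a numeric index, and that
    group lines are indented and begin with an 8-digit hex address printed
    in the byte order of the machine that produced the file.
    """
    groups = {}
    device = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].isdigit() and len(fields) > 1 and not line[0].isspace():
            # Device line, e.g. "2  eth0  :  2  V2"
            device = fields[1].rstrip(":")
            groups[device] = []
        elif device is not None and len(fields[0]) == 8:
            # Group line, e.g. "    470B02EF  1 0:00000000  0"
            try:
                raw = int(fields[0], 16).to_bytes(4, byteorder)
            except ValueError:
                continue  # not a hex address, skip it
            groups[device].append(".".join(str(b) for b in raw))
    return groups


if __name__ == "__main__":
    with open("/proc/net/igmp") as f:
        for dev, addrs in parse_igmp(f.read()).items():
            print(dev, addrs)
```

Run on a compute node while gmond is up, this prints the joined groups per interface. A missing 239.2.11.71 entry would suggest the join itself is failing, rather than packets being dropped somewhere on the wire.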
As a test, I've been running gmond on one node in deaf debug mode (send
only) and on another node in mute debug mode (listen only). The deaf one is
pumping out data successfully, and the mute one is not seeing anything.
Since this test is compute node to compute node, only one interface (eth0)
is involved. There has to be something in the kernel config that is
screwing this up.
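
To take gmond out of the picture, the same deaf/mute test can be repeated with a bare UDP multicast sender and listener. The script below is a minimal sketch, not part of Ganglia: the file name mcast_test.py, the function names, and the command-line layout are all hypothetical. It assumes Ganglia's default channel (239.2.11.71) and port (8649); adjust both if gmond.conf says otherwise.

```python
import socket
import struct
import sys

GROUP = "239.2.11.71"  # Ganglia's default multicast channel (assumed unchanged)
PORT = 8649            # Ganglia's default port (assumed unchanged)


def membership_request(group, iface_ip="0.0.0.0"):
    """Pack an ip_mreq: group address followed by the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def listen(group=GROUP, port=PORT, iface_ip="0.0.0.0"):
    """Join the group and print the size and sender of every datagram received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, iface_ip))
    while True:
        data, sender = sock.recvfrom(65535)
        print("%d bytes from %s" % (len(data), sender[0]))


def send(message=b"multicast test", group=GROUP, port=PORT, iface_ip=None, ttl=1):
    """Send one datagram to the group, optionally forcing the outgoing interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", ttl))
    if iface_ip:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(iface_ip))
    sock.sendto(message, (group, port))
    sock.close()


if __name__ == "__main__":
    # Usage: python mcast_test.py listen|send [port]
    mode = sys.argv[1] if len(sys.argv) > 1 else "listen"
    port = int(sys.argv[2]) if len(sys.argv) > 2 else PORT
    if mode == "send":
        send(port=port)
    else:
        listen(port=port)
```

Running "listen" on one compute node should show the packets gmond is sending from the other nodes. For the "send" side, passing a port other than 8649 (to both sender and listener) avoids feeding a junk datagram to any running gmond. If the listener stays silent even for this bare test, the problem is below Ganglia, in the kernel or the switch.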
With all the kernel upgrades going on out there, I'm wondering whether
anyone has had similar issues. Thanks in advance for any info!
Happy Holidays To All,
-Phil Forrest
334-844-6910
Auburn University Dept. of Physics
Network & Scientific Computing
207 Leach Science Center