Phil Forrest wrote:
Hello All,
Once upon a time, I had a happy ganglia monitor that was giving me
valuable data on all nodes of my 48 node cluster. Then I got a request
from a user to upgrade the kernel. After I upgraded the kernels across
the cluster, my ganglia could only see the data from the gmond running
on the head node (which also had gmetad and httpd running).
The cluster is running Red Hat 7.3 with kernel 2.4.9-34smp #1 SMP Sat
Jun 1 05:54:57 EDT 2002 i686 unknown
My cluster has 46 compute nodes with one interface (eth0) and two head
nodes with two interfaces (eth0 and eth1): one for the private LAN and
one for the campus network. The head node that runs gmetad has
"mcast_if eth1" set in its gmond.conf file. Here's the /sbin/ifconfig
slice for eth1 on the head node:
eth1 Link encap:Ethernet HWaddr 00:40:F4:2A:6E:26
inet addr:192.168.5.200 Bcast:192.168.5.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:176581970 errors:0 dropped:0 overruns:0 frame:0
TX packets:160905314 errors:0 dropped:0 overruns:0 carrier:0
collisions:0
RX bytes:1187468116 (1132.4 Mb) TX bytes:2350492219 (2241.6 Mb)
Can I trust the output of /sbin/ifconfig? (Meaning, if /sbin/ifconfig
shows the MULTICAST flag, is that the REAL truth, or can the kernel
still suppress multicast transmissions?)
The kernel's firewalling configuration can still filter out multicast
traffic. Check your firewall config (man iptables :) ). If your config is
very restrictive, at least poke a li'l hole for the multicast IP/port combo.
IIRC, the default iptables behavior changed a few point releases back in
Red Hat: the firewall is now enabled out of the box. That's apparently to
keep everyone who installs it on a desktop hanging off a cable modem from
getting owned...
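As a rough sketch, assuming ganglia's stock multicast channel of
239.2.11.71 on UDP port 8649 (check the mcast_channel/mcast_port lines
in your gmond.conf if you've changed them) and the default filter table:

  # Let the kernel join multicast groups (IGMP membership reports)
  /sbin/iptables -A INPUT -p igmp -j ACCEPT
  # Accept the ganglia multicast traffic itself
  /sbin/iptables -A INPUT -d 239.2.11.71 -p udp --dport 8649 -j ACCEPT

Run /sbin/iptables -L -n first to see what the installer actually
dropped in there.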
Also, gmetad cares not one whit about /etc/gmond.conf. I just did a
once-over on the code to make absolutely sure; there's no mention of it.
It's /etc/gmetad.conf that you should concern yourself with on the head
units if you're having display problems. Unless they're also supposed to
be part of the cluster, in which case you would configure the gmonds
separately.
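For the record, the line gmetad actually reads looks something like
this (hypothetical cluster name; syntax per the sample /etc/gmetad.conf):

  # data_source "clustername" [polling interval] host[:port] [host[:port]] ...
  data_source "My Cluster" 10 localhost:8649

gmetad polls each data_source host over TCP, which is why the routing
table remark below matters.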
Remember to open TCP port 8649 in the firewall on hosts running the
monitoring core (gmond), and TCP port 8651 on hosts running gmetad.
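Assuming a default-deny INPUT chain, something like this should do it:

  # gmond's XML report port -- gmetad connects here to poll
  /sbin/iptables -A INPUT -p tcp --dport 8649 -j ACCEPT
  # gmetad's XML port -- the web frontend and other gmetads connect here
  /sbin/iptables -A INPUT -p tcp --dport 8651 -j ACCEPT

Handy test: if gmond is happy, telnet localhost 8649 should spit back a
wad of XML at you.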
The metadaemon should be determining the path to establish its connections
via the good ol' fashioned kernel routing table, just like anything else.
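So it's worth a look there too. Assuming the usual tools (and noting
that the 224.0.0.0/4 route is the standard trick when a multi-homed box
sends multicast out the wrong interface):

  /sbin/route -n                 # dump the kernel routing table, numeric
  # If multicast is leaving via the wrong interface, pin it down:
  /sbin/route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0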
As a test, I've been running gmond on one node in deaf debug mode
(send-only) and on another node in mute debug mode (listen-only). The
deaf one is pumping out data successfully and the mute one is not
seeing anything. Since this is compute node to compute node, there can
only be one interface (eth0).
There has to be something in the kernel config that is screwing this up.
That sounds like it's a firewall config issue or a router/switch config
issue to me...
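One way to narrow it down: run tcpdump on the mute (listen-only) node
while the deaf one is transmitting. Assuming the stock 239.2.11.71
channel:

  # On the listening node -- do the multicast packets even arrive?
  /sbin/tcpdump -i eth0 host 239.2.11.71

If packets show up but gmond still sees nothing, suspect the local
firewall; if nothing shows up at all, suspect the switch (IGMP
snooping) or the sender's routing table.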
I'm wondering with all the kernel upgrades going on out there, maybe
someone has had similar issues? Thanks in advance for any info!
Red Hat 7.2 / 2.4.19smp on most of our nodes here; no reported problems
with the monitoring core on any of them.
Happy Holidays To All,
-Phil Forrest
Yeah, happy Life Day, kids. ;)
Hope this info proves useful...