gmond (in pre_process_node()) assumes that every node participating in
the multicast has a hostname that is resolvable on EVERY other node.

This is flawed because, in the common case, the master node of a cluster
is NOT a full-blown DNS server, and the slaves know nothing about the
external network that the master node is connected to.

But this is just a side effect of another problem: when gmond is started
on the master, which has 2 interfaces (eth0 - cluster network, eth1 -
external network), it uses the hostname of eth1 even though I start it with:

gmond --mcast_if eth0

Now, eth1 carries the default route on the master; is that perhaps why
gmond is using that hostname?
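
For context on how the default route could come into play: on Linux, a
UDP socket that sends multicast without an explicit outgoing interface
lets the routing table pick the interface, and with it the source address
that receivers see. A sender can pin this with the IP_MULTICAST_IF socket
option. The sketch below only illustrates that option; it is not gmond's
code, and whether --mcast_if is applied to the sending socket in this way
is an assumption that has not been checked. The helper name
open_mcast_sender is hypothetical.

```python
import socket


def open_mcast_sender(if_addr, ttl=1):
    """Open a UDP socket whose outgoing multicast leaves via if_addr.

    if_addr is the IPv4 address of a local interface, e.g. the
    cluster-side address 192.168.0.1 on the master.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Pin outgoing multicast to the interface that owns if_addr,
        # instead of letting the routing table choose.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(if_addr))
        # Keep the packets on the local segment by default.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    except OSError:
        # Bad address string, or address not configured on this host.
        sock.close()
        raise
    return sock
```

On the master this would be called as open_mcast_sender("192.168.0.1").
If the address is not configured on any local interface, setsockopt
fails instead of silently falling back to the default route.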

The workaround that I'm using right now is to start gmond on the master
muted, i.e.: gmond --mcast_if eth0 --mute

If I don't, all slave nodes get a ton of messages in /var/log/messages like:
Jan  2 02:32:36 fire30 /usr/sbin/gmond[1501]: gethostbyaddr error:
(remote_ip=192.168.254.141) A temporary error occured on an authoritative
name server. 

NOTE: eth0 on master is 192.168.0.1
      eth1 on master is 192.168.254.141
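
The error above comes from a reverse lookup (gethostbyaddr) on the
sender's IP failing on the slaves. One way a receiver could tolerate
this is to fall back to the dotted-quad address when no name is found.
The following is a minimal sketch of that idea, not what gmond does;
display_name and its resolver parameter are hypothetical names, and the
resolver is injectable only so the fallback can be exercised without DNS.

```python
import socket


def display_name(remote_ip, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS name for remote_ip, or remote_ip itself
    if the lookup fails."""
    try:
        hostname, _aliases, _addrs = resolver(remote_ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses,
        # so any resolver failure ends up here.
        return remote_ip
```

With this fallback, a slave that cannot resolve 192.168.254.141 would
simply list the master under that address instead of logging an error
for every packet.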
      
So why, if I start gmond with --mcast_if eth0 on the master, does it grab
the hostname from eth1?

Ideas?

Mike
