matt massie ([EMAIL PROTECTED]) said:

> mike-
> 
> if you want to look at the code.. ./lib/net.c is the file and mcast_connect() 
> and mcast_set_if() are the functions.  if the interface is set to NULL, i 
> set inaddr.s_addr = htonl ( INADDR_ANY ) otherwise i do the ioctl call.
> 
> you might want to look at the ifconfig -a output to see if eth0 is UP.. 
> also.. the INADDR_ANY means that the kernel decides which interface to 
> use.  what does your netstat -rn look like?  i haven't had this problem on 
> our cluster .. and it hasn't been reported by anyone else so i'm trying to 
> see if it's something unique to your configuration.

As it turns out, eth1 is configured via DHCP, and it becomes the default
route for the master node.  I determined this via netstat -rn.

So I can only imagine that INADDR_ANY causes the interface for the
default route to be chosen, as opposed to the first interface.

As a side note, regardless of which interface I have gmond multicast on,
the rrd-php client resolves the master node's hostname to
dyn180.plogic.internal.  It _should_ only resolve that way if the
interface gmond is using has dyn180's IP (eth1's IP), so it is pretty
odd that it resolves to dyn180 even when I specify eth0 (which would
resolve to norfolk1).

> we'll get this problem fixed i'm sure.  good luck.

Yep, it's very subtle.  I don't have time to dig into the code right now,
but I will be able to early next week.

As a side note, I found another problem: the gmonds on the other nodes in
the cluster do not expire their timeouts for all metric transmission
when a particular node restarts its gmond.  That is, the restarted
gmond doesn't _know_ about certain less frequent metric updates from other
nodes.  I have more details.  I'm not sure if this is related to the
default route interface problem I just ran into, but I'll get back to
you once I can devote more time to digging into these issues.

Mike

> Yesterday, Mike Snitzer wrote forth saying...
> 
> > matt massie ([EMAIL PROTECTED]) said:
> > 
> > > look at the output of "ifconfig -a".. the first interface listed will be 
> > > the interface that gmond communicates on.
> > 
> > Ok, well if that is the case; that too goes against what is happening on
> > this particular node; because ifconfig -a lists eth0 as the first
> > interface; yet gmond is multicasting on eth1 by default.
> > 
> > Mike
> > 
> > 
> > > Today, Mike Snitzer wrote forth saying...
> > > 
> > > > All,
> > > > 
> > > > While getting ganglia 2.2.1 going on a cluster I noticed gmond -h 
> > > > stated:
> > > > 
> > > >  -i, --mcast_if
> > > >            set the interface gmond is to multicast on
> > > >            default: first interface e.g. "eth0"
> > > > 
> > > > this however does not appear to be the case; as the multicast was going
> > > > out eth1.  So I was only seeing the master node in the php-rrd-client.
> > > > 
> > > > As soon as I used: gmond -i eth0  all the nodes in the cluster were
> > > > viewable through the php-rrd-client.
> > > > 
> > > > I've yet to get around to hacking the gmond source; but figured
> > > > I'd first mail the list to see if others have seen eth0 not being
> > > > used as the default multicast interface.
> > > > 
> > > > Thanks,
> > > > Mike
> > > > 
> > > > 
> > > > _______________________________________________
> > > > Ganglia-general mailing list
> > > > [email protected]
> > > > https://lists.sourceforge.net/lists/listinfo/ganglia-general
> > > > 
> > > 
> > 
> 
