Hi Matt,
Here's the udp_send_channel entry in my gmond.conf file:
udp_send_channel {
  mcast_join = 239.2.11.71
  mcast_if = en0
  port = 8649
  host = 192.168.12.200
}
It's the same for all 32 nodes, including the cluster's master node which
runs both gmond and gmetad. All cluster nodes communicate on the private
network using the 'en0' interface. The master node uses 'en1' to
communicate with the external network. The internal nodes also have an
'en1' interface but it is unused.
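
Since the master node has two interfaces, one way to see which local address the kernel would pick as the source for a given unicast destination is the rough Python sketch below. The function name is only illustrative, and it says nothing about how gmond itself chooses the interface for multicast traffic (that is governed by mcast_if).

```python
import socket

def source_address_for(dest_ip, port=8649):
    """Return the local IP the kernel would use as source when sending to dest_ip."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packet; it only makes the
        # kernel select a route and therefore a local source address.
        s.connect((dest_ip, port))
        return s.getsockname()[0]
    finally:
        s.close()

# Example call on a cluster node (destination taken from the config above):
#   source_address_for("192.168.12.200")
```

If the address returned on a node is not its private 'en0' address, packets to that destination are leaving through a different interface than expected.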
All 32 nodes have an /etc/hosts file with an identical mapping of private
IPs to hostnames, as well as the relevant external machines reachable through
a NAT server running on the master node. When I switch from Ganglia
version 3.0.0 back to version 2.5.6, I do see hostnames in the gmond
XML output, so I'm certain the hostnames are resolvable.
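
For reference, a reverse lookup of the private addresses can be checked directly on each node with a short Python sketch along these lines. The function name and the 'resolver' parameter are made up for illustration; by default it uses the system resolver via socket.gethostbyaddr, which is not necessarily the same code path gmond 3.0.0 uses.

```python
import socket

def reverse_lookup(ips, resolver=socket.gethostbyaddr):
    """Map each IP to the hostname the resolver returns, or None if it fails."""
    results = {}
    for ip in ips:
        try:
            # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
            results[ip] = resolver(ip)[0]
        except OSError:
            # socket.herror and socket.gaierror are both subclasses of OSError
            results[ip] = None
    return results

# Example call with two of the private addresses mentioned in this thread:
#   reverse_lookup(["192.168.12.200", "192.168.12.220"])
```

Any address that maps to None is one the local resolver cannot turn back into a name.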
I forgot to mention in my original posting that the master node never
appears in the list of hosts. The master node has the IP address
192.168.12.200 (the same as the host in the udp_send_channel entry above),
but I never see this host in the XML output. The master node appears to
be communicating over the mcast channel, because a 'telnet localhost 8649'
on it returns XML for the other nodes, but I never see metrics for the
master node itself.
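
To make the "which hosts are missing" check repeatable, the XML dump can be read and compared against the expected addresses with a sketch like the following (Python). The function names are illustrative, and the code only assumes that each node shows up as a HOST element with NAME and IP attributes, as in the sample output quoted below; it does not depend on the names of the enclosing elements.

```python
import socket
import xml.etree.ElementTree as ET

def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read everything gmond writes on its TCP port until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

def hosts_in_xml(xml_data):
    """Return (NAME, IP) for every HOST element, wherever it is nested."""
    root = ET.fromstring(xml_data)
    return [(h.get("NAME"), h.get("IP")) for h in root.iter("HOST")]

def missing_hosts(xml_data, expected_ips):
    """Return the expected IPs that have no HOST element in the dump."""
    seen = {ip for _, ip in hosts_in_xml(xml_data)}
    return sorted(set(expected_ips) - seen)

# Example call on the master node:
#   missing_hosts(fetch_gmond_xml(), ["192.168.12.200", "192.168.12.220"])
```

Run on the master node, this would be expected to report 192.168.12.200 as missing, given the behaviour described above.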
Thanks for your help,
Stephen
On Fri, 18 Feb 2005, Matt Massie wrote:
> can you please show your udp_send_channel entries? are you specifying
> the correct interface in them?
>
> the gmond that receives the message tries to resolve the source address
> of the incoming packet. the source address is dependent on which
> interface the data was sent on (an ifconfig -a will show you the local
> addresses associated with each interface).
>
> the problem is if your resolver can't resolve the ip address on a
> particular interface (e.g. it can resolve public ip addresses but there
> are no entries for private ip addresses).
>
> -matt
>
> Stephen Ficklin wrote:
> > I've installed Ganglia 3.0.0 on a 32 node cluster of Xserve G5s with
> > Darwin Kernel 7.8.0 on the master node and 7.7.1 on the interior nodes.
> > The metrics are much better with version 3.0.0 (which I'm very grateful
> > for), but I have a couple of problems I was hoping someone might have
> > some insight about.
> >
> > Problem 1: When running the configure script I get the following message
> > (gcc version is 3.3):
> >
> > checking net/if.h presence... yes
> > configure: WARNING: net/if.h: present but cannot be compiled
> > configure: WARNING: net/if.h: check for missing prerequisite headers?
> > configure: WARNING: net/if.h: proceeding with the preprocessor's result
> > configure: WARNING: ## ------------------------------------ ##
> > configure: WARNING: ## Report this to [EMAIL PROTECTED] ##
> > configure: WARNING: ## ------------------------------------ ##
> >
> > The error messages in config.log are these:
> > configure:19987: checking net/if.h usability
> > configure:20000: gcc -c -g -O2 conftest.c >&5
> > In file included from configure:20069:
> > /usr/include/net/if.h:186: error: field `ifru_addr' has incomplete type
> > /usr/include/net/if.h:187: error: field `ifru_dstaddr' has incomplete type
> > /usr/include/net/if.h:188: error: field `ifru_broadaddr' has incomplete type
> > /usr/include/net/if.h:219: error: field `ifra_addr' has incomplete type
> > /usr/include/net/if.h:220: error: field `ifra_broadaddr' has incomplete type
> > /usr/include/net/if.h:221: error: field `ifra_mask' has incomplete type
> > /usr/include/net/if.h:290: error: field `addr' has incomplete type
> > /usr/include/net/if.h:291: error: field `dstaddr' has incomplete type
> > configure:20003: $? = 1
> > configure: failed program was:
> > | #line 19989 "configure"
> > | /* confdefs.h. */
> >
> > ...
> >
> > I'm not sure if these errors have any bearing on the other problem I
> > have. Despite the configuration errors, I'm still able to compile and run
> > both gmond and gmetad.
> >
> >
> > --
> > Problem 2: The gmond daemons do not return the hostname, but rather an
> > IP address. Here's an example from the XML output (telnet localhost 8649):
> >
> > <HOST NAME="192.168.12.220" IP="192.168.12.220" REPORTED="1108749163"
> > TN="19" TMAX="20" DMAX="0" LOCATION="unspecified"
> > GMOND_STARTED="1108747603">
> >
> > If I switch back to version 2.5.6 then the hostnames appear fine. In any
> > event, the hostnames are resolvable through the /etc/hosts file and a DNS
> > server that serves the cluster.
> >
> >
> > Thanks for any help,
> > Stephen
> >
> >
>
> --
> PGP fingerprint 'A7C2 3C2F 8445 AD3C 135E F40B 242A 5984 ACBC 91D3'
>
> They that can give up essential liberty to obtain a little
> temporary safety deserve neither liberty nor safety.
> --Benjamin Franklin, Historical Review of Pennsylvania, 1759
>