stephen-

i think i know what your problem is. here is some pseudo-code for how gmond creates the udp_send_channels that you specify... i just reread the man page (man gmond.conf) and i don't think it's clear enough about this, so it needs to be updated...

if( mcast_join )
  {
    /* mcast_join is set: send on a multicast channel (host is not consulted) */
    socket = create_mcast_client( pool, mcast_join, port, ttl );
  }
else
  {
    /* no mcast_join: create a plain UDP socket to host */
    socket = create_udp_client( pool, host, port );
  }

since your udp_send_channel sets mcast_join, the first branch is taken and the host entry in your udp_send_channel is being completely ignored.
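
to make the effect concrete, here is a minimal C sketch of that branch. the struct and function names (send_channel_cfg, effective_destination) are made up for illustration and are not gmond's own types or API; only the if/else mirrors the pseudo-code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one udp_send_channel block; not a gmond type. */
struct send_channel_cfg {
    const char *mcast_join; /* NULL when the option is absent */
    const char *host;       /* NULL when the option is absent */
    int port;
};

/*
 * Return the address the channel would actually send to, following the
 * branch in the pseudo-code: mcast_join wins, host is only used without it.
 */
static const char *effective_destination(const struct send_channel_cfg *cfg)
{
    assert(cfg != NULL);
    if (cfg->mcast_join != NULL) {
        return cfg->mcast_join; /* host is never looked at on this path */
    }
    return cfg->host;
}
```

with a channel that sets both mcast_join and host, the function returns the multicast address, which is why a single channel with both options never sends unicast.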

if you want to send data to a specific host just create a second udp_send_channel entry... for example

/* start config example */
udp_send_channel {
  mcast_join = 239.2.11.71
  mcast_if   = en0
  port       = 8649
  bind       = 239.2.11.71 /* smart to do; see man page */
}

udp_send_channel {
  host = 192.168.12.200
  port = 8649
}
/* end config example */

having these two send channels will cause gmond to send metric data on the multicast channel and also as a unicast message to the host you specified (192.168.12.200).

let me know if this helps. btw, the code for setting up the udp_send_channel is in ./lib/libgmond.c at line 537, in the function Ganglia_udp_send_channels_create().

good luck!
-matt


Stephen Ficklin wrote:
Hi Matt,

Here's the udp_send_channel entry in my gmond.conf file:

udp_send_channel {
  mcast_join = 239.2.11.71
  mcast_if = en0
  port = 8649
  host = 192.168.12.200
}

It's the same for all 32 nodes, including the cluster's master node which
runs both gmond and gmetad.  All cluster nodes communicate on the private
network using the 'en0' interface. The master node uses 'en1' to
communicate with the external network.  The internal nodes also have an
'en1' interface but it is unused.

All 32 nodes have an /etc/hosts file with an identical mapping of private
IPs with hostnames as well as relevant external machines available through
a NAT server running on the master node.  When I switch from Ganglia
version 3.0.0 back to version 2.5.6 then I do see hostnames in the gmond
XML output, so I'm certain the hostnames are resolvable.

I forgot to mention in my original posting that the master node never
appears in the list of hosts.  The master node has an IP address
192.168.12.200 (the same as the host in the udp_send_channel entry above),
but I never see this host in the XML output.  The master node appears to
be communicating over the mcast channel, because I get output if I do a
'telnet localhost 8649', but I never see metrics for the master node itself.

Thanks for your help,
Stephen

On Fri, 18 Feb 2005, Matt Massie wrote:


can you please show your udp_send_channel entries?  are you specifying
the correct interface in them?

the gmond that receives the message tries to resolve the source address
of the incoming packet.  the source address is dependent on which
interface the data was sent on (an ifconfig -a will show you the local
addresses associated with each interface).

the problem arises if your resolver can't resolve the ip address on a
particular interface (e.g. it can resolve public ip addresses but there
are no entries for private ip addresses).
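
as a rough illustration of that lookup, here is a C sketch that reverse-resolves an address and falls back to the numeric form when there is no entry. the function name name_for_source is hypothetical and this is not gmond's code; it only uses the standard getaddrinfo()/getnameinfo() calls to show how an unresolvable source address ends up being reported as a bare ip.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Reverse-resolve an IPv4 address string into name_out.
 * Returns 1 if a hostname was found, 0 if we fell back to the numeric
 * address, and -1 if ip is not a valid IPv4 address.
 */
static int name_for_source(const char *ip, char *name_out, size_t name_len)
{
    struct addrinfo hints, *res = NULL;
    int rc;

    assert(ip != NULL && name_out != NULL && name_len > 0);

    /* AI_NUMERICHOST: only parse the address, do no forward lookup. */
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_DGRAM;
    hints.ai_flags = AI_NUMERICHOST;
    if (getaddrinfo(ip, NULL, &hints, &res) != 0) {
        return -1;
    }

    /* NI_NAMEREQD makes getnameinfo() fail instead of returning digits. */
    rc = getnameinfo(res->ai_addr, res->ai_addrlen,
                     name_out, (socklen_t)name_len, NULL, 0, NI_NAMEREQD);
    freeaddrinfo(res);
    if (rc == 0) {
        return 1;
    }

    /* No reverse entry: report the address itself. */
    strncpy(name_out, ip, name_len - 1);
    name_out[name_len - 1] = '\0';
    return 0;
}
```

running this against one of the private addresses on the receiving node is a quick way to check whether the resolver there can map it back to a name.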

-matt

Stephen Ficklin wrote:

I've installed Ganglia 3.0.0 on a 32 node cluster of XServe G5's with
Darwin Kernel 7.8.0 on master node and 7.7.1 on interior nodes.  The
metrics are much better with version 3.0.0 (which I'm very grateful for)
but I have a couple problems I was hoping someone might have some insight
about.

Problem 1:  When running the configure script I get the following message
(gcc version is 3.3):

checking net/if.h presence... yes
configure: WARNING: net/if.h: present but cannot be compiled
configure: WARNING: net/if.h: check for missing prerequisite headers?
configure: WARNING: net/if.h: proceeding with the preprocessor's result
configure: WARNING:  ## ------------------------------------ ##
configure: WARNING:  ## Report this to [EMAIL PROTECTED] ##
configure: WARNING:  ## ------------------------------------ ##

The error messages in config.log are these:
configure:19987: checking net/if.h usability
configure:20000: gcc -c -g -O2 conftest.c >&5
In file included from configure:20069:
/usr/include/net/if.h:186: error: field `ifru_addr' has incomplete type
/usr/include/net/if.h:187: error: field `ifru_dstaddr' has incomplete type
/usr/include/net/if.h:188: error: field `ifru_broadaddr' has incomplete type
/usr/include/net/if.h:219: error: field `ifra_addr' has incomplete type
/usr/include/net/if.h:220: error: field `ifra_broadaddr' has incomplete type
/usr/include/net/if.h:221: error: field `ifra_mask' has incomplete type
/usr/include/net/if.h:290: error: field `addr' has incomplete type
/usr/include/net/if.h:291: error: field `dstaddr' has incomplete type
configure:20003: $? = 1
configure: failed program was:
| #line 19989 "configure"
| /* confdefs.h.  */

...

I'm not sure if these errors have any bearing on the other problem I
have.  Despite the configuration errors, I'm still able to compile and run
both gmond and gmetad.


--
Problem 2:  The gmond daemons do not return the hostname, but rather an
IP address. Here's an example from the XML output (telnet localhost 8649):

<HOST NAME="192.168.12.220" IP="192.168.12.220" REPORTED="1108749163"
TN="19" TMAX="20" DMAX="0" LOCATION="unspecified" GMOND_STARTED="1108747603">

If I switch back to version 2.5.6 then the hostnames appear fine. In any
event, the hostnames are resolvable through the /etc/hosts file and a DNS
server that serves the cluster.


Thanks for any help,
Stephen


-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general

--
PGP fingerprint 'A7C2 3C2F 8445 AD3C 135E F40B 242A 5984 ACBC 91D3'

   They that can give up essential liberty to obtain a little
      temporary safety deserve neither liberty nor safety.
  --Benjamin Franklin, Historical Review of Pennsylvania, 1759




