Hello,

This behaviour is reported from time to time with unicast :)

Set:

send_metadata_interval = 600

(600 seconds is just an example value) in the gmond.conf on your nodes.
The metrics should come back after a while.

Louis

2010/11/17 Auld, Russell G CSC <[email protected]>:
> I'm running ganglia 3.1.7 on some RHEL computers.
> I have four separate clusters configured, with each one running in
> unicast mode. Each cluster uses a different port number in their
> gmond.conf files.
>
> Here's one example:
>
> udp_send_channel {
>   #bind_hostname = yes # Highly recommended, soon to be default.
>                        # This option tells gmond to use a source address
>                        # that resolves to the machine's hostname. Without
>                        # this, the metrics may appear to come from any
>                        # interface and the DNS names associated with
>                        # those IPs will be used to create the RRDs.
>   host = 192.168.115.100 # the gmond "collector" for this cluster
>   port = 8655
>   ttl = 1
> }
>
> /* You can specify as many udp_recv_channels as you like as well. */
> udp_recv_channel {
>   port = 8655
> }
>
> /* You can specify as many tcp_accept_channels as you like to share
>    an xml description of the state of the cluster */
> tcp_accept_channel {
>   port = 8655
> }
>
> The above configuration is installed on each node in the cluster,
> including the "collector" node. The collector node is identified in the
> gmetad.conf file as a data source.
>
> The problem I'm having is that if the "collector" node's gmond is
> restarted for whatever reason, no metrics are reported anymore for the
> cluster. The front-end still shows the correct number of hosts, and they
> all appear "up", but there just isn't any data flowing. If I restart
> gmond on all the nodes in the cluster, things will all work again.
> Is this a bug? Or is there something wrong with the configuration above?
>
> If I telnet to one of the nodes in the cluster using the specified port,
> I get output, but there's no data in it as shown below.
> If I use the web
> page to show the host report for the node, it reports that it's up, and
> that it last reported 15 seconds ago (or less), but there are no metrics
> shown on the page.
>
>
> [h...@derp] ~ 133> telnet 192.168.115.164 8655
> Trying 192.168.115.164...
> Connected to 192.168.115.164.
> Escape character is '^]'.
> <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
> <!DOCTYPE GANGLIA_XML [
> <!ELEMENT GANGLIA_XML (GRID|CLUSTER|HOST)*>
> <!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED>
> <!ATTLIST GANGLIA_XML SOURCE CDATA #REQUIRED>
> <!ELEMENT GRID (CLUSTER | GRID | HOSTS | METRICS)*>
> <!ATTLIST GRID NAME CDATA #REQUIRED>
> <!ATTLIST GRID AUTHORITY CDATA #REQUIRED>
> <!ATTLIST GRID LOCALTIME CDATA #IMPLIED>
> <!ELEMENT CLUSTER (HOST | HOSTS | METRICS)*>
> <!ATTLIST CLUSTER NAME CDATA #REQUIRED>
> <!ATTLIST CLUSTER OWNER CDATA #IMPLIED>
> <!ATTLIST CLUSTER LATLONG CDATA #IMPLIED>
> <!ATTLIST CLUSTER URL CDATA #IMPLIED>
> <!ATTLIST CLUSTER LOCALTIME CDATA #REQUIRED>
> <!ELEMENT HOST (METRIC)*>
> <!ATTLIST HOST NAME CDATA #REQUIRED>
> <!ATTLIST HOST IP CDATA #REQUIRED>
> <!ATTLIST HOST LOCATION CDATA #IMPLIED>
> <!ATTLIST HOST REPORTED CDATA #REQUIRED>
> <!ATTLIST HOST TN CDATA #IMPLIED>
> <!ATTLIST HOST TMAX CDATA #IMPLIED>
> <!ATTLIST HOST DMAX CDATA #IMPLIED>
> <!ATTLIST HOST GMOND_STARTED CDATA #IMPLIED>
> <!ELEMENT METRIC (EXTRA_DATA*)>
> <!ATTLIST METRIC NAME CDATA #REQUIRED>
> <!ATTLIST METRIC VAL CDATA #REQUIRED>
> <!ATTLIST METRIC TYPE (string | int8 | uint8 | int16 | uint16 |
>   int32 | uint32 | float | double | timestamp) #REQUIRED>
> <!ATTLIST METRIC UNITS CDATA #IMPLIED>
> <!ATTLIST METRIC TN CDATA #IMPLIED>
> <!ATTLIST METRIC TMAX CDATA #IMPLIED>
> <!ATTLIST METRIC DMAX CDATA #IMPLIED>
> <!ATTLIST METRIC SLOPE (zero | positive | negative | both |
>   unspecified) #IMPLIED>
> <!ATTLIST METRIC SOURCE (gmond) 'gmond'>
> <!ELEMENT EXTRA_DATA (EXTRA_ELEMENT*)>
> <!ELEMENT EXTRA_ELEMENT EMPTY>
> <!ATTLIST EXTRA_ELEMENT NAME CDATA #REQUIRED>
> <!ATTLIST EXTRA_ELEMENT VAL CDATA #REQUIRED>
> <!ELEMENT HOSTS EMPTY>
> <!ATTLIST HOSTS UP CDATA #REQUIRED>
> <!ATTLIST HOSTS DOWN CDATA #REQUIRED>
> <!ATTLIST HOSTS SOURCE (gmond | gmetad) #REQUIRED>
> <!ELEMENT METRICS (EXTRA_DATA*)>
> <!ATTLIST METRICS NAME CDATA #REQUIRED>
> <!ATTLIST METRICS SUM CDATA #REQUIRED>
> <!ATTLIST METRICS NUM CDATA #REQUIRED>
> <!ATTLIST METRICS TYPE (string | int8 | uint8 | int16 | uint16 |
>   int32 | uint32 | float | double | timestamp) #REQUIRED>
> <!ATTLIST METRICS UNITS CDATA #IMPLIED>
> <!ATTLIST METRICS SLOPE (zero | positive | negative | both |
>   unspecified) #IMPLIED>
> <!ATTLIST METRICS SOURCE (gmond) 'gmond'>
> ]>
> <GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
> <CLUSTER NAME="DERP" LOCALTIME="1290018259" OWNER="HERP"
>   LATLONG="unspecified" URL="unspecified">
> </CLUSTER>
> </GANGLIA_XML>
> Connection closed by foreign host.
>
>
>
> ------------------------------------------------------------------------------
> Beautiful is writing same markup. Internet Explorer 9 supports
> standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3.
> Spend less time writing and rewriting code and more time creating great
> experiences on the web. Be a part of the beta today
> http://p.sf.net/sfu/msIE9-sfdev2dev
> _______________________________________________
> Ganglia-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ganglia-general
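
For reference, the send_metadata_interval setting suggested at the top of
this message goes in the globals section of gmond.conf. A minimal sketch,
assuming the stock 3.1.x file layout (the other globals entries are left
out here and should stay as they are in your file, and 600 is only an
example value):

    /* gmond.conf on each sending node */
    globals {
      /* Re-send metric metadata every 600 seconds. As far as I
         understand it, with the default of 0 the metadata is only
         sent when gmond starts, so in unicast mode a restarted
         collector does not get the metric definitions again. */
      send_metadata_interval = 600 /* secs; example value */
    }

gmond reads its configuration at startup, so each node needs one restart
after the change for the setting to take effect.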

