I'm running ganglia 3.1.7 on some RHEL computers.
I have four separate clusters configured, with each one running in
unicast mode. Each cluster uses a different port number in their
gmond.conf files.
Here's one example:
udp_send_channel {
#bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname.
Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
host = 192.168.115.100 # the gmond "collector" for this cluster
port = 8655
ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
port = 8655
}
/* You can specify as many tcp_accept_channels as you like to share
an xml description of the state of the cluster */
tcp_accept_channel {
port = 8655
}
The above configuration is installed on each node in the cluster,
including the "collector" node. The collector node is identified in the
gmetad.conf file as a data source.
The problem I'm having is that if the "collector" node's gmond is
restarted for whatever reason, no metrics are reported anymore for the
cluster. The front-end still shows the correct number of hosts, and they
all appear "up", but there just isn't any data flowing. If I restart
gmond on all the nodes in the cluster, things will all work again.
Is this a bug? Or is there something wrong with the configuration above?
If I telnet to one of the nodes in the cluster using the specified port,
I get output, but there's no data in it as shown below. If I use the web
page to show the host report for the node, it reports that it's up, and
that it last reported 15 seconds ago (or less), but there are no metrics
shown on the page.
[h...@derp] ~ 133> telnet 192.168.115.164 8655
Trying 192.168.115.164...
Connected to 192.168.115.164.
Escape character is '^]'.
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!DOCTYPE GANGLIA_XML [
<!ELEMENT GANGLIA_XML (GRID|CLUSTER|HOST)*>
<!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED>
<!ATTLIST GANGLIA_XML SOURCE CDATA #REQUIRED>
<!ELEMENT GRID (CLUSTER | GRID | HOSTS | METRICS)*>
<!ATTLIST GRID NAME CDATA #REQUIRED>
<!ATTLIST GRID AUTHORITY CDATA #REQUIRED>
<!ATTLIST GRID LOCALTIME CDATA #IMPLIED>
<!ELEMENT CLUSTER (HOST | HOSTS | METRICS)*>
<!ATTLIST CLUSTER NAME CDATA #REQUIRED>
<!ATTLIST CLUSTER OWNER CDATA #IMPLIED>
<!ATTLIST CLUSTER LATLONG CDATA #IMPLIED>
<!ATTLIST CLUSTER URL CDATA #IMPLIED>
<!ATTLIST CLUSTER LOCALTIME CDATA #REQUIRED>
<!ELEMENT HOST (METRIC)*>
<!ATTLIST HOST NAME CDATA #REQUIRED>
<!ATTLIST HOST IP CDATA #REQUIRED>
<!ATTLIST HOST LOCATION CDATA #IMPLIED>
<!ATTLIST HOST REPORTED CDATA #REQUIRED>
<!ATTLIST HOST TN CDATA #IMPLIED>
<!ATTLIST HOST TMAX CDATA #IMPLIED>
<!ATTLIST HOST DMAX CDATA #IMPLIED>
<!ATTLIST HOST GMOND_STARTED CDATA #IMPLIED>
<!ELEMENT METRIC (EXTRA_DATA*)>
<!ATTLIST METRIC NAME CDATA #REQUIRED>
<!ATTLIST METRIC VAL CDATA #REQUIRED>
<!ATTLIST METRIC TYPE (string | int8 | uint8 | int16 | uint16 |
int32 | uint32 | float | double | timestamp) #REQUIRED>
<!ATTLIST METRIC UNITS CDATA #IMPLIED>
<!ATTLIST METRIC TN CDATA #IMPLIED>
<!ATTLIST METRIC TMAX CDATA #IMPLIED>
<!ATTLIST METRIC DMAX CDATA #IMPLIED>
<!ATTLIST METRIC SLOPE (zero | positive | negative | both |
unspecified) #IMPLIED>
<!ATTLIST METRIC SOURCE (gmond) 'gmond'>
<!ELEMENT EXTRA_DATA (EXTRA_ELEMENT*)>
<!ELEMENT EXTRA_ELEMENT EMPTY>
<!ATTLIST EXTRA_ELEMENT NAME CDATA #REQUIRED>
<!ATTLIST EXTRA_ELEMENT VAL CDATA #REQUIRED>
<!ELEMENT HOSTS EMPTY>
<!ATTLIST HOSTS UP CDATA #REQUIRED>
<!ATTLIST HOSTS DOWN CDATA #REQUIRED>
<!ATTLIST HOSTS SOURCE (gmond | gmetad) #REQUIRED>
<!ELEMENT METRICS (EXTRA_DATA*)>
<!ATTLIST METRICS NAME CDATA #REQUIRED>
<!ATTLIST METRICS SUM CDATA #REQUIRED>
<!ATTLIST METRICS NUM CDATA #REQUIRED>
<!ATTLIST METRICS TYPE (string | int8 | uint8 | int16 | uint16 |
int32 | uint32 | float | double | timestamp) #REQUIRED>
<!ATTLIST METRICS UNITS CDATA #IMPLIED>
<!ATTLIST METRICS SLOPE (zero | positive | negative | both |
unspecified) #IMPLIED>
<!ATTLIST METRICS SOURCE (gmond) 'gmond'>
]>
<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
<CLUSTER NAME="DERP" LOCALTIME="1290018259" OWNER="HERP"
LATLONG="unspecified" URL="unspecified">
</CLUSTER>
</GANGLIA_XML>
Connection closed by foreign host.
------------------------------------------------------------------------------
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3.
Spend less time writing and rewriting code and more time creating great
experiences on the web. Be a part of the beta today
http://p.sf.net/sfu/msIE9-sfdev2dev
_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general