I'm running Ganglia 3.1.0, and periodically gmond seems to stop
collecting data. I can see the incoming traffic on the server, and the
web pages show the hosts as up, but no data is logged.
I first saw this on 3.0.5, so I upgraded, but the
latest version seems to have made the problem worse. Sometimes the graphs
start showing data again on their own, but usually I have to restart gmond on
the clients.
The clients and the collector are both running 3.1.0. I tried switching to
multicast to see if anything changed, but the results were the same.
Dumping the RRD files shows values of NaN in these gaps. Both gmond
and gmetad are running at the time. Querying gmetad on the XML port
shows data for all of the hosts. Any ideas?
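
To pin down when the gaps begin and end, the NaN rows in a dump can be located with a short script. This is only a sketch: it assumes the usual `rrdtool dump` layout, where each row is preceded by a comment ending in "/ <epoch>", and the function name nan_gaps is made up for illustration. It does not distinguish between RRAs, so it is best fed the rows of one archive at a time.

```python
import re

# One row of `rrdtool dump` output: a comment ending in "/ <epoch>",
# followed by the <row> holding that step's values (assumed layout).
ROW_RE = re.compile(r"<!--.*?/\s*(\d+)\s*-->\s*<row>(.*?)</row>")
VALUE_RE = re.compile(r"<v>\s*(.*?)\s*</v>")


def nan_gaps(dump_text):
    """Return (first_ts, last_ts) pairs for runs of rows whose values are all NaN."""
    gaps = []
    start = prev = None
    for ts, row in ROW_RE.findall(dump_text):
        ts = int(ts)
        values = VALUE_RE.findall(row)
        if values and all(v.lower() == "nan" for v in values):
            if start is None:
                start = ts  # a new gap opens at this row
            prev = ts
        elif start is not None:
            gaps.append((start, prev))  # the gap closed on the previous row
            start = None
    if start is not None:
        gaps.append((start, prev))  # the dump ended inside a gap
    return gaps
```

Comparing the first timestamp of each gap with the time the collector or a client was last restarted may show whether the gaps line up with restarts.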
This is a snippet of my collector gmond.conf:
globals {
daemonize = yes
setuid = yes
user = nobody
debug_level = 0
max_udp_msg_len = 1472
mute = no
deaf = no
host_dmax = 3600 /*secs */
cleanup_threshold = 300 /*secs */
gexec = no
send_metadata_interval = 0
}
cluster {
name = "Production"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
host {
location = "XXXX"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
host = XXXX
port = 3003
ttl = 1
}
udp_recv_channel {
port = 3003
acl {
default="deny"
access {
ip=10.50.10.0
mask=24
action="allow"
}
access {
ip=172.17.6.0
mask=24
action="allow"
}
access {
ip=172.17.8.0
mask=24
action="allow"
}
}
}
tcp_accept_channel {
port = 3003
acl {
default="deny"
access {
ip=10.50.10.0
mask=24
action="allow"
}
access {
ip=172.17.6.0
mask=24
action="allow"
}
access {
ip=172.17.8.0
mask=24
action="allow"
}
}
}
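
As a sanity check that the ACLs are not what drops the data, the allow rules above can be modelled in a few lines. This is a simplified sketch, not gmond's actual matching code: it assumes that an address matching any access block is allowed and that everything else falls through to the default. The name acl_action is hypothetical.

```python
import ipaddress

# The three networks allowed by the udp_recv_channel and
# tcp_accept_channel ACLs in the collector config above.
ALLOWED = [
    ipaddress.ip_network(net)
    for net in ("10.50.10.0/24", "172.17.6.0/24", "172.17.8.0/24")
]


def acl_action(ip, allowed=ALLOWED, default="deny"):
    """Return "allow" if ip falls inside any allowed network, else the default."""
    addr = ipaddress.ip_address(ip)
    return "allow" if any(addr in net for net in allowed) else default
```

Under that model, both 172.17.6.126 and 172.17.6.127 are allowed, so the ACLs alone would not explain why only some hosts stop graphing.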
And the client gmond.conf:
globals {
daemonize = yes
setuid = yes
user = nobody
debug_level = 0
max_udp_msg_len = 1472
mute = no
deaf = no
host_dmax = 3600 /*secs */
cleanup_threshold = 300 /*secs */
gexec = no
send_metadata_interval = 0
}
cluster {
name = "Production"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
host {
location = "XXXX"
}
udp_send_channel {
host = XXXX
port = 3003
ttl = 1
}
Here's a snippet of the hosts as reported by gmetad; all of the metrics
show data:
<HOST NAME="GOODHOST1" IP="172.17.6.126"
REPORTED="1219182932" TN="18" TMAX="20" DMAX="3600" LOCATION="Lisle"
GMOND_STARTED="1219182082">
<HOST NAME="BADHOST1" IP="172.17.6.127"
REPORTED="1219182936" TN="14" TMAX="20" DMAX="3600" LOCATION="Lisle"
GMOND_STARTED="1219179765">
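
For comparing hosts side by side, the HOST attributes can be pulled out of the full XML with a few lines of Python. This is a rough sketch that assumes the complete, well-formed document from the XML port has already been saved into a string; the function name host_summary is made up for illustration.

```python
import xml.etree.ElementTree as ET


def host_summary(xml_text):
    """Summarise each HOST element: name, heartbeat age, and gmond run time."""
    rows = []
    for host in ET.fromstring(xml_text).iter("HOST"):
        reported = int(host.get("REPORTED"))
        started = int(host.get("GMOND_STARTED"))
        rows.append({
            "name": host.get("NAME"),
            "tn": int(host.get("TN")),      # seconds since the host last reported
            "tmax": int(host.get("TMAX")),  # expected maximum between reports
            "running_for": reported - started,  # gmond run time at last report
            "metrics": len(host.findall("METRIC")),
        })
    return rows
```

With the values above, GOODHOST1's gmond had been running for 850 seconds and BADHOST1's for 3171 seconds at the time of the report, and both have TN below TMAX.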