james et al-

i *think* i've made gmond more bullet-proof against problems caused when a
client prematurely closes its connection to the XML port.  this will
hopefully fix the problem where running gstat can crash gmond.  let me know
otherwise.

see the latest CVS
-matt

Aug 7, James Braid wrote forth saying...

> > that is the same problem.  heartbeat messages are sent every 15 seconds,
> > so if a machine doesn't get a heartbeat message in 60 seconds (4 missed
> > heartbeats) it assumes it is down.  if you use the latest CVS source you
> > should see the problem no longer is there.  let me know otherwise.
> 
> Just checked out the latest CVS (at 13:06 NZST). It looks like that
> problem is fixed. The REPORTED value looks like it's incrementing as it
> should....sweet!
> 
> BUT
> 
> Gstat doesn't like talking to gmond (this happens whether I try to
> connect using gstat on the local machine or remotely).
> 
> The following is from gmond running with debug = 6
> 
> <snip>
> 4 pre_process_node() remote_ip=10.0.1.130
> pre_process_node() HOSTNAME =tycho.peace.co.nz
> pre_process_node() TIMESTAMP=1028682333
> pre_process_node() HASHP    =100128bc0
> pre_process_node() USER_HASHP=1001290e0
> pre_process_node() returning the ganglia internal hash pointer 100128bc0
> mcast_listen_thread() got internal hash 100128bc0
> mcast_listen_thread() built metricdata struct
> mcast_listen_thread() attempting to hash_insert_data
> mcast_listen_thread() inserted data into 100128bc0
> server_thread() 6 clientfd = 11
> 
> sent data to host 127.0.0.1
> Broken Pipe
> </snip>
> 
> However, telnetting to the port works fine (and I can do this as much as
> I like and gmond stays alive, just running gstat kills it straight
> away):
> 
> lilo:~# telnet tycho 8649
> Trying 10.0.1.130...
> Connected to tycho.peace.co.nz.
> Escape character is '^]'.
> <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
> <!DOCTYPE GANGLIA_XML [
>    <!ELEMENT GANGLIA_XML (CLUSTER)+>
>    <!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED
>                          SOURCE  CDATA #REQUIRED>
> 
> Etc, etc, etc (all the stats are there)...
> 
> Any ideas about this one?
> 
> Thanks for the prompt and helpful answers
> 
> James
> 
> 