the problem is in the process_collection_group() function near line 1600 of ./gmond/gmond.c. when there is a communication error, it can return zero, which causes gmond to keep listening for data forever but never send any. the fix is to change the return statement in process_collection_group(), around line 1640 of ./gmond/gmond.c, to look like the following:

  return next < now ? now + 1 * APR_USEC_PER_SEC : next;

this ensures that the next collection event will always occur: if next is in the past (as it is when zero comes back), the next collection is scheduled one second from now instead. sorry for any inconvenience this bug may have caused.
please let us know if this fixes the problem for you. in my tests it has been an effective workaround.
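to make the guard in that return statement easier to try out without building gmond, here is a minimal standalone sketch. the type usec_time_t, the macro USEC_PER_SEC and the helper clamp_next_collection() are stand-ins invented for illustration (for APR's apr_time_t and APR_USEC_PER_SEC); none of them exist in gmond.c, and the real function does much more than this.

```c
#include <stdint.h>

/* Assumption: stand-in for APR's apr_time_t, a 64-bit count of microseconds. */
typedef int64_t usec_time_t;

/* Assumption: stand-in for APR_USEC_PER_SEC (microseconds per second). */
#define USEC_PER_SEC ((usec_time_t)1000000)

/*
 * Hypothetical helper, not part of gmond.c. It isolates the expression
 * used in the patched return statement: if the computed next collection
 * time lies in the past (for example, zero after a communication error),
 * schedule the next collection one second from now; otherwise keep it.
 */
static usec_time_t clamp_next_collection(usec_time_t now, usec_time_t next)
{
    return next < now ? now + 1 * USEC_PER_SEC : next;
}
```

with now at 5000000 microseconds, a next value of 0 becomes 6000000, while a next value of 9000000 is passed through unchanged, so the caller never gets back a time that has already gone by.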
-matt

Rainer Schwierz wrote:
Hello all,

I have upgraded to
  ganglia-web-3.0.0-1
  ganglia-gmetad-3.0.0-1
  ganglia-gmond-3.0.0-1
running Scientific Linux SL303 on all nodes. I prefer to use unicast for gmond communication.

First all is working well, but after some hours some nodes disappear. After one week I only see the two special nodes, which are contacted by gmetad from the webserver. A telnet to the port on these two hosts shows that the metric info from the other nodes disappears. A restart of gmond on the nodes solves the problem again for some hours.

Does someone see a similar problem or any idea to solve it before I post my detailed configuration ...

Rainer

| Rainer Schwierz, Inst. f. Kern- und Teilchenphysik |
| TU Dresden, D-01062 Dresden                        |
| http://iktp.tu-dresden.de/~schwierz/               |

_______________________________________________
Ganglia-general mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
--
PGP fingerprint 'A7C2 3C2F 8445 AD3C 135E F40B 242A 5984 ACBC 91D3'
They that can give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety.
--Benjamin Franklin, Historical Review of Pennsylvania, 1759
