Hi Chris:

On Thu, Jan 6, 2011 at 9:04 AM, Chris Hunter <[email protected]> wrote:
> We are using gmetad 3.1.7. We see occasional XML parse errors from
> clusters gmond 3.0.7 & gmond 3.1.2. We've seen the behaviour with &
> without rrdcached running.

Just to clarify, you have 2 clusters, one running 3.0.7 and one running 3.1.2, right? And they are not under the same data_source?

> The XML Parse errors seem to hang up gmetad. It stops collecting from
> the clusters & hangs the webUI. gmetad daemon has to be restarted for
> the web frontend to function. Oddly, gmetad keeps collecting from other
> clusters with gmond 3.0.2 & 3.0.7.
>
> My guess that it's load related. We run ganglia gmetad + web in a
> virtual machine. The VM has ~1sec pauses when the physical server writes
> out VM image file to physical disk.

How many hosts are you monitoring? And have you tried setting up gmetad on a physical server to see if you encounter the same issue (no need to set up the web frontend)?

> Could also be the VM clock gets skewed relative to the cluster gmond
> servers.

Are all your servers (gmetad + gmonds) time-synced using NTP?

Cheers,

Bernard

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general

