Hi Chris:

On Thu, Jan 6, 2011 at 9:04 AM, Chris Hunter <[email protected]> wrote:

> We are using gmetad 3.1.7. We see occasional XML parse errors from
> clusters running gmond 3.0.7 & gmond 3.1.2. We've seen the behaviour
> with & without rrdcached running.

Just to clarify: you have two clusters, one running 3.0.7 and one running
3.1.2, right?  And they are not under the same data_source?

> The XML Parse errors seem to hang up gmetad. It stops collecting from
> the clusters & hangs the webUI. gmetad daemon has to be restarted for
> the web frontend to function. Oddly, gmetad keeps collecting from other
> clusters with gmond 3.0.2 & 3.0.7.
>
> My guess is that it's load related. We run ganglia gmetad + web in a
> virtual machine. The VM has ~1sec pauses when the physical server writes
> out the VM image file to physical disk.

How many hosts are you monitoring?  And have you tried setting up
gmetad on a physical server to see if you encounter the same issue (no
need to set up the web frontend)?

> Could also be the VM clock gets skewed relative to the cluster gmond
> servers.

Are all your servers (gmetad + gmonds) time-synced using NTP?
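
In the meantime, it may help to check whether the XML coming out of the
affected gmonds is itself malformed, independent of gmetad. Below is a
rough sketch, not a definitive tool: it assumes gmond is listening on its
default TCP port 8649 and dumps its XML when a client connects. The
function names fetch_gmond_xml and check_xml are made up for illustration,
and the hostname is whatever you pass on the command line.

```python
import socket
import sys
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=10.0):
    """Read the whole XML dump gmond sends on its TCP channel.

    Assumes the default tcp_accept_channel port (8649); adjust if your
    gmond.conf uses a different one.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # peer closed the connection: dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def check_xml(payload):
    """Return (True, host_count) if payload parses, else (False, error text)."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError as exc:
        # The message includes the line and column where parsing failed.
        return False, str(exc)
    return True, sum(1 for _ in root.iter("HOST"))


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]
    ok, detail = check_xml(fetch_gmond_xml(target))
    print(target, "OK, hosts:" if ok else "PARSE ERROR:", detail)
```

Running this in a loop against one gmond from each affected cluster, from
both the VM and a physical box, should show whether the parse errors
follow the gmond or the machine doing the polling.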

Cheers,

Bernard

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
