Re: [Ganglia-general] Gaps in data

2013-04-23 Thread Ramon Bastiaans
My guess is that it is caused by an erroneous (custom) gmetric causing an XML error. So far I have been unable to find what is causing this or how to reproduce this. It only seems to be triggered occasionally under certain conditions that are not clear to me. Dumping the XML at the time of th
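If an XML dump from the time of a gap is available, a well-formedness check can show where the parser gives up. The following is only a minimal sketch: `find_xml_error` is a hypothetical helper name, it uses Python's standard XML parser rather than anything gmetad does internally, and it assumes the dump has already been read into a string.

```python
import xml.etree.ElementTree as ET


def find_xml_error(xml_text):
    """Return None if xml_text is well-formed XML.

    Otherwise return (line, column, message) describing the first
    parse error, which is a starting point for locating a bad metric.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # position of the first error
        return (line, column, str(err))
    return None


if __name__ == "__main__":
    # Illustrative sample only: an unescaped '&' in an attribute value.
    sample = '<GANGLIA_XML>\n<METRIC NAME="x" VAL="a&b"/>\n</GANGLIA_XML>'
    print(find_xml_error(sample))
```

The reported line number points into the dump, so the offending metric can be read off directly from that line.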

Re: [Ganglia-general] Gaps in data

2013-04-23 Thread Vladimir Vuksan
Based on your graphs, this happens randomly? It would be interesting to see whether you are unable to connect to gmetad during those times. Stracing gmetad and doing netstat -an | grep 865 may be helpful. BTW there is a gmetad health checker someone wrote which may alert you to this situation early. htt
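The connectivity check suggested above could be scripted and run periodically while waiting for the next gap. This is a rough sketch, not the health checker mentioned in the message: `can_connect` is a hypothetical helper, and the host and port in the usage example are assumptions (8651 is gmetad's default `xml_port`; adjust if your gmetad.conf differs).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed values: gmetad running locally on its default XML port.
    host, port = "localhost", 8651
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"gmetad at {host}:{port} is {state}")
```

Note that a successful connect only shows the port is accepting connections; a hung gmetad may still accept and then never send data, so comparing this against the strace output would still be worthwhile.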

Re: [Ganglia-general] Gaps in data

2013-04-23 Thread Ramon Bastiaans
We detect it when the website stops responding (as described on the ganglia-developers list). It is then 'fixed' by simply restarting gmetad.