Thanks for all the help...
Ramon and Bernard, your hints led me to check the XML stream, and I
noticed that one machine was still sending the custom gmetric (I had
thought I had cleared them all out). It was running as a cron job
every 10 minutes, and somehow that cron job was corrupting the XML
tree for the other nodes.
Weird. Anyway, it's back up and working now, so it's back to
figuring out how to get that custom gmetric to work properly...
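For anyone who hits the same thing, here is a rough sketch of the kind of
check that found the stray sender: parse a saved copy of the gmond XML dump
and list the hosts that still report a given metric. The element and
attribute names (HOST, METRIC, NAME) are my assumption about the usual gmond
XML layout, and SAMPLE and "my_metric" are made-up placeholders, not data
from this cluster.

```python
import xml.etree.ElementTree as ET


def hosts_reporting(xml_text, metric_name):
    """Return the sorted names of hosts whose XML contains the named metric."""
    # ET.fromstring raises ParseError if the XML is malformed, which is
    # itself a useful signal when the tree has been corrupted.
    root = ET.fromstring(xml_text)
    hosts = []
    for host in root.iter("HOST"):
        names = {metric.get("NAME") for metric in host.iter("METRIC")}
        if metric_name in names:
            hosts.append(host.get("NAME"))
    return sorted(hosts)


# Hypothetical sample standing in for a real gmond dump.
SAMPLE = """<GANGLIA_XML VERSION="3.0" SOURCE="gmond">
<CLUSTER NAME="demo">
<HOST NAME="node1"><METRIC NAME="load_one" VAL="0.1"/></HOST>
<HOST NAME="node2">
<METRIC NAME="load_one" VAL="0.2"/>
<METRIC NAME="my_metric" VAL="7"/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    print(hosts_reporting(SAMPLE, "my_metric"))
```

To use it on a live cluster, save the XML that gmond serves on its TCP port
(8649 by default, as far as I know) to a file and pass the file contents in
place of SAMPLE.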
Thanks everyone for your help
Chris
On Apr 28, 2005, at 2:33 AM, Ramon Bastiaans wrote:
Perhaps something got screwed up when you called gmetric, and the XML
syntax is now broken for those nodes.
Did you call gmetric with the -d option? If not, the custom metric XML
will stay around forever, as -d specifies a timeout for metrics.
Please try calling gmetric again for those custom metrics on the nodes
affected, but this time add '-d 1'. With DMAX set to 1, the metric
will time out after 1 second, which should override any old custom
metric garbage and make it disappear. If it broke the XML tree, this
should fix it.
- Ramon.
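A minimal sketch of the resend suggested above, written as a small Python
helper that builds the gmetric command line with '-d 1'. The -n, -v and -t
flags are the short options I believe gmetric accepts for name, value and
type (check 'gmetric --help' on your build); the metric name "my_metric",
the value 0 and the type uint32 are placeholders, not the real metric from
this thread.

```python
import shlex


def gmetric_command(name, value, metric_type="uint32", dmax=1):
    """Build the argument list for a gmetric call that sets DMAX."""
    return [
        "gmetric",
        "-n", name,          # metric name (placeholder in the example below)
        "-v", str(value),    # metric value
        "-t", metric_type,   # metric type, assumed uint32 here
        "-d", str(dmax),     # DMAX: lifetime of the metric in seconds
    ]


if __name__ == "__main__":
    argv = gmetric_command("my_metric", 0)
    # Print the command instead of running it; to execute it on a node,
    # pass argv to subprocess.run(argv, check=True).
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Run on each affected node, the printed command is what would be handed to
the shell or to cron.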
Christopher Yip wrote:
Can anyone suggest what might have happened in the following?
I was trying to put together a custom metric for Ganglia, and now
the graphs for the machines it was set up on do not appear on the
Ganglia web pages, even though the numerical information is correct.
You can see the problem here:
http://128.100.71.70:81/ganglia/?c=Yip_Lab_Systems&m=&r=hour&s=descending&hc=4
I restarted the head node that tracks everything, reset the RRDs, etc.,
and nothing seems to work. If I restart gmond on the affected
machines, I sometimes get a spike on the graph and then nothing.
Again, the numerical information is correct; there is just no trend
data.
I've since removed the custom metric, but something else is messed up
and I can't figure out where the problem is, i.e. whether it is on the
clients that are providing the data or on the one that is collecting
it. I did look on the list, but the threads that talk about missing
images refer to incorrect permissions or to the whole cluster being
off. In this case, only selected machines seem to be affected.
Thanks
Chris
Christopher M. Yip, Ph.D., P.Eng.
Associate Professor - Canada Research Chair in Molecular Imaging
Department of Chemical Engineering and Applied Chemistry
Department of Biochemistry
Institute of Biomaterials and Biomedical Engineering
University of Toronto
407 Rosebrugh Building
4 Taddle Creek Rd
Toronto, Ontario, CANADA M5S 3G9
(416) 978-7853
(416) 978-4317 (fax)
[EMAIL PROTECTED]
http://bigten.ibme.utoronto.ca
--
"... being a Linux user is sort of like living in a house inhabited
by a large family of carpenters and architects. Every morning when
you wake up, the house is a little different. Maybe there is a new
turret, or some walls have moved. Or perhaps someone has temporarily
removed the floor under your bed." - Unix for Dummies, 2nd Edition