Our Ganglia installation is impressively messed up, and I would like
suggestions on where to start looking for the problem.
Here is the situation. We have two small clusters, each with its own
head node. Both are running Rocks 4.0 and gmetad 2.5.7. The first has nodes
on 10.255.255.X, the second on 20.255.255.Y. The nodes on the first are
compute-0-0 through compute-0-9; on the second, compute-1-0 through
compute-1-8.
Everything except Ganglia is behaving fine on both clusters. On the
first, Ganglia is fine. On the second it is nuts: the cluster report has
entries for compute-0-1 through compute-0-7, and the only correct entry is
compute-1-8 (which is down). The metrics are summing everything onto the
head node, yet, oddly enough, they show the correct values for the compute
nodes. It looks as if some configuration file somewhere has incorrect
information about which node is which, but where?
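
One way to start narrowing this down might be to ask a gmond directly which
hosts it believes it has heard from, and compare the names and addresses it
reports on each head node. What follows is only a minimal sketch, not a
tested diagnostic for this setup. It assumes Python 3 is available, that
gmond answers on its usual XML port (8649 unless the configuration says
otherwise), and that the dump uses the usual CLUSTER and HOST elements with
NAME and IP attributes. The function names and the "localhost" target are
placeholders of my own.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump a gmond sends when a client connects.

    Port 8649 is an assumption (the usual default); adjust it if the
    gmond configuration on the head node says otherwise.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Return {cluster name: [(host name, host IP), ...]} from a dump."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for cluster in root.iter("CLUSTER"):
        entries = result.setdefault(cluster.get("NAME"), [])
        for host in cluster.iter("HOST"):
            entries.append((host.get("NAME"), host.get("IP")))
    return result


# Hypothetical usage, run on each head node in turn:
#   for name, hosts in hosts_by_cluster(read_gmond_xml("localhost")).items():
#       print(name, sorted(hosts))
```

If the second head node lists compute-0-X names, or 10.255.255.X addresses,
that would suggest the two clusters are hearing each other's gmond traffic
or sharing a cluster definition, rather than a problem confined to the web
front end.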
-----------------------------------------------
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
http://www.numis.northwestern.edu
-----------------------------------------------