Hey guys:
I'm having trouble figuring out what is wrong with my Ganglia
install.
I have been running it on a 13-node cluster for a while without any
issues. Just recently, I added another data_source with 19 nodes,
and now Ganglia is not getting correct information from the new nodes.
I have the C3 tools loaded onto these nodes, and when I use cexec to
start up gmond simultaneously, Ganglia doesn't seem to be able to
collect the information correctly.
For instance, it says there are 19 hosts up but only 28 CPUs total,
even though they are all dual-processor machines (so I would expect
38), and they are definitely up and have no problems.
In the 'physical view', the nodes that have problems reporting show
information like:
cpu: mem: 0
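One way to narrow this down is to look at the XML that gmond itself
serves and see which hosts have no cpu_num metric. If every reporting
node counts 2 CPUs, a total of 28 would suggest only 14 of the 19 are
reporting it. Below is a minimal sketch, not a tested diagnostic. It
assumes gmond dumps its XML on TCP port 8649 (the usual default, so
adjust it if gmond.conf says otherwise) and that the dump has HOST
elements containing METRIC elements with NAME and VAL attributes. The
function names fetch_gmond_xml and summarize_cpus are made up for this
example.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump gmond sends to a client that connects to its TCP port.

    Port 8649 is an assumption (the usual default); check gmond.conf.
    Returns raw bytes so the parser can honor the XML encoding declaration.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def summarize_cpus(xml_data):
    """Return (total_cpus, hosts_missing_cpu_num) for a gmond XML dump."""
    root = ET.fromstring(xml_data)
    total = 0
    missing = []
    for host in root.iter("HOST"):
        cpu_num = None
        for metric in host.findall("METRIC"):
            if metric.get("NAME") == "cpu_num":
                cpu_num = metric.get("VAL")
        if cpu_num is None:
            # Host is known to gmond but has not reported a CPU count
            missing.append(host.get("NAME"))
        else:
            total += int(cpu_num)
    return total, missing
```

Running summarize_cpus(fetch_gmond_xml("some-node")) against one of
the new nodes (the hostname is a placeholder) would list the hosts
whose CPU count never arrived, which can then be compared with the
nodes showing an empty "cpu:" in the physical view.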
Any ideas why this is happening?
I'm running Ganglia v2.5.4-6 on RedHat 9.0.
Thanks,
Bernard