Hello all,

I already have ganglia running fine on one of our clusters, but I ran
into a strange problem when installing and configuring it for a new
cluster. The setup is as follows: the cluster is in its own 192.168.0.x
network, and the cluster frontend has two network interfaces. All nodes
and the frontend run gmond, whereas gmetad and the http server are
located on a separate machine. The frontend's name is klunga; the nodes
are node1, node2, ...
If I telnet from the www-server to klunga on port 8649, I get klunga's
own metrics just fine (everything from load_one to mem_total), but only
a few rows for each node. For node1, for example, this is all I get:
<HOST NAME="node1" IP="192.168.0.101" REPORTED="1060084333" TN="858"
TMAX="20" DMAX="0" LOCATION="unspecified" GMOND_STARTED="19622">
<METRIC NAME="gexec" VAL="OFF" TYPE="string" UNITS="" TN="135"
TMAX="300" DMAX="0" SLOPE="zero" SOURCE="gmond"/>
<METRIC NAME="machine_type" VAL="x86" TYPE="string" UNITS="" TN="204"
TMAX="1200" DMAX="0" SLOPE="zero" SOURCE="gmond"/>
<METRIC NAME="disk_total" VAL="6.186" TYPE="double" UNITS="GB" TN="52"
TMAX="1200" DMAX="0" SLOPE="both" SOURCE="gmond"/>
<METRIC NAME="os_release" VAL="2.4.21-0.13mdkcustom" TYPE="string"
UNITS="" TN="145" TMAX="1200" DMAX="0" SLOPE="zero" SOURCE="gmond"/>
<METRIC NAME="disk_free" VAL="4.376" TYPE="double" UNITS="GB" TN="16"
TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
</HOST>
The same thing happens when I telnet to localhost 8649 on klunga. But
if I telnet to node1 from klunga, I get the complete set of node1's
metrics (from load_one to mem_total), whereas the only metrics node1
reports for klunga are these:
<HOST NAME="klunga" IP="192.168.0.1" REPORTED="1060094821" TN="896"
TMAX="20" DMAX="0" LOCATION="unspecified" GMOND_STARTED="1622">
<METRIC NAME="gexec" VAL="OFF" TYPE="string" UNITS="" TN="114"
TMAX="300" DMAX="0" SLOPE="zero" SOURCE="gmond"/>
<METRIC NAME="machine_type" VAL="x86" TYPE="string" UNITS="" TN="320"
TMAX="1200" DMAX="0" SLOPE="zero" SOURCE="gmond"/>
<METRIC NAME="disk_total" VAL="202.583" TYPE="double" UNITS="GB"
TN="381" TMAX="1200" DMAX="0" SLOPE="both" SOURCE="gmond"/>
<METRIC NAME="os_release" VAL="2.4.21-0.13mdkenterprise" TYPE="string"
UNITS="" TN="366" TMAX="1200" DMAX="0" SLOPE="zero" SOURCE="gmond"/>
<METRIC NAME="disk_free" VAL="194.706" TYPE="double" UNITS="GB" TN="53"
TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
</HOST>
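
To make the two dumps easier to compare, here is a rough sketch of a
script that does the same thing as the telnet sessions above: it
connects to a gmond on port 8649, reads the XML until the connection
closes, and lists which metric names appear under each HOST element.
The function names are made up for illustration, and it assumes the
dump is a complete, well-formed XML document; it only shows what is
missing, not why.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump a gmond sends on connect (same as the telnet test)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # the remote end closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def metrics_by_host(xml_bytes):
    """Map each HOST NAME to the sorted list of METRIC NAMEs under it."""
    root = ET.fromstring(xml_bytes)
    return {
        host.get("NAME"): sorted(m.get("NAME") for m in host.iter("METRIC"))
        for host in root.iter("HOST")
    }


def report(sources):
    """Print, for each gmond we query, what it knows about every host."""
    for source in sources:
        print("as seen by %s:" % source)
        for name, metrics in sorted(metrics_by_host(fetch_gmond_xml(source)).items()):
            print("  %s: %d metrics: %s" % (name, len(metrics), ", ".join(metrics)))
```

Calling report(["klunga", "node1"]) from a machine that can reach both
hosts should print one block per gmond, so the hosts with only five
metrics stand out against the ones with the full list.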
Any ideas?
--
[EMAIL PROTECTED] | Department of Applied Physics
tel. +358 17 162 279 | University of Kuopio