Hi folks,
I am trying to configure Ganglia to monitor our cluster, but I am
running into a lot of problems. After compiling several packages
(rrdtool, libart, freetype and, of course, Ganglia), I have the PHP
web front end running, but the per-node data is not showing up in the
graphs. All the graphs are empty!
When I look at the Apache log files, I see these messages:
server# tail -f /usr/local/apache2/logs/access_log
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-102.dat
a&l=e2ecff&v=0.02&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6274
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-100.dat
a&l=e2ecff&v=0.02&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6224
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-104.dat
a&l=e2ecff&v=0.02&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6274
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-106.dat
a&l=e2ecff&v=0.02&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6325
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-109.dat
a&l=e2ecff&v=0.02&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6332
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-107.dat
a&l=e2ecff&v=0.02&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6251
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-099.dat
a&l=e2ecff&v=0.02&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6300
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-087.dat
a&l=e2ecff&v=0.01&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6327
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-105.dat
a&l=e2ecff&v=0.01&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6247
10.1.2.41 - - [08/Nov/2006:11:02:34 -0800] "GET
/ganglia/graph.php?m=load_one&z=small&c=cerebro&h=cerebro-A-110.dat
a&l=e2ecff&v=0.01&x=0&n=0&r=hour&st=1163012545 HTTP/1.1" 200 6176
server# tail -f /usr/local/apache2/logs/error_log
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-023.data/load_one.rrd': No such
file or directory
ERROR: This RRD was created on other architecture
ERROR: This RRD was created on other architecture
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-029.data/load_one.rrd': No such
file or directory
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-030.data/load_one.rrd': No such
file or directory
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-031.data/load_one.rrd': No such
file or directory
ERROR: This RRD was created on other architecture
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-035.data/load_one.rrd': No such
file or directory
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-036.data/load_one.rrd': No such
file or directory
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-038.data/load_one.rrd': No such
file or directory
ERROR: This RRD was created on other architecture
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-040.data/load_one.rrd': No such
file or directory
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-041.data/load_one.rrd': No such
file or directory
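To get an overview of an error log like this one, the messages can be tallied by type. This is only a minimal sketch, assuming the error_log format shown above (wrapped lines are joined before matching). The helper name summarize_rrd_errors and the SAMPLE excerpt are illustrative and not part of Ganglia or rrdtool.

```python
import re
from collections import Counter

# Captures the host directory from: ERROR: opening '<dir>/<host>/<metric>.rrd': No such ...
MISSING_RE = re.compile(
    r"ERROR: opening '[^']*/([^/']+)/[^/']+\.rrd': No such file or directory"
)
ARCH_MSG = "ERROR: This RRD was created on other architecture"


def summarize_rrd_errors(log_text):
    """Return (Counter of error kinds, sorted hosts whose RRD file is missing)."""
    # Collapse all whitespace so messages wrapped over several lines match as one string.
    flat = " ".join(log_text.split())
    missing_hosts = MISSING_RE.findall(flat)
    kinds = Counter()
    kinds["missing_rrd"] = len(missing_hosts)
    kinds["other_architecture"] = flat.count(ARCH_MSG)
    return kinds, sorted(set(missing_hosts))


# Excerpt copied from the error_log output above.
SAMPLE = """ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-023.data/load_one.rrd': No such
file or directory
ERROR: This RRD was created on other architecture
ERROR: This RRD was created on other architecture
ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-029.data/load_one.rrd': No such
file or directory
"""

kinds, hosts = summarize_rrd_errors(SAMPLE)
print(dict(kinds))
print(hosts)
```

Run against the whole error_log, this would show how many hosts have no RRD file at all and how many have files the installed rrdtool refuses to read.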
I don't know exactly what "This RRD was created on other
architecture" means because, as far as I know, I compiled rrdtool
following the instructions at
http://apstc.sun.com.sg/downloads/s10/README/rrdtool-1.2.11-sol10-x86.txt
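One possible way to look into that message is to inspect the header of an affected .rrd file. The sketch below rests on assumptions that are not confirmed anywhere in this post: that an RRD file starts with the bytes "RRD" plus a NUL, and that the header holds a double "float cookie" with the value 8.642135E130, written in the native format of the machine that created the file. If that holds, a file written by a build with a different byte order or struct alignment (for example 32-bit versus 64-bit) would place or encode the cookie differently. The function name describe_rrd_header is hypothetical.

```python
import struct

# Assumption: value of rrdtool's float cookie, stored as a native double in the header.
FLOAT_COOKIE = 8.642135e130


def describe_rrd_header(header):
    """Heuristically describe how the first bytes of an RRD file were written."""
    if not header.startswith(b"RRD\x00"):
        return "not an RRD file"
    candidates = (
        ("little-endian", struct.pack("<d", FLOAT_COOKIE)),
        ("big-endian", struct.pack(">d", FLOAT_COOKIE)),
    )
    for name, cookie in candidates:
        # Only search the first 32 bytes, where the cookie is assumed to live.
        offset = header.find(cookie, 0, 32)
        if offset != -1:
            return f"{name} doubles, float cookie at byte offset {offset}"
    return "float cookie not found"


# Synthetic headers for illustration only: same cookie, different padding before it.
header_a = b"RRD\x000003\x00" + b"\x00" * 3 + struct.pack("<d", FLOAT_COOKIE)
header_b = b"RRD\x000003\x00" + b"\x00" * 7 + struct.pack("<d", FLOAT_COOKIE)
print(describe_rrd_header(header_a))
print(describe_rrd_header(header_b))
```

To try it on a real file, read the first 32 bytes with open(path, "rb").read(32) and compare the result for a file that graphs correctly with one that triggers the error.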
Also, I don't know why most of the data is not being generated on the
server, e.g.: ERROR: opening
'/var/lib/ganglia/rrds/cerebro/cerebro-B-041.data/load_one.rrd': No such
file or directory
Judging by the "Hosts up" and "Hosts down" counts, the application is
not working properly: I am sure all my nodes are up and running,
but it reports most of them as down:
CPUs Total: 74
Hosts up: 38
Hosts down: 230
We have 306 nodes, but only 268 are reported (230 of them as down).
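The arithmetic behind those numbers, as a small sketch. The figures are taken from this post, and the function name host_accounting is made up for illustration.

```python
def host_accounting(total_nodes, hosts_up, hosts_down):
    """Compare the hosts the web front end reports with the real node count."""
    reported = hosts_up + hosts_down
    return {
        "reported": reported,                     # hosts Ganglia knows about
        "unreported": total_nodes - reported,     # nodes Ganglia has never seen
        "up_fraction": hosts_up / total_nodes,    # share of all nodes shown as up
    }


counts = host_accounting(total_nodes=306, hosts_up=38, hosts_down=230)
print(counts)
```

So besides the 230 hosts shown as down, 38 nodes do not appear in the report at all.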
I am definitely running gmond on every node, and I am also running
gmetad there, although I know I don't need it. On the server I am
running both gmond and gmetad.
Could somebody help me figure out how to solve this problem? Thanks
in advance.
- Hugo
--
Hugo R. Hernandez-Mora
System Administrator
Laboratory of Neuro Imaging, UCLA
635 Charles E. Young Drive South, Suite 225
Los Angeles, CA 90095-7332
Tel: 310.267.5076
Fax: 310.206.5518
[EMAIL PROTECTED]
"If your efforts were met with indifference, don't be discouraged,
for the sun puts on a marvelous show every morning while
most people are still asleep"