Thanks for the tips.  The attached tar.gz file contains the output of 
graph.php after adding a call to phpinfo() at the top and switching on the 
debug flag.  At the bottom is the command that is run to generate a graph for 
one of the nodes that actually belongs to a different cluster:

/usr/local/rrdtool/bin/rrdtool graph - --start 1194960576 --end 1194964176 \
  --width 800 --height 600 --lower-limit 0 --rigid \
  --title 'quad001.beowulf.cluster Load last hour' \
  --vertical-label 'Load/Procs' \
  DEF:'load_one'='/var/lib/ganglia/rrds/NEMO cluster @ POL/quad001.beowulf.cluster/load_one.rrd':'sum':AVERAGE \
  DEF:'proc_run'='/var/lib/ganglia/rrds/NEMO cluster @ POL/quad001.beowulf.cluster/proc_run.rrd':'sum':AVERAGE \
  DEF:'cpu_num'='/var/lib/ganglia/rrds/NEMO cluster @ POL/quad001.beowulf.cluster/cpu_num.rrd':'sum':AVERAGE \
  AREA:'load_one'#CCCCCC:'1-min Load' \
  LINE2:'cpu_num'#FF0000:'CPUs' \
  LINE2:'proc_run'#0000FF:'Running Processes'

There is no directory named "quad001.beowulf.cluster" in 
"/var/lib/ganglia/rrds/NEMO cluster @ POL".  This explains why the command 
fails, leaving a blank space on the page.
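
This kind of check can be automated for any command printed by the debug 
flag.  The following is only a rough Python sketch: the helper names 
def_paths and missing_paths are my own invention, not part of Ganglia or 
rrdtool, and it assumes the DEF arguments have the DEF:vname=path:ds:CF shape 
seen in the command above, with no colons in the path.

```python
import os
import re
import shlex


def def_paths(command):
    """Return the RRD file paths named in the DEF: arguments of a command line."""
    paths = []
    # shlex.split removes the shell quoting, so each DEF becomes one token
    # of the form DEF:vname=path:ds-name:CF (the path may contain spaces).
    for arg in shlex.split(command):
        match = re.match(r"DEF:[^=]+=(.+):[^:]+:[^:]+$", arg)
        if match:
            paths.append(match.group(1))
    return paths


def missing_paths(command):
    """Return the DEF paths that do not exist as regular files on this machine."""
    return [path for path in def_paths(command) if not os.path.isfile(path)]
```

Pasting the debug output into missing_paths() on the web server should list 
every .rrd file that the graph command refers to but that is not on disk.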

For the nodes that are correctly listed as being at POL it is difficult to 
find out why the wrong information is being displayed.  Can I trust Ganglia 
to display the correct information from the rrds files or is there another 
way to find out what they contain?
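
One possibility might be to query the files with rrdtool directly, bypassing 
the PHP front end.  The Python sketch below is only an illustration: 
inspect_commands and run_all are hypothetical helper names, the rrdtool path 
is the one from the command above, and it assumes the installed rrdtool 
provides the "lastupdate" and "fetch" subcommands.

```python
import subprocess

# Path taken from the graph command above; adjust if rrdtool lives elsewhere.
RRDTOOL = "/usr/local/rrdtool/bin/rrdtool"


def inspect_commands(rrd_file, start, end):
    """Build rrdtool command lines that show what an RRD file contains."""
    return [
        # Timestamp and value of the most recent update.
        [RRDTOOL, "lastupdate", rrd_file],
        # Consolidated AVERAGE values for the same window the graph uses.
        [RRDTOOL, "fetch", rrd_file, "AVERAGE",
         "--start", str(start), "--end", str(end)],
    ]


def run_all(commands):
    """Run each command and print its output (or its error message)."""
    for argv in commands:
        print("$", " ".join(argv))
        result = subprocess.run(argv, capture_output=True, text=True)
        print(result.stdout or result.stderr)
```

Calling run_all(inspect_commands(path, 1194960576, 1194964176)) for one of 
the load_one.rrd files would show the raw numbers for the same hour as the 
graph, which can then be compared with what the web page displays.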

One clue is in the information being shown for node025 at POL.  This node is 
being shown as down on the POL cluster report, but the node is in fact alive 
and well.  The output from gmond at POL shows the correct information and the 
rrds files for this node are being updated every minute as you would expect.  
This suggests to me that the rrds files for the POL cluster do contain the 
right information.  The rrds files for the BAS cluster are in directory 
"/var/lib/ganglia/rrds/ClusterVision Cluster/node025.beowulf.cluster".  They 
have not been updated for the past few days, as you would expect for a node 
that is down.  So it seems that Ganglia is getting at least some of the 
information from the wrong rrds files even though the paths in the rrdtool 
commands are correct.
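
The comparison of update times described above can be repeated for any host 
name that appears under more than one cluster directory.  This is a minimal 
Python sketch, assuming the layout <rrd root>/<cluster>/<host>/*.rrd seen in 
the paths above; host_rrd_ages is a hypothetical helper name.

```python
import os
import time


def host_rrd_ages(rrd_root, host, now=None):
    """Map each cluster directory containing `host` to the age, in seconds,
    of the most recently modified .rrd file for that host."""
    now = time.time() if now is None else now
    ages = {}
    for cluster in sorted(os.listdir(rrd_root)):
        host_dir = os.path.join(rrd_root, cluster, host)
        if not os.path.isdir(host_dir):
            continue  # this cluster has no directory for the host
        mtimes = [
            os.path.getmtime(os.path.join(host_dir, name))
            for name in os.listdir(host_dir)
            if name.endswith(".rrd")
        ]
        if mtimes:
            ages[cluster] = now - max(mtimes)
    return ages

# Example call (paths as used in this message):
# host_rrd_ages("/var/lib/ganglia/rrds", "node025.beowulf.cluster")
```

A cluster whose age is about a minute is being updated normally, while an age 
of several days marks a stale copy such as the one under "ClusterVision 
Cluster".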

I suppose the next step is to debug the other php scripts involved.  Where is 
the best place to start?

-Dan.

On Monday 12 Nov 2007 18:59, aurbain wrote:
> flick on the debug flag in graph.php, and redraw the page.  You should
> see the command line which would get run to make the graph.  Copy and
> run > test.png and take a look at it.
>

-- 
Mr. D.A. Bretherton
Environmental Systems Science Centre
Harry Pitt Building
3 Earley Gate
Reading University
Reading, RG6 6AL
UK

Tel. +44 118 378 7722

Attachment: quad001.graph.php.html.tar.gz
Description: application/tgz

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
