bob flynn wrote:
A quick query looking for experience of ganglia on lsf clusters.
I have been using mrtg up to this point on a mixed Solaris + Linux cluster.
From an lsf perspective, the users see the cluster as one of:
- entire cluster
- solaris only
- linux only
Until recently, we were using standard mrtg, without rrdtool. This
caused a problem with averaging of cpu usage across the cluster, as
mrtg really was not up to the task (as far as I can see).
I am currently redoing this, and the first cut at it is to update
mrtg, implement an rrd backend, and generate stats on the fly with
greater intelligence.
Now I am wondering if this is simply the wrong tool for the task, and
whether I should be looking at something like ganglia instead.
A couple of things:
1. Can I collect both linux and solaris node data and present them on
a single linux front end?
Yes. You would likely run the ganglia meta daemon (gmetad) on this linux
box as well as the web front end.
2. Can I generate data on sub clusters, i.e. the linux- and
solaris-specific views, as well as an overall view across all machines?
Yes, you would have one grid containing two clusters. Your
gmetad.conf would look like this:
data_source "linux cluster" 10 linuxnode1:8649 linuxnode2:8649
data_source "solaris cluster" 10 solarisnode1:8650 solarisnode2:8650
gridname "entire cluster"
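On the cpu averaging problem mentioned above: in each data_source line the number after the name is the polling interval in seconds, and gmetad fetches an XML description of the cluster from the listed gmond hosts. If you ever want a per-cluster average outside the web front end, something like the following rough Python sketch might do. The function name mean_metric and the sample XML are made up for illustration; the sample is a heavily simplified stand-in for what gmond reports (real output carries a DTD and many more attributes), so treat this as an assumption to check against your own nodes.

```python
import xml.etree.ElementTree as ET


def mean_metric(xml_text, metric="cpu_user"):
    """Return {cluster name: mean of `metric` over the hosts that report it}."""
    root = ET.fromstring(xml_text)
    result = {}
    for cluster in root.iter("CLUSTER"):
        # Collect the metric value from every host in this cluster.
        values = [
            float(m.get("VAL"))
            for host in cluster.iter("HOST")
            for m in host.iter("METRIC")
            if m.get("NAME") == metric
        ]
        if values:  # skip clusters where no host reports the metric
            result[cluster.get("NAME")] = sum(values) / len(values)
    return result


# Hand-written, simplified sample (hypothetical values, not real output).
SAMPLE = """\
<GANGLIA_XML VERSION="sample" SOURCE="gmond">
  <CLUSTER NAME="linux cluster">
    <HOST NAME="linuxnode1"><METRIC NAME="cpu_user" VAL="20.0"/></HOST>
    <HOST NAME="linuxnode2"><METRIC NAME="cpu_user" VAL="40.0"/></HOST>
  </CLUSTER>
</GANGLIA_XML>
"""

if __name__ == "__main__":
    print(mean_metric(SAMPLE))
```

In practice the web front end already shows per-cluster and grid-wide summaries, so this would only matter for custom reporting.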
Thanks for any insight or experience relating to this.
-Bob
Good Luck,
Ian