bob flynn wrote:

A quick query looking for experience of ganglia on lsf clusters.

I have been using mrtg up to this point on a mixed Solaris + Linux cluster.

From an lsf perspective, the users see the cluster as one of:

- entire cluster
- solaris only
- linux only

Until recently, we were using standard mrtg, without rrdtool. This caused a problem with averaging cpu usage across the cluster, as mrtg really was not up to the task (as far as I can see).

I am currently redoing this, and the first cut at it is to update mrtg, implement an rrd backend, and
generate stats on the fly with greater intelligence.

Now I am wondering if this is simply the wrong tool for the task, and whether I should be looking at something like ganglia instead.

A couple of things.

1. Can I collect both linux and solaris node data and present them on a single linux front end?

Yes. You would likely run the ganglia meta daemon (gmetad) on this linux box as well as the web front end.

2. Can I generate data on sub clusters, i.e. the linux and solaris specific views, as well as generating
    an overall view across all machines?

Yes, you would have one grid containing two clusters. Your gmetad.conf would look like this:

data_source "linux cluster" 10 linuxnode1:8649 linuxnode2:8649
data_source "solaris cluster" 10 solarisnode1:8650 solarisnode2:8650

gridname "entire cluster"
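
If you also want the averaged numbers outside the web front end, here is a minimal sketch, assuming Python 3 and the XML dump that gmond writes to any client connecting to its TCP port. The function names (fetch_xml, average_metric) are made up for illustration, and cpu_user is just one example metric. It averages per cluster and across all hosts, so a cluster with more nodes carries more weight in the overall figure.

```python
import socket
import xml.etree.ElementTree as ET
from collections import defaultdict


def fetch_xml(host, port=8649, timeout=5.0):
    """Read the XML dump sent to a TCP client on connect (hypothetical helper).

    Returns raw bytes so the XML parser can honour the encoding declaration.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the daemon closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def average_metric(xml_data, metric="cpu_user"):
    """Return ({cluster name: mean value}, mean over all hosts) for one metric."""
    root = ET.fromstring(xml_data)
    values = defaultdict(list)
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            for m in host.iter("METRIC"):
                if m.get("NAME") == metric:
                    values[cluster.get("NAME")].append(float(m.get("VAL")))
    per_cluster = {name: sum(v) / len(v) for name, v in values.items()}
    all_values = [x for v in values.values() for x in v]
    overall = sum(all_values) / len(all_values) if all_values else None
    return per_cluster, overall
```

For example, average_metric(fetch_xml("linuxnode1")) would give the linux-only figure from that node's view of its cluster; feeding it XML that contains both CLUSTER elements gives the per-cluster and overall figures in one call.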

Thanks for any insight or experience relating to this.

-Bob

Good Luck,
Ian

