Poirier, Keith wrote:
Just wondering if anyone has done any modification of the Ganglia
web front-end with regards to large (1000+ node) clusters. I see that the
meta view handles multiple clusters, but when it comes to a single large
cluster, the graphics for the nodes (up/down/loaded, etc.) and the graphs
that appear on the cluster_view can be overwhelming.
I wanted to see what others have done before trying to reinvent the wheel…
Keith Poirier
Hewlett Packard
[EMAIL PROTECTED]
The only modifications I've made have been to display more nodes per row in
the graphical color-coded node status area and use smaller node status
graphics. You may also want to investigate a "max_graphs" value in the web
front-end config code...
Using a dual-processor Sun E420R, the full gmetad stream (which the web
front-end parses on every non-graph.php page load) takes about ten seconds
to parse. I've been agitating on the developers list to get a query engine
of some kind into gmetad (but, of course, I don't want to write it). :)
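To get a feel for the parse cost being discussed, here's a minimal sketch (not from the original mail) that pulls the full XML dump gmetad emits on connect and counts HOST elements -- roughly the work the web front-end repeats on every non-graph.php page load. It assumes gmetad's default xml_port of 8651; the host name is illustrative.

```python
import socket
import xml.etree.ElementTree as ET

def fetch_gmetad_xml(host="localhost", port=8651):
    """Read the complete XML dump gmetad writes on each connection."""
    chunks = []
    with socket.create_connection((host, port)) as s:
        while True:
            data = s.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

def count_hosts(xml_bytes):
    """Parse the whole stream and count HOST elements --
    the per-page cost driver on a 1000+ node cluster."""
    root = ET.fromstring(xml_bytes)
    return sum(1 for _ in root.iter("HOST"))
```

Timing `count_hosts(fetch_gmetad_xml())` on a big cluster should approximate the ~10 s figure above, and shows why a query engine inside gmetad (answering "just these hosts/metrics") would beat re-parsing everything per page.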
Also, the summary RRDs seem to stop updating after a while (== "anywhere
from an hour to two days"). Not sure what that's about... perhaps it's a
special Solaris-only memleak in librrdtool...
Those are the only real gotchas I've encountered so far. I'd be interested
to hear what sort of gmetad performance others are seeing in large
clusters. And especially what kind of system's running their web
front-end. :)