Poirier, Keith wrote:
Just wondering if anyone has done any modification of the Ganglia Web front-end with regard to large (1000+ node) clusters. I see that the meta view handles multiple clusters, but for a single large cluster, the node status graphics (up/down/loaded, etc.) and the graphs that appear on the cluster_view can be overwhelming.

I wanted to see what others have done before trying to reinvent the wheel…



Keith Poirier

Hewlett Packard

[EMAIL PROTECTED]


The only modifications I've made have been to display more nodes per row in the graphical color-coded node status area and to use smaller node status graphics. You may also want to investigate the "max_graphs" value in the web front-end config code...

Using a dual-processor Sun E420R, parsing the full gmetad stream (which the web front-end does on every non-graph.php page load) takes about ten seconds. I'm agitating on the developers list to put a query engine of some kind into gmetad (but, of course, I don't want to write it). :)
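For a sense of where that ten seconds goes: every page load effectively pulls gmetad's entire XML dump and walks the whole tree, even to answer a small question. A minimal sketch of that pattern (in Python rather than the front-end's actual PHP, and assuming gmetad's plain-XML port, commonly 8651 -- adjust for your setup):

```python
import socket
import xml.etree.ElementTree as ET

def fetch_gmetad_xml(host="localhost", port=8651, timeout=30):
    """Read the full XML dump from gmetad's plain-XML port.

    Port 8651 is a common default, not guaranteed -- check your
    gmetad.conf. gmetad sends the whole tree and closes the socket.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")

def count_hosts(xml_text):
    """Count HOST elements per CLUSTER.

    This is exactly the kind of cheap summary question a query
    engine inside gmetad could answer directly, instead of shipping
    the whole tree to the web front-end on every page load.
    """
    root = ET.fromstring(xml_text)
    return {cluster.get("NAME"): len(cluster.findall("HOST"))
            for cluster in root.iter("CLUSTER")}
```

With a thousand-plus hosts, each carrying a few dozen METRIC elements, the parse cost alone adds up quickly -- which is the argument for answering queries server-side.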

Also, the summary RRDs seem to stop updating after a while (anywhere from an hour to two days). Not sure what that's about... perhaps it's a Solaris-only memleak in librrdtool...
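If you want to catch that silently, one crude but effective check is to scan the RRD tree for files whose modification time has fallen too far behind. A sketch, assuming your gmetad writes its round-robin databases under a single directory (often something like /var/lib/ganglia/rrds, but check your install):

```python
import os
import time

def stale_rrds(rrd_root, max_age_secs=3600):
    """Walk a Ganglia RRD tree and return paths of .rrd files whose
    mtime is older than max_age_secs.

    gmetad rewrites each RRD on every update, so a file that hasn't
    been touched in an hour has almost certainly stopped updating --
    handy for spotting the summary RRDs going quiet.
    """
    now = time.time()
    stale = []
    for dirpath, _dirnames, filenames in os.walk(rrd_root):
        for name in filenames:
            if not name.endswith(".rrd"):
                continue
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) > max_age_secs:
                stale.append(path)
    return stale
```

Run from cron and mail yourself the output; restarting gmetad when the list is non-empty is an ugly but workable stopgap until the leak (or whatever it is) gets tracked down.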

Those are the only real gotchas I've encountered so far. I'd be interested to hear what sort of gmetad performance others are seeing on large clusters, and especially what kind of system is running their web front-end. :)
