> implementation based on the availability of the ganglia-nagios
> plugin, but found that because of the way it is implemented it
> quickly choked the life out of the monitoring server (5500 probes
> over a 5 minute interval each of which downloads a full copy of the
> XML file from the GMond collector's accept channel is quite an
> overhead)...

Malcolm,

Are you running the version of check_ganglia from exchange.nagios.org (http://bit.ly/akp88N)? This includes a mode that uses the query server provided by gmetad to grab only the XML for a specific host rather than dumping the entire tree.

I haven't used this particular plugin myself (I've written my own), but it looks like the invocation would be something like:

    check_ganglia -H localhost -P 8652 \
      -O hostcheck \
      --cluster 'My Cluster Name' \
      -T my.target.host

In experiments with my own code, using the gmetad query interface is substantially (3x to 4x) faster -- and this is with a small (< 50) set of monitored hosts. This should compare favorably with the overhead of polling an RRD file (and seems like a cleaner solution).

_______________________________________________
Ganglia-general mailing list
https://lists.sourceforge.net/lists/listinfo/ganglia-general
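
For illustration, the per-host query described above can be sketched in Python. This is a rough sketch under stated assumptions, not the code of check_ganglia or of the plugin mentioned in the message: it assumes gmetad's interactive port (8652 in the invocation above) accepts a request line of the form `/cluster/host` followed by a newline and replies with XML containing only that host's `HOST` and `METRIC` elements. The function names `build_query`, `fetch_xml`, and `host_metrics` are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET


def build_query(cluster, host):
    """Build the request line for one host (assumed format: /cluster/host)."""
    return "/{}/{}\n".format(cluster, host)


def fetch_xml(gmetad_host, port, query, timeout=10.0):
    """Send one query to the gmetad interactive port and read until EOF."""
    with socket.create_connection((gmetad_host, port), timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8"))
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:  # gmetad closes the connection after the reply
                break
            chunks.append(data)
    return b"".join(chunks)


def host_metrics(xml_bytes, host):
    """Return {metric name: value string} for the host, or None if absent."""
    root = ET.fromstring(xml_bytes)
    for node in root.iter("HOST"):
        if node.get("NAME") == host:
            return {m.get("NAME"): m.get("VAL") for m in node.iter("METRIC")}
    return None


# Hypothetical usage, mirroring the values in the invocation above:
#   query = build_query("My Cluster Name", "my.target.host")
#   metrics = host_metrics(fetch_xml("localhost", 8652, query), "my.target.host")
#   A result of None would mean the host was missing from the reply.
```

Because each probe asks for a single host, the reply stays small however many hosts the collector tracks, which is the property the message credits for the speedup.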

