I posted a note about this on the developers list last week but didn't get much response, so I thought I'd try the general list this time.
To cut to the chase, I believe that when monitoring a large cluster centrally, as one does with ganglia, one must carefully choose the types and frequency of the data being collected to avoid overwhelming the central data store. At the same time, if you want to use that data to troubleshoot a problem, samples taken at a frequency measured in minutes simply aren't going to provide the granularity needed to diagnose it. Sometimes you'll get lucky and have enough data, but you're certainly never going to see that 20 second interval during which the network is saturated or the CPU is pegged at 100%.

Collectl, for those who are not familiar with it (see http://collectl.sourceforge.net/), is a low overhead data collector that can collect a vast amount of data and store it locally. I'm in the final stages of testing an interface that will allow it to send UDP packets to a gmond for ultimate delivery to the gmetad. For example, with collectl you can get at a lot of data not easily obtained elsewhere, such as infiniband, interrupts, lustre or nfsstats, just to name a few. The big difference with collectl is that the data is stored in a common format and can easily be plotted as well.

What I'm looking for is whether there is any interest in the ganglia community in this sort of capability and, even more important, whether there is anyone who might like to try out a pre-release of the ganglia interface.

-mark

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general

