Ben, thanks for your reply.

I've considered keeping state files and calculating the differences, but this doesn't scale very well for a large number of hosts. RRDTool can take both counter and gauge data (and several other types as well). It seems inefficient to calculate the diffs myself when RRDTool can do it for me.

What is the purpose of the 'slope' option for gmetric? The following email implies that this could be used for both counter and gauge data:
http://sourceforge.net/mailarchive/message.php?msg_id=2133761
After a quick glance through the source, the only thing that seems to matter about slope is whether it's zero or not. Was the functionality from the above email never implemented?

----- Original Message ----

On Thu, Jul 13, 2006 at 10:44:46AM -0700, dan c wrote:
>
> I'm trying to use gmetric to record data from several counters that
> do not reset after they've been read. I tried setting the slope to
> negative, positive and both, but all three produce identical output.
> Any ideas?

My solution was to make a directory /var/lib/ganglia/metrics/ and put state files there. By recording the time and the value the last time gmetric was run, you can calculate the average change per unit time without predefining the time period. I usually use 2 minutes (from cron), but of course the time period you should use is determined by your data resolution requirements, as well as how the data changes. Also, it's nice to be able to change it without touching either the state file or the script, and it deals with changes in load that might cause your script to run something other than *exactly* every 2 minutes. (Mine still crash when the load gets so high that several instantiations of the script build up without being able to run, and then run all at once; they complain if the time diff is zero.)

Look at just about any of the scripts in http://ben.hartshorne.net/ganglia/ for examples of how I have done this.

Some of the metrics that I measured that don't reset include the mysql "Questions" count (to calculate queries per second) and the disk activity straight from /proc (for which you also need to deal with your counter looping back to zero after hitting its max bound).

-ben

--
Ben Hartshorne
email: [EMAIL PROTECTED]
http://ben.hartshorne.net
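
For reference, the state-file approach described in the quoted message could be sketched roughly as below. This is only an illustration, not one of the scripts from the site linked above: the function names, the JSON state format, and the 32-bit counter limit are all assumptions made for the example. It stores the last value and timestamp, computes the average change per second from whatever interval actually elapsed, returns nothing when the time difference is zero (the crash case mentioned above), and treats a negative difference as a single wrap of the counter.

```python
import json
import os
import time

# Assumption: the counter wraps at 32 bits. Adjust for the real data source.
COUNTER_MAX = 2 ** 32


def counter_rate(prev_value, prev_time, cur_value, cur_time,
                 counter_max=COUNTER_MAX):
    """Return the average change per second, or None if no rate can be computed."""
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        # Two runs landed on the same timestamp (or the clock went backwards).
        return None
    delta = cur_value - prev_value
    if delta < 0:
        # Assume the counter looped back to zero exactly once.
        delta += counter_max
    return delta / elapsed


def update_rate(state_path, cur_value, cur_time=None):
    """Compare cur_value with the saved state, then save the new state.

    Returns the rate per second, or None on the first run or when the
    state file is unreadable.
    """
    if cur_time is None:
        cur_time = time.time()
    rate = None
    try:
        with open(state_path) as f:
            prev = json.load(f)
        rate = counter_rate(prev["value"], prev["time"], cur_value, cur_time)
    except (OSError, ValueError, KeyError):
        # Missing or corrupt state file: skip this sample.
        pass
    # Write to a temporary file and rename it, so overlapping runs never
    # see a half-written state file.
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"value": cur_value, "time": cur_time}, f)
    os.replace(tmp_path, state_path)
    return rate
```

A cron job would call update_rate() with the freshly read counter (for example the mysql "Questions" value) and pass any non-None result to gmetric as an ordinary value. Note that the wrap handling cannot tell a wrapped counter from one that was reset, for instance by a service restart, so a real script may want to discard implausibly large rates.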

