I've been thinking about sar in the light of some experience within my own organization and Garrett's EOF of sag.
It's probably fair to say that sar is heavily used in some organizations (and less so in others). We collect it anyway, so I've been doing some data gathering based on it, which exposes some of its limitations:

1. No networking. This is an absolute killer. The concept of networking seems to have been somewhat, ahem, neglected.
2. No NFS or fsstat statistics.
3. Missing metadata. For instance, sar shows me that there was I/O on disk nfs12345, but fails to record what that might have referred to.
4. Generally missing data, and some data is only available in aggregate.

One of the things I've been planning to add to JKstat for a long while is the ability to read saved kstat -p output. (I'm essentially done with that, and it works quite well.)

Which got me wondering: what if we scrapped the current sar collector (sadc) and just saved kstat -p output (or something like it) instead?

The advantages of doing this are:

- we gain all available statistics - network, nfs, fsstat, but also a whole host of others
- we could rerun the *stat utilities on the output (they would need to be modified, of course, but you get the idea)
- we aren't limited to the output predefined up front; we can massage the data afterwards as we see fit
- extensibility is free: if a new kstat is added we can easily get access to its history
- essentially any kstat consumer can be used to analyse the data

It should be relatively easy to write a replacement for sar that could read a kstat archive; you just do the aggregation and massaging of statistics when the data is displayed, rather than when it's collected.

Is this practical? How big is the data? On one of my machines, a thor, a regular sar datapoint is about 64k. The kstat -p output is about a meg, but zip gets that down to 160k, and 7z does significantly better. On a T5140, the sar data is about 20k and kstat about 2 meg, so there the difference is much larger.
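To make the "aggregate at display time" idea a bit more concrete, here is a rough Python sketch, not taken from JKstat or any existing tool. It assumes the saved data is plain kstat -p text, one module:instance:name:statistic key per line followed by a tab and the value. The two snapshots, their numbers, and the 300-second interval are invented for illustration, and the function names parse_kstat_p and rate are hypothetical.

```python
def parse_kstat_p(text):
    """Parse kstat -p style text into {(module, instance, name, statistic): value}.

    Numeric values are converted to int or float; anything else stays a string.
    """
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("\t")
        parts = key.split(":", 3)
        if len(parts) != 4:
            continue  # not a module:instance:name:statistic key, skip it
        value = value.strip()
        try:
            parsed = int(value)
        except ValueError:
            try:
                parsed = float(value)
            except ValueError:
                parsed = value
        stats[tuple(parts)] = parsed
    return stats


def rate(older, newer, key, seconds):
    """Per-second rate of a counter between two saved snapshots."""
    return (newer[key] - older[key]) / seconds


# Two made-up snapshots, assumed to be taken 300 seconds apart.
SNAP_A = (
    "e1000g:0:e1000g0:class\tnet\n"
    "e1000g:0:e1000g0:rbytes64\t1000000\n"
    "e1000g:0:e1000g0:obytes64\t500000\n"
    "unix:0:system_misc:nproc\t80\n"
)
SNAP_B = (
    "e1000g:0:e1000g0:class\tnet\n"
    "e1000g:0:e1000g0:rbytes64\t1600000\n"
    "e1000g:0:e1000g0:obytes64\t800000\n"
    "unix:0:system_misc:nproc\t82\n"
)

if __name__ == "__main__":
    a, b = parse_kstat_p(SNAP_A), parse_kstat_p(SNAP_B)
    for stat in ("rbytes64", "obytes64"):
        key = ("e1000g", "0", "e1000g0", stat)
        print(stat, rate(a, b, key, 300), "bytes/sec")
```

The collector stays dumb here: it only saves raw counters, and the choice of which statistics to difference, and over what interval, is deferred until someone looks at the data. That leaves the storage cost above as the main price to pay.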
Even so, I feel the idea has promise - in the best case the compressed size is already within a small multiple of sar's, without optimizing the saved format or trimming unnecessary data.

Note that while I mentioned something like kstat -p, I'm not necessarily thinking of that as the final format - you would want something more compact, and wouldn't necessarily want to encode numbers as strings. Nor am I necessarily thinking of just dumping the whole kstat tree, although having looked at some systems I think you would end up with most of it anyway, and I would be in favour of simply saving the lot, which would give you a lot more latitude in data mining.

Thoughts, anyone?

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
_______________________________________________
sysadmin-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/sysadmin-discuss
