Peter, I like where you are going with this!
Most of my thoughts about your proposal are implementation-specific and revolve around not using 'kstat -p' output. While it is useful for prototyping or illustrating the idea, there are some limitations inherent in kstat(1M) that ought to be addressed. Foremost is the current lack of support for 64-bit integers in the Kstat(3Perl) module, which is why kstat fields like 'snaptime' are expressed as floating-point values instead of integers. I haven't been following Perl actively for some time, so I don't know if this is still a limitation of Perl itself or just an older one baked into Kstat(3Perl).

I think a successful kstat stream archive format (kar?) will need some ancillary record types in addition to the kstats, e.g. time base records to help give meaning to individual kstat snaptime/crtime values, hardware/software version information, and so on.

I can also see arguments for a binary recording format versus a human-readable format. I would encourage using XDR as the binary storage format and providing a suite of tools to transform the archive into whatever format the end user may desire (XML, SQL-ready tables, etc.). XDR is mature, portable, reasonably fast, and has a plethora of language bindings, which would support building the tool suite component (and I've already written a kstat XDR filter!).

Since the data could potentially be generated on one host and consumed on another, the spectre of 'kstat data instability' will raise its ugly head when/if this is brought before PSARC. I don't think that is a reason not to do this work, just something to be aware of. With any luck, the kstat data stability problem will be addressed in the near term and we won't have to worry about it :)

The kstat archive daemon (kad?) could potentially be a greedy storage consumer, so we will need to provide the appropriate knobs to tone down its hunger (sampling frequency, filtering rules) as well as log rotation.
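To make the XDR suggestion a little more concrete, here is a rough sketch in Python of what two record types might look like on the wire. This is not the XDR filter mentioned above, and none of it is an existing format: the record tags (REC_TIMEBASE, REC_KSTAT_U64), the field order, and the function names are all made up for illustration. It hand-packs the XDR primitives (big-endian 4-byte ints, 8-byte hypers, length-prefixed strings padded to a multiple of four bytes) with the standard struct module, so that 64-bit values such as snaptime stay integers end to end.

```python
import struct

# Hypothetical record type tags; not part of any existing archive format.
REC_TIMEBASE = 1
REC_KSTAT_U64 = 2


def xdr_int(n):
    """XDR signed int: 4 bytes, big-endian."""
    return struct.pack(">i", n)


def xdr_uhyper(n):
    """XDR unsigned hyper: 8 bytes, big-endian, so 64-bit values stay exact."""
    return struct.pack(">Q", n)


def xdr_string(s):
    """XDR string: 4-byte length, the bytes, zero padding to a multiple of 4."""
    data = s.encode("utf-8")
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_timebase(wall_ns, hrtime_ns):
    """Assumed time base record: wall-clock time paired with the hrtime clock."""
    return xdr_int(REC_TIMEBASE) + xdr_uhyper(wall_ns) + xdr_uhyper(hrtime_ns)


def encode_kstat_u64(module, instance, name, stat, snaptime_ns, value):
    """Assumed layout for a single named 64-bit statistic."""
    return (xdr_int(REC_KSTAT_U64) + xdr_string(module) + xdr_int(instance)
            + xdr_string(name) + xdr_string(stat)
            + xdr_uhyper(snaptime_ns) + xdr_uhyper(value))


def decode_kstat_u64(buf):
    """Inverse of encode_kstat_u64; returns a tuple of the six fields."""
    pos = 0

    def take(fmt):
        nonlocal pos
        (v,) = struct.unpack_from(fmt, buf, pos)
        pos += struct.calcsize(fmt)
        return v

    def take_string():
        nonlocal pos
        n = take(">I")
        s = buf[pos:pos + n].decode("utf-8")
        pos += n + (4 - n % 4) % 4  # skip the padding as well
        return s

    if take(">i") != REC_KSTAT_U64:
        raise ValueError("not a REC_KSTAT_U64 record")
    module = take_string()
    instance = take(">i")
    name = take_string()
    stat = take_string()
    snaptime_ns = take(">Q")
    value = take(">Q")
    return (module, instance, name, stat, snaptime_ns, value)


if __name__ == "__main__":
    big = 2**63 + 1  # too large to be represented exactly as a double
    rec = encode_kstat_u64("cpu_stat", 0, "cpu_stat0", "idle",
                           123456789012345678, big)
    print(decode_kstat_u64(rec)[-1] == big)  # integer path is lossless
    print(int(float(big)) == big)            # floating-point path is not
```

A real format would also need record framing, the other kstat data types, and some kind of versioning; the point here is only that the hyper type carries 64-bit counters and timestamps without the precision loss you get from a floating-point representation.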
I can also see making some changes to libkstat(3kstat) which could make this service more efficient (asynchronous notification when the kstat chain changes versus polling the chain for its status, for instance).

This suggestion is making my brain buzz. I like it.

-ejo

On Monday 30 Nov, at 5:16 PM, Peter Tribble wrote:

> I've been thinking about sar in the light of some experience within my
> own organization and Garrett's EOF of sag.
>
> It's probably fair to say that sar is heavily used in some organizations
> (and less so in others). We collect it anyway, so I've been doing some
> data gathering based on it, which exposes some of its limitations:
>
> 1. No networking. This is an absolute killer. The concept of networking
>    seems to have been somewhat, ahem, neglected.
>
> 2. No NFS or fsstat statistics.
>
> 3. Missing metadata. For instance, sar shows me that there was I/O on
>    disk nfs12345, but fails to record what that might have referred to.
>
> 4. Generally missing data, and some data is only available in aggregate.
>
> One of the things I've been planning to add to JKstat for a long while is
> the ability to read saved kstat -p output. (I'm essentially done with that,
> and it works quite well.) Which got me wondering, and I then started to
> ask myself:
>
> What if we just scrapped the current sar collector (sadc) and just saved
> kstat -p output (or something like it) instead?
>
> The advantages of doing this are:
>
> - we gain all available statistics - network, nfs, fsstat, but also a whole
>   host of others
>
> - we could rerun the *stat utilities on the output (they would need to be
>   modified, of course, but you get the idea)
>
> - we aren't limited to the output predefined up front, we can massage
>   the data afterwards as we see fit
>
> - extensibility is free, if a new kstat is added we can easily get access
>   to its history
>
> - essentially any kstat consumer can be used to analyse the data
>
> It should be relatively easy to write a replacement for sar that could
> read a kstat archive; you just do the aggregation and massage of
> statistics when the data is displayed, rather than when it's collected.
>
> Is this practical? How big is the data?
>
> On one of my machines, a thor, a regular sar datapoint is about 64k.
> The kstat -p output is about a meg, but zip gets that down to 160k, and
> 7z does significantly better. On a T5140, the sar data is about 20k and
> kstat about 2 meg, so there the difference is much larger. But I feel the
> idea has promise - in the best case we're almost there already, without
> optimizing the saved format or trimming unnecessary data.
>
> Note that while I mentioned something like kstat -p, I'm not necessarily
> thinking of that as the final format - you would want something more
> compact, and wouldn't necessarily want to encode numbers as strings.
> And also, I'm not necessarily thinking of just dumping the whole kstat
> tree, although having looked at some systems I think you would end
> up with most of it anyway, and I would be in favour of simply saving the
> lot which would give you a lot more latitude in datamining.
>
> Thoughts, anyone?
>
> --
> -Peter Tribble
> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
> _______________________________________________
> observability-discuss mailing list
> [email protected]
_______________________________________________
sysadmin-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/sysadmin-discuss
