Re: [sysadmin-discuss] [observability-discuss] Rethinking sar

Erik O'Shaughnessy Tue, 01 Dec 2009 19:37:49 -0800

Peter,

I like where you are going with this!

Most of my thoughts about your proposal are implementation specific and revolve 
around not using 'kstat -p' output.  While useful for prototyping or 
illustrating the idea, there are some limitations inherent in kstat(1M) that 
ought to be addressed.  Foremost is the current lack of support for 64-bit 
integers in the Kstat(3Perl) module, which is why kstat fields like 'snaptime' 
are expressed as a floating-point value instead of an integer.  I haven't been 
following Perl actively for some time, so I don't know whether this is a current 
limitation of Perl itself or just an older one baked into Kstat(3Perl).
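
To illustrate why carrying a 64-bit counter in a floating-point value is a 
concern, here is a minimal Python sketch.  It does not use Kstat(3Perl) at 
all; the helper name survives_double is made up for this example, and it only 
demonstrates the general property that an IEEE 754 double has a 53-bit 
significand.

```python
# Illustration only: an IEEE 754 double has a 53-bit significand, so not
# every 64-bit integer survives a round trip through a float.
# survives_double is a hypothetical helper, not part of any kstat API.

def survives_double(value: int) -> bool:
    """Return True if value is unchanged after conversion to a double."""
    return int(float(value)) == value


exact = 2**53        # still exactly representable as a double
inexact = 2**53 + 1  # first positive integer a double cannot represent

print(survives_double(exact))    # True
print(survives_double(inexact))  # False
```

Any 64-bit nanosecond timestamp or byte counter above 2**53 can lose its low 
bits in the same way.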

I think a successful kstat stream archive format ( kar? ) will need some 
ancillary record types in addition to the kstats, e.g. time base records to help 
give meaning to individual kstat snaptime/crtime values, hardware/software 
version information, and so on.  I can also see arguments for a binary 
recording format versus a human-readable format.  I would encourage XDR as the 
binary storage format and providing a suite of tools to transform the archive 
into whatever format the end-user may desire (XML, SQL ready tables, etc ).  
XDR is mature, portable, reasonably fast and has a plethora of language 
bindings which would support building the tool suite component (and I've 
already written a kstat XDR filter! ). 
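
As a rough idea of what an XDR-encoded kstat record might look like, here is 
a minimal Python sketch.  This is not the filter mentioned above: the record 
layout (module, instance, name, statistic, value) and the function names are 
assumptions made for illustration.  Only the primitive encodings follow the 
XDR rules (big-endian, 4-byte alignment, length-prefixed strings, 8-byte 
unsigned hyper).

```python
import struct


def xdr_string(text: str) -> bytes:
    """XDR string: 4-byte big-endian length, bytes, zero padding to 4 bytes."""
    data = text.encode("ascii")
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def xdr_uhyper(value: int) -> bytes:
    """XDR unsigned hyper: 8-byte big-endian unsigned integer."""
    return struct.pack(">Q", value)


def encode_record(module: str, instance: int, name: str,
                  statistic: str, value: int) -> bytes:
    """Hypothetical record layout; a real archive format may well differ."""
    return (xdr_string(module)
            + struct.pack(">i", instance)   # XDR signed int
            + xdr_string(name)
            + xdr_string(statistic)
            + xdr_uhyper(value))


# Example values only; the counter keeps all 64 bits.
record = encode_record("cpu_stat", 0, "cpu_stat0", "idle", 2**63 + 1)
print(len(record), record.hex())
```

Because the value is stored as an unsigned hyper, a counter larger than 2**53 
comes back exactly when decoded, which a floating-point text form does not 
guarantee.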

Since the data could potentially be generated on one host and consumed on 
another host, the spectre of 'kstat data instability' will raise its ugly head 
when/if this is brought before PSARC.  I don't think that is a reason to not do 
this work, just something to be aware of.   With any luck, the kstat data 
stability problem will be addressed in the near-term and we won't have to worry 
about it :)

The kstat archive daemon ( kad? ) could potentially be a greedy storage 
consumer, so we will need to provide the appropriate knobs to tone down its 
hunger ( sampling frequency, filtering rules ) as well as log rotation.  I can 
also see making some changes to libkstat(3kstat) which could make this service 
more efficient ( asynchronous notification when the kstat chain changes versus 
polling the chain for its status, for instance ).
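
The filtering-rule knob could be sketched roughly as follows in Python.  This 
is an assumption-laden illustration, not a design for the daemon: it assumes 
input lines in the usual 'kstat -p' layout 
(module:instance:name:statistic<TAB>value), the function names are 
hypothetical, and the sample lines and glob patterns are made up.

```python
from fnmatch import fnmatchcase


def parse_line(line: str):
    """Split one 'kstat -p' style line into a 4-part key and a raw value."""
    key, _, value = line.rstrip("\n").partition("\t")
    module, instance, name, statistic = key.split(":", 3)
    return (module, int(instance), name, statistic), value


def keep(key, patterns) -> bool:
    """Return True if the key matches any shell-style glob pattern."""
    joined = ":".join(str(part) for part in key)
    return any(fnmatchcase(joined, pattern) for pattern in patterns)


# Made-up sample input and rules, for illustration only.
sample = [
    "unix:0:system_misc:nproc\t87",
    "cpu_stat:0:cpu_stat0:idle\t123456789",
    "nfs:0:nfs_server:calls\t42",
]
rules = ["unix:*", "cpu_stat:*:*:idle"]

for line in sample:
    key, value = parse_line(line)
    if keep(key, rules):
        print(key, value)
```

Filtering before writing trades storage for completeness, so a rule set like 
this would need to be easy to widen later.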

This suggestion is making my brain buzz.  I like it. 

-ejo

On Monday 30 Nov, at 5:16 PM, Peter Tribble wrote:

> I've been thinking about sar in the light of some experience within my
> own organization and Garrett's EOF of sag.
> 
> It's probably fair to say that sar is heavily used in some organizations
> (and less so in others). We collect it anyway, so I've been doing some
> data gathering based on it, which exposes some of its limitations:
> 
> 1. No networking. This is an absolute killer. The concept of networking
> seems to have been somewhat, ahem, neglected.
> 
> 2. No NFS or fsstat statistics.
> 
> 3. Missing metadata. For instance, sar shows me that there was I/O on
> disk nfs12345, but fails to record what that might have referred to.
> 
> 4. Generally missing data, and some data is only available in aggregate.
> 
> One of the things I've been planning to add to JKstat for a long while is
> the ability to read saved kstat -p output. (I'm essentially done with that,
> and it works quite well.) Which got me wondering, and I then started to
> ask myself:
> 
> What if we just scrapped the current sar collector (sadc) and just saved
> kstat -p output (or something like it) instead?
> 
> The advantages of doing this are:
> 
> - we gain all available statistics - network, nfs, fsstat, but also a whole
> host of others
> 
> - we could rerun the *stat utilities on the output (they would need to be
> modified, of course, but you get the idea)
> 
> - we aren't limited to the output predefined up front, we can massage
> the data afterwards as we see fit
> 
> - extensibility is free, if a new kstat is added we can easily get access
> to its history
> 
> - essentially any kstat consumer can be used to analyse the data
> 
> It should be relatively easy to write a replacement for sar that could
> read a kstat archive; you just do the aggregation and massage of
> statistics when the data is displayed, rather than when it's collected.
> 
> Is this practical? How big is the data?
> 
> On one of my machines, a thor, a regular sar datapoint is about 64k.
> The kstat -p output is about a meg, but zip gets that down to 160k, and
> 7z does significantly better. On a T5140, the sar data is about 20k and
> kstat about 2 meg, so there the difference is much larger. But I feel the
> idea has promise - in the best case we're almost there already, without
> optimizing the saved format or trimming unnecessary data.
> 
> Note that while I mentioned something like kstat -p, I'm not necessarily
> thinking of that as the final format - you would want something more
> compact, and wouldn't necessarily want to encode numbers as strings.
> And also, I'm not necessarily thinking of just dumping the whole kstat
> tree, although having looked at some systems I think you would end
> up with most of it anyway, and I would be in favour of simply saving the
> lot which would give you a lot more latitude in datamining.
> 
> Thoughts, anyone?
> 
> -- 
> -Peter Tribble
> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
> _______________________________________________
> observability-discuss mailing list
> [email protected]

_______________________________________________
sysadmin-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/sysadmin-discuss
