Re: [sysadmin-discuss] [observability-discuss] Rethinking sar

Peter Tribble Sat, 05 Dec 2009 11:14:34 -0800

On Tue, Dec 1, 2009 at 8:42 PM, Erik O'Shaughnessy
<[email protected]> wrote:
> Peter,
>
> I like where you are going with this!
>
> Most of my thoughts about your proposal are implementation specific and 
> revolve around not using 'kstat -p' output.  While useful for prototyping or 
> illustrating the idea, there are some limitations inherent in kstat(1M) that 
> ought to be addressed.  Foremost is the current lack of support for 64-bit 
> integers in the Kstat(3Perl) module, which is why kstat fields like 
> 'snaptime' are expressed as a floating-point value instead of an integer.   I 
> haven't been following perl actively for some time, so I don't know if this 
> is a current limitation or just an older one baked into Kstat(3Perl).


I was only thinking of kstat -p output as prototype/illustration. Although
being able to munge the output with the normal awk/sed/grep/perl and
chuck it straight into one's plotting package of choice does have some
appeal!
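
As a rough illustration of that sort of munging, here is a minimal sketch in
Python rather than awk or perl. It assumes the usual parseable layout of one
"module:instance:name:statistic<TAB>value" pair per line. The function name
parse_kstat_p is made up for this example, and the type guessing (int, then
float, then string) is only a heuristic.

```python
from typing import Dict, Iterable, Tuple, Union

Key = Tuple[str, int, str, str]      # (module, instance, name, statistic)
Value = Union[int, float, str]


def parse_kstat_p(lines: Iterable[str]) -> Dict[Key, Value]:
    """Parse 'module:instance:name:statistic<TAB>value' lines into a dict.

    Hypothetical helper: malformed lines are skipped silently.
    """
    stats: Dict[Key, Value] = {}
    for line in lines:
        line = line.rstrip("\n")
        key, sep, raw = line.partition("\t")
        if not sep:
            continue                  # no tab, so not a key/value line
        parts = key.split(":", 3)     # only the first three colons delimit
        if len(parts) != 4 or not parts[1].isdigit():
            continue
        module, instance, name, statistic = parts
        value: Value
        try:
            value = int(raw)          # Python ints are arbitrary precision
        except ValueError:
            try:
                value = float(raw)    # e.g. snaptime-style values
            except ValueError:
                value = raw           # keep anything else as a string
        stats[(module, int(instance), name, statistic)] = value
    return stats
```

Since Python integers are not limited to 64 bits, large unsigned counters
survive the round trip through text without any special handling.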

(Note that jkstat has some of the same data restrictions - while it is
possible to deal with unsigned 64-bit data in java, it's a hack I've so
far left brushed under the carpet.)

> I think a successful kstat stream archive format ( kar? )

Dang! I've already appropriated kar.

> will need some ancillary record types in addition to the kstats; eg time base 
> records to help give meaning to individual kstat snaptime/crtime  values, 
> hardware/software version information, and so on.  I can also see arguments 
> for a binary recording format versus a human-readable format.  I would 
> encourage XDR as the binary storage format and providing a suite of tools to 
> transform the archive into whatever format the end-user may desire (XML, SQL 
> ready tables, etc ).  XDR is mature, portable, reasonably fast and has a 
> plethora of language bindings which would support building the tool suite 
> component (and I've already written a kstat XDR filter! ).

Care to share? I don't have any XDR code of my own to even start with.

Although, if I were to do something along those lines, as opposed to a
straight binary dump or text-readable kstat -p output, I would probably
look very hard at Apache Avro.

(I assume you're thinking about allowing extension to an over the wire
rpc protocol. Or perhaps the other way.)

> Since the data could potentially be generated on one host and consumed on 
> another host, the spectre of 'kstat data instability' will raise its ugly 
> head when/if this is brought before PSARC.  I don't think that is a reason to 
> not do this work, just something to be aware of.   With any luck, the kstat 
> data stability problem will be addressed in the near-term and we won't have 
> to worry about it :)

I don't see any issues here. I'm not suggesting anything more than
many of us do with kstat data at present. (And, to be honest, I'm not
sure that an answer to the kstat stability problem can simultaneously
be useful and pass ARC muster.)

> The kstat archive daemon ( kad? ) could potentially be a greedy storage 
> consumer, so we will need to provide the appropriate knobs to tone down its 
> hunger ( sampling frequency, filtering rules ) as well as log rotation.

I was thinking more of using cron for some of that.

Keeping the storage under control is clearly going to need quite a bit
of thought. One of my aims is to have very much more data to chew on,
so storage would naturally be expected to increase.

There are some options, a few of which might be:

 - Only save the processed data that sar uses. Compatible, but useless.
 - Only save the data that's actually used by sar, but in raw form. This
gains you essentially nothing, as sar actually reads most of the kstats
anyway.
 - Compress the data up the wazoo.
 - Compress in the time domain, so you don't keep saving the kstats
that don't change. (A quick test on my desktop - only 10% or so of
the statistics actually change over an hour.)
 - Process old data (like rrd) so that time resolution of older data
is decreased.
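
To make the time-domain option a bit more concrete, here is a minimal sketch
in Python. It assumes each sample is a plain dictionary of statistic name to
value. The names delta_encode and delta_decode are invented for the example,
and per-kstat timestamps (which change on every sample) are assumed to be
stored separately. The first record carries everything; later records carry
only what changed.

```python
from typing import Any, Dict, List

Snapshot = Dict[str, Any]
Record = Dict[str, Any]   # {"changed": {...}, "removed": [...]}


def delta_encode(snapshots: List[Snapshot]) -> List[Record]:
    """Store each snapshot as the differences from the previous one."""
    records: List[Record] = []
    prev: Snapshot = {}
    for snap in snapshots:
        # New or modified statistics since the previous snapshot.
        changed = {k: v for k, v in snap.items()
                   if k not in prev or prev[k] != v}
        # Statistics that disappeared (e.g. a device went away).
        removed = [k for k in prev if k not in snap]
        records.append({"changed": changed, "removed": removed})
        prev = snap
    return records


def delta_decode(records: List[Record]) -> List[Snapshot]:
    """Rebuild the full snapshots by replaying the differences in order."""
    snapshots: List[Snapshot] = []
    current: Snapshot = {}
    for rec in records:
        current = dict(current)          # copy so earlier results stay intact
        current.update(rec["changed"])
        for k in rec["removed"]:
            del current[k]
        snapshots.append(current)
    return snapshots
```

If only a small fraction of the statistics change between samples, every
record after the first should be correspondingly small, and the result can
still be fed through a general-purpose compressor afterwards.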

However, I think that if this turns out to be useful, then it will be seen
to be much less of an issue. And I can see users wanting to increase
the sampling rate to get at more detail.
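
One way to reconcile a higher sampling rate with bounded storage is the
rrd-like option from the list above. The following is only a sketch, in
Python, and it is not what rrd itself does: rrd consolidates values (by
averaging, for example), whereas this simply thins old samples, keeping the
last raw value in each time bucket. For cumulative counters that still allows
an average rate over the bucket to be computed later. The function name and
parameters are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, int]   # (timestamp in seconds, raw counter value)


def thin_old_samples(samples: List[Sample], now: float,
                     keep_full_for: float, bucket: float) -> List[Sample]:
    """Keep every recent sample; keep one sample per bucket for older data.

    Assumes samples are sorted by ascending timestamp.
    """
    cutoff = now - keep_full_for
    kept: List[Sample] = []
    last_bucket = None
    for ts, value in samples:
        if ts >= cutoff:
            kept.append((ts, value))     # recent data: full resolution
            continue
        b = int(ts // bucket)
        if b == last_bucket:
            kept[-1] = (ts, value)       # later sample replaces earlier one
        else:
            kept.append((ts, value))     # first sample seen in this bucket
            last_bucket = b
    return kept
```

Run periodically (from cron, say) over an archive, something along these
lines would keep recent data at full resolution while older data shrinks to
one point per bucket.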

>  I can also see making some changes to libkstat(3kstat) which could make this 
> service more efficient ( asynchronous notification when the kstat chain 
> changes versus polling the chain for its status, for instance ).
>
> This suggestion is making my brain buzz.  I like it.

Thanks!

> On Monday 30 Nov, at 5:16 PM, Peter Tribble wrote:
>
>> I've been thinking about sar in the light of some experience within my
>> own organization and Garrett's EOF of sag.
>>
>> It's probably fair to say that sar is heavily used in some organizations
>> (and less so in others). We collect it anyway, so I've been doing some
>> data gathering based on it, which exposes some of its limitations:
>>
>> 1. No networking. This is an absolute killer. The concept of networking
>> seems to have been somewhat, ahem, neglected.
>>
>> 2. No NFS or fsstat statistics.
>>
>> 3. Missing metadata. For instance, sar shows me that there was I/O on
>> disk nfs12345, but fails to record what that might have referred to.
>>
>> 4. Generally missing data, and some data is only available in aggregate.
>>
>> One of the things I've been planning to add to JKstat for a long while is
>> the ability to read saved kstat -p output. (I'm essentially done with that,
>> and it works quite well.) Which got me wondering, and I then started to
>> ask myself:
>>
>> What if we just scrapped the current sar collector (sadc) and just saved
>> kstat -p output (or something like it) instead?
>>
>> The advantages of doing this are:
>>
>> - we gain all available statistics - network, nfs, fsstat, but also a whole
>> host of others
>>
>> - we could rerun the *stat utilities on the output (they would need to be
>> modified, of course, but you get the idea)
>>
>> - we aren't limited to the output predefined up front, we can massage
>> the data afterwards as we see fit
>>
>> - extensibility is free, if a new kstat is added we can easily get access
>> to its history
>>
>> - essentially any kstat consumer can be used to analyse the data
>>
>> It should be relatively easy to write a replacement for sar that could
>> read a kstat archive; you just do the aggregation and massage of
>> statistics when the data is displayed, rather than when it's collected.
>>
>> Is this practical? How big is the data?
>>
>> On one of my machines, a thor, a regular sar datapoint is about 64k.
>> The kstat -p output is about a meg, but zip gets that down to 160k, and
>> 7z does significantly better. On a T5140, the sar data is about 20k and
>> kstat about 2 meg, so there the difference is much larger. But I feel the
>> idea has promise - in the best case we're almost there already, without
>> optimizing the saved format or trimming unnecessary data.
>>
>> Note that while I mentioned something like kstat -p, I'm not necessarily
>> thinking of that as the final format - you would want something more
>> compact, and wouldn't necessarily want to encode numbers as strings.
>> And also, I'm not necessarily thinking of just dumping the whole kstat
>> tree, although having looked at some systems I think you would end
>> up with most of it anyway, and I would be in favour of simply saving the
>> lot which would give you a lot more latitude in datamining.
>>
>> Thoughts, anyone?
>>
>> --
>> -Peter Tribble
>> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>> _______________________________________________
>> observability-discuss mailing list
>> [email protected]
>
>



-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
_______________________________________________
sysadmin-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/sysadmin-discuss
