On Mon, Nov 30, 2009 at 5:16 PM, Peter Tribble <peter.trib...@gmail.com> wrote:
> I've been thinking about sar in the light of some experience within my
> own organization and Garrett's EOF of sag.
>
> It's probably fair to say that sar is heavily used in some organizations
> (and less so in others). We collect it anyway, so I've been doing some
> data gathering based on it, which exposes some of its limitations:
>
> 1. No networking. This is an absolute killer. The concept of networking
> seems to have been somewhat, ahem, neglected.
>
> 2. No NFS or fsstat statistics.
>
> 3. Missing metadata. For instance, sar shows me that there was I/O on
> disk nfs12345, but fails to record what that might have referred to.
>
> 4. Generally missing data, and some data is only available in aggregate.

Aside from reliability problems (which may be in the past), these are
all things that influenced me to abandon sar a long time ago.

> One of the things I've been planning to add to JKstat for a long while is
> the ability to read saved kstat -p output. (I'm essentially done with that,
> and it works quite well.) Which got me wondering, and I then started to
> ask myself:
>
> What if we just scrapped the current sar collector (sadc) and just saved
> kstat -p output (or something like it) instead?
>
> The advantages of doing this are:
>
>  - we gain all available statistics - network, nfs, fsstat, but also a whole
> host of others
>
>  - we could rerun the *stat utilities on the output (they would need to be
> modified, of course, but you get the idea)
>
>  - we aren't limited to the output predefined up front, we can massage
> the data afterwards as we see fit
>
>  - extensibility is free, if a new kstat is added we can easily get access
> to its history
>
>  - essentially any kstat consumer can be used to analyse the data
>
> It should be relatively easy to write a replacement for sar that could
> read a kstat archive; you just do the aggregation and massage of
> statistics when the data is displayed, rather than when it's collected.
>
> Is this practical? How big is the data?
>
> On one of my machines, a thor, a regular sar datapoint is about 64k.
> The kstat -p output is about a meg, but zip gets that down to 160k, and
> 7z does significantly better. On a T5140, the sar data is about 20k and
> kstat about 2 meg, so there the difference is much larger. But I feel the
> idea has promise - in the best case we're almost there already, without
> optimizing the saved format or trimming unnecessary data.
>
> Note that while I mentioned something like kstat -p, I'm not necessarily
> thinking of that as the final format - you would want something more
> compact, and wouldn't necessarily want to encode numbers as strings.
> And also, I'm not necessarily thinking of just dumping the whole kstat
> tree, although having looked at some systems I think you would end
> up with most of it anyway, and I would be in favour of simply saving the
> lot which would give you a lot more latitude in datamining.
>
> Thoughts, anyone?

I like the general idea, but I wonder if there are enough
kstat-as-text consumers out there to make that an important target.
How about targeting an SQLite database instead?  The really
nifty things about this approach could be:

- sqlite is already used in many places in OpenSolaris (SMF, firefox, ...)
- accessible to real languages through JDBC or ODBC
- accessible to shell scripts via sqlite command line
(/lib/svc/bin/sqlite today)
- easy to prune old data by deleting old rows
- easy to aggregate old data (prior to deletion) through SQL query
- could open up the doors to analysis of data leveraging the power of
SQL rather than complex application code.
- data already organized in a nice way for importing into an
enterprise performance analysis system.
- extensible beyond kstat data (dtrace aggregation written to DB?)

I'm sure that there are some other nifty things as well, but I'll
leave the identification of those to the reader.  :)
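
To make the idea a bit more concrete, here is a rough sketch in Python
of what loading kstat -p output into SQLite and querying it might look
like. It assumes each kstat -p line has the form
module:instance:name:statistic<TAB>value. The table name kstat_sample,
its columns, and the helper function names are all made up for
illustration, and the rate calculation assumes a counter that only
increases (no wrap or reset) over the stored interval.

```python
import sqlite3

# Hypothetical schema: one row per statistic per collection time.
SCHEMA = """
CREATE TABLE IF NOT EXISTS kstat_sample (
    ts        INTEGER NOT NULL,  -- collection time, seconds since the epoch
    module    TEXT    NOT NULL,
    instance  INTEGER NOT NULL,
    name      TEXT    NOT NULL,
    statistic TEXT    NOT NULL,
    value                        -- integer, real or text, as collected
)
"""


def parse_kstat_line(line):
    """Split one `kstat -p` line into (module, instance, name, statistic, value)."""
    key, _, raw = line.rstrip("\n").partition("\t")
    module, instance, name, statistic = key.split(":", 3)
    raw = raw.strip()
    # Keep numbers as numbers; anything else (e.g. a class name) stays text.
    for convert in (int, float):
        try:
            return module, int(instance), name, statistic, convert(raw)
        except ValueError:
            pass
    return module, int(instance), name, statistic, raw


def init_db(conn):
    conn.execute(SCHEMA)
    conn.commit()


def store_snapshot(conn, ts, lines):
    """Store one collection run (an iterable of `kstat -p` lines) taken at time ts."""
    rows = [(ts,) + parse_kstat_line(line) for line in lines if line.strip()]
    conn.executemany("INSERT INTO kstat_sample VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


def average_rate(conn, module, instance, name, statistic):
    """Average per-second rate of a counter over all stored samples, or None."""
    delta, elapsed = conn.execute(
        "SELECT MAX(value) - MIN(value), MAX(ts) - MIN(ts) "
        "FROM kstat_sample "
        "WHERE module = ? AND instance = ? AND name = ? AND statistic = ?",
        (module, instance, name, statistic),
    ).fetchone()
    if not elapsed:  # no samples, or only a single point in time
        return None
    return delta / elapsed


def prune_before(conn, cutoff_ts):
    """Delete samples older than cutoff_ts and return how many rows went away."""
    cur = conn.execute("DELETE FROM kstat_sample WHERE ts < ?", (cutoff_ts,))
    conn.commit()
    return cur.rowcount
```

A collector would then call store_snapshot() with the output of each
kstat -p run, and a sar-like reporting tool would be mostly SELECT
statements. Storing one row per statistic is the simplest layout I
could think of, not necessarily the most compact one.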

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
sysadmin-discuss mailing list
sysadmin-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/sysadmin-discuss
