On Sun, Dec 13, 2009 at 5:19 PM, Rainer Heilke <[email protected]> wrote:
> Peter Tribble wrote:
>>
>> On Thu, Dec 10, 2009 at 5:29 AM, Mike Gerdts <[email protected]> wrote:
>>>
>>> Or rrdtool consumes the data instantly, but the raw data is kept
>>> around for a bit.
>>
>> You're heading a little further than I was originally. I was originally
>> only looking at the very bottom layer of the stack - just dumping enough
>> raw data both regularly enough and sufficiently completely that a range
>> of higher-level tools had something to chew on.
>>
>> My experience here is that munging data into rrd is relatively expensive,
>> at least on the scale we're looking at here. I suspect that for rrd
>> collection you would have to identify the subset of statistics of
>> interest, and just keep those. Or are you suggesting we rrd everything?
>> (That won't work for any meaningful definition of everything: just
>> consider the I/O statistics for NFS mounts in an environment with an
>> active automounter.) And if just a subset, can we identify that?
>
> First off, I don't want a subset. I want everything of value we can gather.
> Your comments on load and difficulty are adding to Mike's about what rrd
> does to the data to make me like it less and less.

Everything of value is likely too broad.  This is because much of what
may be of value in the future just looks like junk today.  For
example, is unix:0:DelegStateID_entry_cache:buf_constructed of value?
Will it ever be of value?  How do you know?  Will it even exist after
the next patch?

Everything is likely too big as well:

# ptime kstat -p | wc

real        2.850
user        2.739
sys         0.107
   37296   75904 1296131

That is, on a T2000 with several zones, it took almost three seconds
to get all the values from the system.  It generated 37,296 rows of
data and would occupy 1,296,131 bytes of disk space.  It sure would be
nice to prefix each of those lines with the date and time for dead
simple parsing.  Oops, that would add another 780 KB using ISO 8601
date-time stamps.
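
As a rough check of that arithmetic, here is a small Python sketch.  It
assumes a 20-character stamp of the form 2009-12-13T17:19:00Z plus one
separator character per row.  The function name and the separator are
assumptions made up for illustration, not part of any existing tool.

```python
def timestamp_overhead(rows, stamp_len=20, sep_len=1):
    """Extra bytes needed to prefix every row with a text timestamp.

    stamp_len=20 assumes an ISO 8601 stamp like 2009-12-13T17:19:00Z;
    sep_len=1 assumes a single separator character after the stamp.
    """
    return rows * (stamp_len + sep_len)


rows = 37296         # line count from the wc output above
raw_bytes = 1296131  # byte count from the wc output above

extra = timestamp_overhead(rows)
growth = 100.0 * extra / raw_bytes
print("%d extra bytes per sample, about %.0f%% larger" % (extra, growth))
```

Dropping the separator (sep_len=0) brings the figure down to roughly
746 KB, so the estimate depends a little on the exact line format.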

Remember that in my example with sqlite every row was time stamped to
the second and that time stamp took only 4 bytes, compared to 20 bytes
for ISO 8601.  (We'd have to go to 8 bytes by 2038, assuming Apophis
doesn't do us in first.)
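
To make the size comparison concrete, here is a minimal Python sketch.
It packs a Unix time into a signed 32-bit integer with the struct
module and compares that with the text form.  The helper names are
hypothetical, and this only illustrates the byte counts; it is not a
description of how sqlite lays out integers on disk.

```python
import struct
from datetime import datetime, timezone


def pack_epoch32(ts):
    """Pack seconds since the epoch into 4 bytes (signed, big-endian)."""
    return struct.pack(">i", ts)


def iso8601(ts):
    """Render the same instant as a 20-character ISO 8601 UTC stamp."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


ts = 1260000000  # an arbitrary instant in December 2009
print(len(pack_epoch32(ts)), "bytes packed vs", len(iso8601(ts)), "bytes as text")

# The largest value a signed 32-bit counter can hold falls in January 2038.
last = 2**31 - 1
print("last representable second:", iso8601(last))

# One second later no longer fits in 4 bytes; struct refuses to pack it.
try:
    pack_epoch32(last + 1)
except struct.error:
    print("overflow: an 8-byte field (format '>q') would be needed")
```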

> Secondly, I was only debating the rrd usefulness as an on-disk storage
> format, since the comment was made that text would get overly large and
> expensive to parse.
>
> But yes, we should probably focus first on what to collect and how to get
> it.

Agreed.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
sysadmin-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/sysadmin-discuss
