On Thu, Feb 3, 2011 at 7:39 PM,  <da...@lang.hm> wrote:
>> Yeah, but you'll be passing the entire table through this separate
>> process that may only need to see 1% of it or less on a large table.
>> If you want to write the code and prove it's better than what we have
>> now, or some other approach that someone else may implement in the
>> meantime, hey, this is an open source project, and I like improvements
>> as much as the next guy.  But my prediction for what it's worth is
>> that the results will suck.  :-)
>
> I will point out that 1% of a very large table can still be a lot of disk
> I/O that is avoided (especially if it's random I/O that's avoided)

Sure, but I think that trying to avoid it will be costly in other ways
- you'll be streaming a huge volume of data through some auxiliary
process, which will have to apply some algorithm that's very different
from the one we use today.  The reality is that I think there's little
evidence that the way we do ANALYZE now is too expensive.  It's
typically very cheap and works very well.  It's a bit annoying when it
fires off in the middle of a giant data load, so we might need to
change the timing of it a little, but if there's a problem with the
operation itself being too costly, this is the first I'm hearing of
it.  We've actually worked *really* hard to make it cheap.
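
For context on why it's cheap: ANALYZE samples a bounded number of rows
(on the order of 300 * default_statistics_target, i.e. roughly 30,000 rows
at the default setting), so its cost barely grows with table size.  If the
only annoyance is an autovacuum-triggered ANALYZE landing in the middle of
a bulk load, a minimal sketch of one workaround is below -- the table name
and file path are hypothetical, and this is just an illustration, not a
recommendation from this thread:

    -- Keep autovacuum off the (hypothetical) table during the load.
    ALTER TABLE big_table SET (autovacuum_enabled = false);

    -- Bulk-load the data (path is a placeholder).
    COPY big_table FROM '/tmp/data.csv' WITH (FORMAT csv);

    -- Analyze once, after the load finishes, then re-enable autovacuum.
    ANALYZE big_table;
    ALTER TABLE big_table SET (autovacuum_enabled = true);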

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

