On Fri, Aug 12, 2011 at 4:36 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On Fri, Aug 12, 2011 at 5:05 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
>> On the other hand, the buffer manager has *no problem at all* trashing
>> the buffer arena if we're faulting in pages for an index scan rather
>> than a sequential scan. If you manage to get all of sample_data into
>> memory (by running many copies of the above query in parallel, you can
>> get each one to allocate its own ring buffer, and eventually pull in
>> all the pages), and then run some query that probes an index which is
>> too large to fit in shared_buffers, it cheerfully blows the whole
>> sample_data table out without a second thought. Had you sequentially
>> scanned a big table, of course, there would be some protection, but an
>> index scan can stomp all over everything with complete impunity.
>
> That's a good observation and I think we should do this
>
> * Make an IndexScan use a ring buffer once it has used 32 blocks. The
> vast majority won't do that, so we avoid overhead on the common path.
>
> * Make a BitmapIndexScan use a ring buffer when we know that the
> index is larger than 32 blocks. (Ignore upper parts of tree for that
> calc).
We'd need to think about what happens to the internal pages of the btree, the leaf pages, and then the heap pages from the underlying relation; those probably shouldn't all be treated the same.

Also, I think the tricky part is figuring out when to apply the optimization in the first place. Once we decide we need a ring buffer, a very small one (like 32 blocks) is probably the way to go. But it will be a loser to apply the optimization to data sets that would otherwise have fit in shared_buffers.

This is a classic case of the LRU/MRU problem. You want to evict buffers in LRU fashion until the working set gets larger than you can cache; and then you want to switch to MRU to avoid uselessly caching pages that you'll never manage to revisit before they're evicted. The point of algorithms like Clock-Pro is to try to have the system work that out on the fly, based on the actual workload, rather than using heuristics.

I agree with you there's no convincing evidence that Clock-Pro would be better for us; I mostly thought it was interesting because it seems that the NetBSD and Linux guys find it interesting, and they're managing much larger caches than the ones we're dealing with.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers