> -----Original Message-----
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Tuesday, April 09, 2013 9:28 AM
> To: Amit Kapila
> Cc: Greg Smith; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Page replacement algorithm in buffer cache
>
> On Fri, Apr 5, 2013 at 11:08 PM, Amit Kapila <amit.kap...@huawei.com>
> wrote:
> > I still have one more doubt. Consider the below scenario for the
> > cases when we invalidate buffers while moving them to the freelist
> > v/s just moving them to the freelist:
> >
> > A backend got a buffer from the freelist for a request of page-9
> > (number 9 is random, just to explain), but the buffer still has an
> > association with another page-10. The backend needs to add the
> > buffer with the new tag (new page association) to the buffer hash
> > table and remove the buffer with the oldTag (old page association).
> >
> > The benefit of just moving to the freelist is that if we get a
> > request for the same page before somebody else uses the buffer for
> > another page, it will save a read I/O. However, on the other side,
> > in many cases the backend will need an extra partition lock to
> > remove the oldTag (which can lead to some bottleneck).
> >
> > I think saving read I/O is more beneficial, but I am just not sure
> > whether it is the best option, as the cases where it helps might be
> > rare.
>
> I think saving read I/O is a lot more beneficial. I haven't seen
> evidence of a severe bottleneck updating the buffer mapping tables. I
> have seen some evidence of spinlock-level contention on read workloads
> that fit in shared buffers, because in that case the system can run
> fast enough for the spinlocks protecting the lwlocks to get pretty
> hot. But if you're doing writes, or if the workload doesn't fit in
> shared buffers, other bottlenecks slow you down enough that this
> doesn't really seem to become much of an issue.
>
> Also, even if you *can* find some scenario where pushing the buffer
> invalidation into the background is a win, I'm not convinced that
> would justify doing it, because the case where it's a huge loss -
> namely, working set just a tiny bit smaller than shared_buffers - is
> pretty obvious. I don't think we dare fool around with that; the
> townspeople will arrive with pitchforks.
>
> I believe that the big win here is getting the clock sweep out of the
> foreground so that BufFreelistLock doesn't catch fire. The buffer
> mapping locks are partitioned and, while it's not like that completely
> gets rid of the contention, it sure does help a lot. So I would view
> that goal as primary, at least for now. If we get a first round of
> optimization done in this area, that doesn't preclude further
> improving it in the future.
I agree with you that this can be a first step towards improvement.

> > Last time, the following tests were executed to validate the
> > results:
> >
> > Test suite - pgbench
> > DB Size - 16 GB
> > RAM - 24 GB
> > Shared Buffers - 2G, 5G, 7G, 10G
> > Concurrency - 8, 16, 32, 64 clients
> > Pre-warm the buffers before the start of the test
> >
> > Shall we try any other scenarios, or are the above okay for an
> > initial test of the patch?
>
> Seems like a reasonable place to start.

I shall work on this for the first CF of 9.4.

With Regards,
Amit Kapila.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers