> -----Original Message-----
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Tuesday, April 09, 2013 9:28 AM
> To: Amit Kapila
> Cc: Greg Smith; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Page replacement algorithm in buffer cache
>
> On Fri, Apr 5, 2013 at 11:08 PM, Amit Kapila <amit.kap...@huawei.com>
> wrote:
> > I still have one more doubt. Consider the below scenario for the
> > cases when we invalidate buffers while moving them to the freelist
> > v/s just moving them to the freelist:
> >
> > A backend got a buffer from the freelist for a request of page-9
> > (number 9 is random, just to explain), but the buffer still has an
> > association with another page-10. The backend needs to add the
> > buffer with the new tag (new page association) to the buffer hash
> > table and remove the buffer with the oldTag (old page association).
> >
> > The benefit of just moving to the freelist is that if we get a
> > request for the same page before somebody else uses the buffer for
> > another page, it will save a read I/O. However, on the other side,
> > in many cases the backend will need an extra partition lock to
> > remove the oldTag (which can lead to some bottleneck).
> >
> > I think saving read I/O is more beneficial, but I am just not sure
> > whether it is the best option, as the cases where it helps might be
> > rare.
>
> I think saving read I/O is a lot more beneficial. I haven't seen
> evidence of a severe bottleneck updating the buffer mapping tables. I
> have seen some evidence of spinlock-level contention on read workloads
> that fit in shared buffers, because in that case the system can run
> fast enough for the spinlocks protecting the lwlocks to get pretty
> hot. But if you're doing writes, or if the workload doesn't fit in
> shared buffers, other bottlenecks slow you down enough that this
> doesn't really seem to become much of an issue.
>
> Also, even if you *can* find some scenario where pushing the buffer
> invalidation into the background is a win, I'm not convinced that
> would justify doing it, because the case where it's a huge loss -
> namely, working set just a tiny bit smaller than shared_buffers - is
> pretty obvious. I don't think we dare fool around with that; the
> townspeople will arrive with pitchforks.
>
> I believe that the big win here is getting the clock sweep out of the
> foreground so that BufFreelistLock doesn't catch fire. The buffer
> mapping locks are partitioned and, while it's not like that completely
> gets rid of the contention, it sure does help a lot. So I would view
> that goal as primary, at least for now. If we get a first round of
> optimization done in this area, that doesn't preclude further
> improving it in the future.
I agree with you that this can be a first step towards improvement.

> > Last time, the following tests were executed to validate the
> > results:
> >
> > Test suite - pgbench
> > DB Size - 16 GB
> > RAM - 24 GB
> > Shared Buffers - 2G, 5G, 7G, 10G
> > Concurrency - 8, 16, 32, 64 clients
> > Pre-warm the buffers before the start of the test
> >
> > Shall we try any other scenarios, or are the above okay for an
> > initial test of the patch?
>
> Seems like a reasonable place to start.

I shall work on this for the first CF of 9.4.

With Regards,
Amit Kapila.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers