On 03.01.2012 17:56, Simon Riggs wrote:
On Tue, Jan 3, 2012 at 3:18 PM, Robert Haas <robertmh...@gmail.com> wrote:

2. When a backend can't find a free buffer, it spins for a long time
while holding the lock. This makes the buffer strategy O(N) in its
worst case, which slows everything down. Notably, while this is
happening the bgwriter sits doing nothing at all, right at the moment
when it is most needed. The Clock algorithm is an approximation of an
LRU, so is already suboptimal as a "perfect cache". Tweaking to avoid
worst case behaviour makes sense. How much to tweak? Well,...

I generally agree with this analysis, but I don't think the proposed
patch is going to solve the problem.  It may have some merit as a way
of limiting the worst case behavior.  For example, if every shared
buffer has a reference count of 5, the first buffer allocation that
misses is going to have a lot of work to do before it can actually
come up with a victim.  But I don't think it's going to provide good
scaling in general.  Even if the background writer only spins through,
on average, ten or fifteen buffers before finding one to evict, that
still means we're acquiring ten or fifteen spinlocks while holding
BufFreelistLock. I don't currently have the measurements to prove
that's too expensive, but I bet it is.
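
To make the worst case above concrete, here is a minimal sketch of a clock sweep in Python. It is a toy model, not PostgreSQL's actual freelist code: the function name clock_sweep and the plain list of counts are hypothetical, pinned buffers and all locking are ignored, and the "reference count of 5" from the text is modelled as a per-buffer usage count that the sweep decrements on each visit.

```python
def clock_sweep(usage_counts, hand=0):
    """Toy clock sweep (illustrative only, not PostgreSQL code).

    usage_counts: list of per-buffer usage counts, modified in place.
    hand: index where the sweep starts.
    Returns (victim_index, buffers_inspected).
    """
    if not usage_counts:
        raise ValueError("need at least one buffer")
    n = len(usage_counts)
    inspected = 0
    while True:
        inspected += 1
        if usage_counts[hand] == 0:
            # A buffer with a count of zero is the eviction victim.
            return hand, inspected
        # Otherwise decrement its count and advance the clock hand.
        usage_counts[hand] -= 1
        hand = (hand + 1) % n


# Worst case from the discussion: every buffer starts with a count of 5.
n_buffers = 1000
victim, inspected = clock_sweep([5] * n_buffers)
print(victim, inspected)
```

In this model the first allocation that misses has to visit every buffer five times before any count reaches zero, so it inspects 5 * N + 1 buffers. In the real system each of those visits would mean taking a buffer-header spinlock while BufFreelistLock is held, which is the cost being argued about here.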

I think it's worth reducing the cost of scanning, but that has little
to do with solving the O(N) problem. I think we need both.

I've left the way open for you to redesign freelist management in many
ways. Please take the opportunity and go for it, though we must
realise that major overhauls require significantly more testing to
prove their worth. Reducing spinlocking only sounds like a good way to
proceed for this release.

If you don't have time in 9.2, then these two small patches are worth
having. The bgwriter locking patch needs less formal evidence to show
its worth. We simply don't need to have a bgwriter that just sits
waiting doing nothing.

I'd like to see some benchmarks that show a benefit from these patches before committing something like this that complicates the code. The patches are fairly small, but the point stands nevertheless. Once we have a test case, we can argue whether the benefit we're seeing is worth the extra code, or if there's some better way to achieve it.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
