On 6/28/13 8:50 AM, Robert Haas wrote:
> On Fri, Jun 28, 2013 at 12:52 AM, Amit Kapila <amit.kap...@huawei.com> wrote:
>> 4. Separate processes for writing dirty buffers and moving buffers to
>
> I think this part might be best pushed to a separate patch, although I
> agree we probably need it.

This might be necessary eventually, but it's going to make things more complicated, and I don't think it's a blocker for creating something useful. The two most common workloads are:

1) Lots of low usage count data, typically data that is updated sparsely across a larger database. These are helped by a process that writes dirty buffers in the background, so they benefit from the current background writer. The system Kevin was just mentioning again is the best example of this type with public data available.

2) Lots of high usage count data, because there are large hotspots in things like index blocks. Most writes happen at checkpoint time, because the background writer won't touch them. Because there are only a small number of re-usable pages, the clock sweep goes around very fast looking for them. This is the type of workload that should benefit from putting buffers into the free list. pgbench provides a simple example of this type, which is why Amit's tests using it have been useful.

If you had a process that tried to handle both background writes and freelist management, I suspect one path would be hot and the other almost idle in each type of system. I don't expect that splitting those into two separate processes would buy a lot of value, so that can easily be pushed to a later patch.

> The background writer would just
> have a high and a low watermark.  When the number of buffers on the
> freelist drops below the low watermark, the allocating backend sets
> the latch and bgwriter wakes up and begins adding buffers to the
> freelist.  When the number of buffers on the free list reaches the
> high watermark, the background writer goes back to sleep.

This will work fine for all of the common workloads. The main challenge is keeping the buffer allocation counting from turning into a hotspot. Busy systems can now easily hit 100K buffer allocations/second. I'm not too worried about it, because those allocations are already making the free list lock a hotspot right now.
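
As a rough illustration of the watermark scheme quoted above, here is a minimal single-threaded sketch. Every name in it (FreeListState, freelist_take, freelist_refill, the latch_set flag) is hypothetical and not taken from the PostgreSQL source; a real version would need locking or atomics around the counter and would use the actual latch API rather than a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state; none of these names exist in PostgreSQL. */
typedef struct FreeListState
{
    int  num_free;        /* buffers currently on the free list */
    int  low_watermark;   /* backend wakes the bgwriter below this */
    int  high_watermark;  /* bgwriter goes back to sleep at this */
    bool latch_set;       /* stand-in for setting the bgwriter's latch */
} FreeListState;

/*
 * Backend side: take one buffer from the free list.  Returns false when
 * the list is empty, in which case the caller would fall back to the
 * clock sweep.  Sets the latch once the count drops below the low
 * watermark.
 */
bool
freelist_take(FreeListState *fl)
{
    bool got = false;

    if (fl->num_free > 0)
    {
        fl->num_free--;
        got = true;
    }
    if (fl->num_free < fl->low_watermark)
        fl->latch_set = true;
    return got;
}

/*
 * Bgwriter side: if the latch is set, add buffers until the high
 * watermark is reached, then clear the latch (go back to sleep).
 * Returns the number of buffers added.
 */
int
freelist_refill(FreeListState *fl)
{
    int added = 0;

    assert(fl->low_watermark < fl->high_watermark);
    if (!fl->latch_set)
        return 0;
    while (fl->num_free < fl->high_watermark)
    {
        fl->num_free++;     /* real code would find and clean a victim buffer */
        added++;
    }
    fl->latch_set = false;
    return added;
}
```

The gap between the two watermarks is what keeps the bgwriter from being woken on every allocation: it only runs again after backends have consumed the difference.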

One of the consistently controversial parts of the current background writer is how it tries to loop over the buffer cache every 2 minutes, regardless of activity level. The idea there was that on bursty workloads, buffers would be cleaned during idle periods with that mechanism. Part of why that's in there is to deal with the relatively long pause between background writer runs.

This refactoring idea will make that hard to keep around. I think this is OK though. Switching to a latch-based design should eliminate bgwriter_delay, which means you won't have the worst case of a 200ms stall while heavy activity is incoming.
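
To put rough numbers on the figures above, the following back-of-the-envelope sketch computes how many buffers each background writer round has to cover to get through the whole pool in 2 minutes, and how many allocations can arrive during a single 200ms sleep at 100K allocations/second. The function names are made up for illustration, and the pool size used in the note below (131072 buffers, i.e. 1GB of 8kB pages) is an assumption, not a number from this thread.

```c
#include <assert.h>

/*
 * Buffers that must be examined per round so that the whole pool is
 * covered once every scan_whole_pool_ms, given one round per delay.
 * Hypothetical helper, integer arithmetic only.
 */
int
min_scan_buffers(int nbuffers, int scan_whole_pool_ms, int bgwriter_delay_ms)
{
    assert(bgwriter_delay_ms > 0);
    assert(scan_whole_pool_ms >= bgwriter_delay_ms);
    return nbuffers / (scan_whole_pool_ms / bgwriter_delay_ms);
}

/*
 * Buffer allocations that arrive while the background writer sleeps
 * for one full delay interval.
 */
long
allocs_during_delay(long allocs_per_sec, int delay_ms)
{
    return allocs_per_sec * delay_ms / 1000;
}
```

With the assumed 131072-buffer pool, min_scan_buffers(131072, 120000, 200) comes out to 218 buffers per round, and allocs_during_delay(100000, 200) is 20000 allocations that can pile up before a fixed-delay background writer next wakes.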

Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
