On 6/28/13 8:50 AM, Robert Haas wrote:
On Fri, Jun 28, 2013 at 12:52 AM, Amit Kapila <amit.kap...@huawei.com> wrote:
4. Separate processes for writing dirty buffers and moving buffers to
I think this part might be best pushed to a separate patch, although I
agree we probably need it.
This might be necessary eventually, but it's going to make things more
complicated. And I don't think it's a blocker for creating something
useful. The two most common workloads are:
1) Lots of low usage count data, typically data that is updated sparsely
across a larger database. These are helped by a process that writes
dirty buffers in the background, so they benefit from the current
background writer. The system Kevin was just mentioning again is the
best example of this type that there's public data on.
2) Lots of high usage count data, because there are large hotspots in
things like index blocks. Most writes happen at checkpoint time,
because the background writer won't touch them. Because there are only
a small number of re-usable pages, the clock sweep goes around very fast
looking for them. This is the type of workload that should benefit from
putting buffers into the free list. pgbench provides a simple example
of this type, which is why Amit's tests using it have been useful.
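To make the difference between the two cases concrete, here is a minimal sketch of a clock sweep, assuming each buffer carries nothing but a usage count. Pins, dirty flags, and locking are all left out, and the function name and the sample counts are made up for illustration.

```python
# Minimal clock-sweep sketch. Assumption: a buffer is just a usage count;
# pins, dirty flags, and locking are deliberately ignored.

def clock_sweep(usage_counts, hand=0):
    """Return (victim_index, buffers_examined, next_hand).

    Decrements each nonzero usage count as the hand passes over it, and
    stops at the first buffer whose count is already zero.
    """
    n = len(usage_counts)
    examined = 0
    while True:
        examined += 1
        if usage_counts[hand] == 0:
            return hand, examined, (hand + 1) % n
        usage_counts[hand] -= 1
        hand = (hand + 1) % n


# Workload 1: mostly low usage counts, so a victim turns up right away.
cold = [0, 1, 0, 0, 1, 0, 0, 0]
# Workload 2: a hot working set where every buffer has a high usage count.
hot = [5] * 8

cold_victim, cold_examined, _ = clock_sweep(cold)
hot_victim, hot_examined, _ = clock_sweep(hot)
print(cold_examined, hot_examined)  # prints: 1 41
```

In the hot case the hand has to go around the whole pool five times before anything becomes reusable, which is the work a free list would take off the allocating backend.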
If you had a process that tried to handle both background writes and
freelist management, I suspect one path would be hot and the other
almost idle in each type of system. I don't expect that splitting those
into two separate processes would buy a lot of value; that can easily be
pushed to a later patch.
The background writer would just
have a high and a low watermark. When the number of buffers on the
freelist drops below the low watermark, the allocating backend sets
the latch and bgwriter wakes up and begins adding buffers to the
freelist. When the number of buffers on the free list reaches the
high watermark, the background writer goes back to sleep.
This will work fine for all of the common workloads. The main challenge
is keeping the buffer allocation counting from turning into a hotspot.
Busy systems can now easily hit 100K buffer allocations/second. I'm not
too worried about it, because those allocations already make the free
list lock a hotspot today.
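As a rough sketch of the high/low watermark scheme quoted above, the following toy model shows the control flow only. The class and method names are hypothetical, the latch is reduced to a boolean, and what a backend does when the free list is empty is an assumption, not something settled in this thread.

```python
# Toy model of the high/low watermark scheme. All names are hypothetical.

class FreeListSim:
    def __init__(self, low_watermark, high_watermark):
        assert 0 <= low_watermark < high_watermark
        self.low = low_watermark
        self.high = high_watermark
        self.free = high_watermark   # start with a full free list
        self.latch_set = False       # stands in for the bgwriter's latch

    def allocate(self):
        """Backend side: take one buffer, wake the writer if running low."""
        if self.free == 0:
            # Assumption: the backend would have to find a buffer itself.
            return False
        self.free -= 1
        if self.free < self.low:
            self.latch_set = True
        return True

    def bgwriter_cycle(self):
        """Writer side: if woken, refill to the high watermark, then sleep."""
        if not self.latch_set:
            return 0
        added = self.high - self.free
        self.free = self.high
        self.latch_set = False
        return added


sim = FreeListSim(low_watermark=2, high_watermark=5)
for _ in range(3):
    sim.allocate()               # free list: 5 -> 2, still at the low mark
woken_early = sim.latch_set      # False: not yet below the low watermark
sim.allocate()                   # free list: 2 -> 1, latch gets set
refilled = sim.bgwriter_cycle()  # writer adds 4 buffers and goes idle
```

Note that the backend only touches the latch on the allocation that crosses the low watermark or below it, so the common path is a decrement and a comparison.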
One of the consistently controversial parts of the current background
writer is how it tries to loop over the buffer cache every 2 minutes,
regardless of activity level. The idea there was that on bursty
workloads, buffers would be cleaned during idle periods with that
mechanism. Part of why that's in there is to deal with the relatively
long pause between background writer runs.
This refactoring idea will make that hard to keep around. I think this
is OK though. Switching to a latch-based design should eliminate the
bgwriter_delay, which means you won't have this worst case of a 200ms
stall while heavy activity is incoming.
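To put a rough number on that worst case, this back-of-the-envelope calculation combines the 200ms delay with the 100K allocations/second figure mentioned earlier. It assumes the allocation rate stays constant across the sleep, which a bursty workload will not do.

```python
# Back-of-the-envelope: allocations that arrive during one bgwriter sleep.
# Assumption: the allocation rate is steady for the whole interval.
allocations_per_second = 100_000   # "100K buffer allocations/second"
bgwriter_delay_ms = 200            # fixed sleep between background writer runs

allocations_per_sleep = allocations_per_second * bgwriter_delay_ms // 1000
print(allocations_per_sleep)       # prints: 20000
```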
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com