On Wed, 2007-09-05 at 23:31 -0400, Greg Smith wrote:
> Tom gets credit for naming the attached patch, which is my latest attempt to
> finalize what has been called the "Automatic adjustment of
> bgwriter_lru_maxpages" patch for 8.3; that's not what it does anymore but
> that's where it started.
This is a big undertaking, so well done for going for it.
> I decided to use pgbench for running my tests. The scripting framework to
> collect all that data and usefully summarize it is now available as
> pgbench-tools-0.2 at
For me, the main role of the bgwriter is to avoid dirty writes in
backends. The purpose of doing that is to improve the response time
distribution as perceived by users. I think that is what we should be
measuring, perhaps in a simple way such as calculating the 90th
percentile of the response time distribution. It is notoriously
difficult to draw any meaning from the "headline numbers" in test
results, especially tps.
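
To make that measurement concrete, here is a minimal sketch in Python of a
nearest-rank percentile over per-transaction response times. The function
name and the sample data are illustrative assumptions, not part of
pgbench or pgbench-tools; the latencies are assumed to have been
extracted already as a plain list of milliseconds.

```python
import math

def percentile(latencies_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are at or below it.
    (Hypothetical helper, not a pgbench-tools function.)"""
    if not latencies_ms:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(latencies_ms)
    # Rank is 1-based; pct * n is an exact integer product for integer pct.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]

# Made-up sample: most transactions are fast, 15% stall on a dirty write.
samples = [5] * 85 + [400] * 15
mean_ms = sum(samples) / len(samples)
p90_ms = percentile(samples, 90)
```

On data shaped like this the mean sits well below the 90th percentile, so
the percentile shows the stalls users actually experience while an
average (or tps) smooths them away.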
Looking at the tps also tempts us to run a test which maxes out the
server, an area where we already know and expect the bgwriter to be
unhelpful.
If I run a server at or below 70% capacity, what settings of the
bgwriter help maintain my response time distribution?
> Coping with idle periods
> While I was basically happy with these results, the data Kevin Grittner
> submitted in response to my last call for commentary left me concerned. While
> the JIT approach works fine as long as your system is active, it does
> absolutely nothing if the system is idle. I noticed that a lot of the writes
> that were being done by the client backends were after idle periods where the
> JIT writer just didn't react fast enough during the ramp-up. For example, if
> the system went from idle for a while to full-speed just as the 200ms sleep
> started, by the time the BGW woke up again the backends could have needed to
> write many buffers already themselves.
You've hit the nail on the head there. I can't see how you can do
anything sensible when the bgwriter keeps going to sleep for long
periods.
The bgwriter's activity curve should ideally be the same shape as a
critically damped harmonic oscillator. It should wake up, do lots of
writing if needed, then trail off over time. The only way to do that
seems to be to vary the sleep automatically, or make short sleeps.
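
As a rough illustration of that shape, the sketch below evaluates the
standard critically damped response x(t) = x0 * (1 + w*t) * exp(-w*t),
which starts at its peak and decays without oscillating. Treating x as
a per-round write quota is an assumption made for illustration only;
the function and parameter names are hypothetical and nothing like this
exists in the patch.

```python
import math

def damped_write_quota(t_ms, peak_pages, omega_per_ms):
    """Critically damped response starting at peak_pages with zero
    initial slope: x(t) = x0 * (1 + w*t) * exp(-w*t).
    (Hypothetical model of bgwriter activity after a wake-up.)"""
    if t_ms < 0 or omega_per_ms <= 0:
        raise ValueError("t_ms must be >= 0 and omega_per_ms > 0")
    wt = omega_per_ms * t_ms
    return peak_pages * (1.0 + wt) * math.exp(-wt)

# Quota sampled every 10ms after a wake-up; values only ever decrease.
curve = [damped_write_quota(t, 100.0, 0.05) for t in range(0, 200, 10)]
```

The useful property is that the curve never overshoots or rings: effort
is highest right after the wake-up and trails off smoothly.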
For me, the bgwriter should sleep for at most 10ms at a time. If it has
nothing to do it can go straight back to sleep again. Trying to set that
time is fairly difficult, so it would be better not to have to set it at
all.
If you've changed bgwriter so it doesn't scan if no blocks have been
allocated, I don't see any reason to keep the _delay parameter at all.
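
One way the sleep could vary automatically, sketched in Python under
stated assumptions: the 10ms ceiling comes from the text above, while
the 1ms floor, the doubling back-off, and all names are hypothetical
choices for illustration rather than anything in the patch.

```python
MAX_SLEEP_MS = 10.0  # ceiling suggested above
MIN_SLEEP_MS = 1.0   # assumed floor, not from the patch

def next_sleep_ms(current_sleep_ms, buffers_allocated):
    """Pick the next nap length for a hypothetical bgwriter loop.
    Any allocation since the last round snaps back to the shortest
    nap; idle rounds double the nap up to the ceiling."""
    if buffers_allocated > 0:
        return MIN_SLEEP_MS
    return min(current_sleep_ms * 2.0, MAX_SLEEP_MS)
```

With a rule like this there is no delay setting to tune: an idle system
drifts to the ceiling, and the first allocation after an idle period
brings the writer back within one short nap.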
> I think I can safely say there is a level of intelligence going into what the
> LRU background writer does with this patch that has never been applied to
> this problem before. There have been a lot of good ideas thrown out in this area,
> but it took a hybrid approach that included and carefully balanced all of
> them to actually get results that I felt were usable. What I don't know is whether
> that will also be true for other testers.
I get the feeling that what we have here is better than what we had
before, but I guess I'm a bit disappointed we still have 3 magic
parameters, or 5 if you count your hard-coded ones also.
There's still no formal way to tune these. As long as we have *any*
magic parameters, we need a way to tune them in the field, or they are
useless. At very least we need a plan for how people will report results
during Beta. That means we need a log_bgwriter (better name, please...)
parameter that provides information to assist with tuning. At the very
least we need this to be present during Beta, if not beyond.