On Thu, 23 Aug 2007, Tom Lane wrote:
> It is doubtless true in a lightly loaded system, but once the kernel is
> under any kind of memory pressure I think it's completely wrong.
The fact that so many tests I've done or seen get maximum throughput in
terms of straight TPS with the background writer turned completely off is
why I stated that so explicitly. I understand what you're saying in terms
of memory pressure, all I'm suggesting is that the empirical tests suggest
the current background writer even with moderate improvements doesn't
necessarily help when you get there. If writes are blocking, whether the
background writer does them slightly ahead of time or whether the backend
does them itself doesn't seem to matter very much. On a heavily loaded
system, your throughput is bottlenecked at the disk either way--and
therefore it's all the more important in those cases to never do a write
until you absolutely have to, lest it be wasted.
> If you're still fiddling with it then you probably aren't going to get
> it right in the next few days.
The implementation is fine most of the time, I've just found some corner
cases in testing I'd like to improve stability on (mainly how best to
handle when no buffers were allocated during the previous period, some
small concerns about the first pass over the pool). What I'm thinking of
doing is taking a couple of my assumptions/techniques and turning them
into things that can be turned on or off with #DEFINE, that way the parts
of the code that people don't like are easy to identify and pull out.
I've already done that with one section.
> Maybe you need to put back the eliminated tuning parameter in the form
> of the scaling factor to be used here. I don't like 1.0, mainly because
> I don't believe your assumption (2). I'm willing to concede that 2.0
> might be too much, but I don't know where in between is the sweet spot.
That would be easy to implement and add some flexibility, so I'll do that.
bgwriter_lru_percent becomes bgwriter_lru_multiplier, possibly to be
renamed later if someone comes up with a snappier name.
> Also, we might need a tuning parameter for the reaction speed of the
> moving average --- what are you using for that?
It's hard-coded at 16 samples. It seemed stable anywhere around 10-20; I
picked 16 since, as a power of 2, the division may optimize usefully to a
bit shift. On the reaction side, it actually reacts faster than that--if
the most recent allocation is greater than the average, it uses that
instead. The number of samples has more of an impact on the trailing side,
and accordingly isn't that critical.
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD