ITAGAKI Takahiro wrote:
"Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
Perhaps it would be better to have the bgwriter take a look at how many
dead tuples (or how much space the dead tuples account for) when it
writes a page out and adjust the DSM at that time.
Yeah, I feel it is worth optimizing, too. One question is, how do we treat
dirty pages written by backends rather than by the bgwriter? If we want to add
some work in the bgwriter, do we also need to make the bgwriter write almost
all of the dirty pages?
IMO yes, we want the bgwriter to be the only process that's normally
writing pages out. How close we are to that, I don't know...
I'm working on making the bgwriter write almost all of the dirty pages. This is
the proposal for it, using automatic adjustment of bgwriter_lru_maxpages.
The bgwriter_lru_maxpages value is adjusted to match the number of calls to
StrategyGetBuffer() per cycle, with a safety margin (x2 at present).
The counter is incremented on each call and reset to zero at StrategySyncStart().
This patch alone is not very useful, except for hiding hard-to-tune parameters
from users. However, it is a first step toward allowing the bgwriter to do some
work before writing dirty buffers:
- [DSM] Pick out pages worth vacuuming and register them in the DSM.
- [HOT] Do a per page vacuum for HOT updated tuples. (Is it worth doing?)
- [TODO Item] Shrink expired COLD updated tuples to just their headers.
- Set commit hint bits to reduce subsequent writes of blocks.
http://archives.postgresql.org/pgsql-hackers/2007-01/msg01363.php
I tested the attached patch with pgbench -s5 (80MB) and shared_buffers=32MB,
and got the expected result, shown below: over 75% of the buffers are written
by the bgwriter. In addition, the automatically adjusted bgwriter_lru_maxpages
values were much higher than the default value (5), which shows that the most
suitable value depends greatly on the workload.
 benchmark  | throughput | cpu-usage | by-bgwriter | bgwriter_lru_maxpages
------------+------------+-----------+-------------+-----------------------
 default    | 300 tps    | 100%      | 77.5%       | 120 pages/cycle
 with sleep | 150 tps    | 50%       | 98.6%       | 70 pages/cycle
I hope that this patch will be a first step toward an intelligent bgwriter.
Comments welcome.
The general approach looks good to me. I'm queuing some benchmarks to
see how effective it is with a fairly constant workload.
This change in bgwriter.c looks fishy:
*************** BackgroundWriterMain(void)
*** 484,491 ****
*
* We absorb pending requests after each short sleep.
*/
!     if ((bgwriter_all_percent > 0.0 && bgwriter_all_maxpages > 0) ||
!         (bgwriter_lru_percent > 0.0 && bgwriter_lru_maxpages > 0))
udelay = BgWriterDelay * 1000L;
else if (XLogArchiveTimeout > 0)
udelay = 1000000L; /* One second */
--- 484,490 ----
*
* We absorb pending requests after each short sleep.
*/
! if (bgwriter_all_percent > 0.0 && bgwriter_all_maxpages > 0)
udelay = BgWriterDelay * 1000L;
else if (XLogArchiveTimeout > 0)
udelay = 1000000L; /* One second */
Doesn't that mean that bgwriter only runs every 1 or 10 seconds,
regardless of bgwriter_delay, if bgwriter_all_* parameters are not set?
The algorithm used to update bgwriter_lru_maxpages needs some thought.
Currently, it's decreased by one when fewer clean pages were required by
backends than expected, and increased otherwise. Exponential smoothing
or something similar seems like the natural choice to me.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com