Simon Riggs <[EMAIL PROTECTED]> wrote:
> > "Zeugswetter Andreas DCP SD" <[EMAIL PROTECTED]> wrote:
> > > Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> > > tuple by reducing the tuple to its header info.
> > Attached patch realizes the concept of his idea. The dead tuples will be
> > reduced to their headers by bgwriter.
> I'm interested in this patch but you need to say more about it. I get
> the general idea but it would be useful if you could give a full
> description of what this patch is trying to do and why.
OK, I'll try to explain the patch. Please excuse the long message.

The basic idea is just "reducing the dead tuple to its header info",
suggested by Andreas. This is a lightweight per-page sweep that reduces
the consumption of the free space map and the need for VACUUM; a
normal VACUUM is still needed occasionally.

I think it is useful on update-heavy workloads. It showed a 5-10%
performance improvement on DBT-2 after running for 9 hours *without* vacuum.
I don't know whether it is still effective with well-scheduled vacuum.
* Why does bgwriter do vacuum?
Sweeping has a cost, so a non-backend process should do it. Also, the pages
worth vacuuming are almost always dirty, because tuples on them have just
been updated or deleted. Bgwriter handles dirty pages, so I think it is a
good place for sweeping.

We must take a super-exclusive lock on a page before vacuuming it. In the
patch, bgwriter tries to take an exclusive lock before it writes a page, and
vacuums only if the lock is super-exclusive. Otherwise, it gives up and
writes the page normally. This is an optimistic approach, but I assume the
chance of success is high because most pages written by bgwriter are the
least recently used (LRU) ones.
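
To make the optimistic locking concrete, here is a minimal sketch in C. It is
a toy model and not code from the patch: ToyBuffer and the toy_* functions are
hypothetical names, and I assume that "super-exclusive" means holding the
exclusive lock while being the only process that has the page pinned.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer; NOT PostgreSQL's BufferDesc. */
typedef struct ToyBuffer {
    int  pin_count;       /* processes that currently pin the page */
    bool content_locked;  /* true while the exclusive lock is held */
    int  dead_tuples;     /* dead tuples a sweep could shrink */
} ToyBuffer;

/* Try to take the exclusive lock without waiting. */
static bool toy_try_exclusive(ToyBuffer *buf)
{
    if (buf->content_locked)
        return false;
    buf->content_locked = true;
    return true;
}

static void toy_unlock(ToyBuffer *buf)
{
    buf->content_locked = false;
}

/* Assumption: the lock is super-exclusive when we are the only pinner. */
static bool toy_is_super_exclusive(const ToyBuffer *buf)
{
    return buf->content_locked && buf->pin_count == 1;
}

/* Write one page; sweep it first only if that is safe.
 * Returns true if the page was swept. */
static bool toy_write_page(ToyBuffer *buf)
{
    bool swept = false;

    assert(buf->pin_count >= 0);
    buf->pin_count++;                      /* our own pin */

    if (toy_try_exclusive(buf)) {
        if (toy_is_super_exclusive(buf)) {
            buf->dead_tuples = 0;          /* stands in for the sweep */
            swept = true;
        }
        toy_unlock(buf);
    }
    /* The page would be written here in either case. */

    buf->pin_count--;
    return swept;
}
```

In this model a page that another process still pins is written without being
swept, which is the "give up and write normally" path described above.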
* Keep the headers
We cannot remove dead tuples completely in a per-page sweep, because
references to the tuples from indexes still remain. We could keep only the
line pointers (4 bytes), but that might lead to line-pointer bloat,
so the headers (4+32 bytes) should be left.
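
As a rough illustration of how much space this frees, here is a small sketch
in C. It only uses the 32-byte header size mentioned above; the function
names are hypothetical, and I assume each tuple length includes its header
but not its 4-byte line pointer, which stays in place either way.

```c
#include <assert.h>
#include <stddef.h>

/* Header size as stated in this message; real sizes depend on the build. */
#define TUPLE_HEADER_BYTES 32u

/* Bytes freed when one dead tuple is truncated to its header. */
static size_t bytes_reclaimed(size_t tuple_len)
{
    return tuple_len > TUPLE_HEADER_BYTES ? tuple_len - TUPLE_HEADER_BYTES : 0;
}

/* Total bytes freed on a page holding n dead tuples of the given lengths. */
static size_t sweep_savings(const size_t *dead_lens, int n)
{
    size_t total = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += bytes_reclaimed(dead_lens[i]);
    return total;
}
```

Under these assumptions a dead 132-byte tuple gives back 100 bytes, while a
tuple that is no longer than its header gives back nothing.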
* Other twists and GUC variables in the patch
- Bgwriter cannot access the catalogs, so I added a BM_RELATION hint bit
to BufferDesc. Only relation pages will be swept. This is enabled by
the GUC variable 'bgvacuum_relation'.
- I changed bgwriter_lru_maxpages to be adjusted automatically. Backends
do not vacuum, so as not to disturb their own processing, so bgwriter
should write most of the dirty pages. ('bgvacuum_autotune')
- After sweeping, the page is added to the free space map. I made a simple
replacement algorithm for the free space map that replaces the page with
the least free space near the added one. ('bgvacuum_fsm')
- If sweeping a page produces WAL, writing the page should be deferred
for a while, because the WAL must be flushed before the page is written.
- Bgwriter writes pages in 4 contexts: background writes for LRU, ALL,
checkpoint and shutdown. In the current patch, pages are swept in the 3
contexts other than shutdown, but it may be better to sweep only on LRU.
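
The WAL point in the list above can be sketched as follows. This is a toy
model in C, not code from the patch: ToyPage, ToyLsn and the toy_* functions
are hypothetical names. It only shows the rule that a page may be written
once the WAL has been flushed up to the page's last change, and that a
freshly swept page is otherwise left for a later pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyLsn;   /* toy WAL position */

typedef struct ToyPage {
    ToyLsn lsn;    /* position of the last WAL record that changed the page */
    bool   dirty;  /* page still has to be written */
} ToyPage;

/* A page may go to disk only if its WAL is already flushed. */
static bool toy_can_write_now(const ToyPage *page, ToyLsn flushed_upto)
{
    return page->lsn <= flushed_upto;
}

/* One pass over the pages: write what is safe, defer the rest.
 * Returns the number of dirty pages that had to be deferred. */
static int toy_bgwriter_pass(ToyPage *pages, int npages, ToyLsn flushed_upto)
{
    int deferred = 0;

    assert(npages >= 0);
    for (int i = 0; i < npages; i++) {
        if (!pages[i].dirty)
            continue;
        if (toy_can_write_now(&pages[i], flushed_upto))
            pages[i].dirty = false;   /* stands in for the actual write */
        else
            deferred++;               /* WAL not flushed yet, try later */
    }
    return deferred;
}
```

A deferred page stays dirty, so a later pass picks it up once the flushed
position has moved past its LSN.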
* Related discussions
- Real-Time Vacuum Possibility (Rod Taylor)
| have the bgwriter take a look at the pages it has, and see if it can do
| any vacuum work based on pages it is about to send to disk
- Pre-allocated free space for row updating (like PCTFREE) (Satoshi Nagayasu)
| light-weight repairing on a single page is needed to maintain free space
- Dead Space Map (Heikki Linnakangas)
| vacuuming pages one by one as they're written by bgwriter
Thank you for reading to the end.
I'd like to hear your comments.
NTT Cyber Space Laboratories