Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> On every row, PageAddItem will scan all the line pointers on the target
>> page, just to see that they're all in use, and create a new line
>> pointer. That adds up, especially with narrow tuples like what I used in
>> the test.
>> Attached is a fix for that.
>
> This has been proposed before, and rejected before. IIRC the previous
> patch was quite a lot less invasive than this one (it didn't require
> making special space on heap pages). I don't recall why it wasn't
> accepted.

Ahh, found that thread:
http://archives.postgresql.org/pgsql-hackers/2005-07/msg00609.php
The main differences between that patch and mine are that
- the previous patch used an offset to the first free line pointer, and
I used just a flag.
- the previous patch stored the offset in the page header, and I used
the special space
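
To illustrate the flag variant, here is a minimal sketch on a toy page model, not the real bufpage structures. ToyPage, has_free_lp and the toy_* functions are hypothetical names made up for this example; none of them come from either patch. The only point is that with a "may have free line pointers" hint, an insert into a page with no holes can skip the scan entirely, while the unflagged path examines every line pointer each time.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SLOTS 256

/* Toy stand-in for a heap page: only the line pointer array is modelled. */
typedef struct ToyPage
{
    bool   in_use[TOY_MAX_SLOTS]; /* true if line pointer i holds a tuple */
    size_t nslots;                /* line pointers allocated so far */
    bool   has_free_lp;           /* hint: a pointer below nslots may be free */
} ToyPage;

/*
 * Linear scan for an unused line pointer. Returns its index, or nslots
 * if all are in use. *scanned reports how many pointers were examined.
 */
size_t find_slot_scan(const ToyPage *page, size_t *scanned)
{
    size_t i;

    for (i = 0; i < page->nslots; i++)
    {
        if (!page->in_use[i])
            break;
    }
    *scanned = (i < page->nslots) ? i + 1 : page->nslots;
    return i;
}

/* Free a line pointer and set the hint so later inserts know to look. */
void toy_delete(ToyPage *page, size_t slot)
{
    if (slot < page->nslots && page->in_use[slot])
    {
        page->in_use[slot] = false;
        page->has_free_lp = true;
    }
}

/*
 * Add one item. With use_flag set, the scan is skipped whenever the hint
 * says there is nothing to reuse. Returns the slot used, or TOY_MAX_SLOTS
 * if the toy page is full.
 */
size_t toy_add_item(ToyPage *page, bool use_flag, size_t *scanned)
{
    size_t slot;

    *scanned = 0;
    if (use_flag && !page->has_free_lp)
        slot = page->nslots;            /* append without scanning */
    else
    {
        slot = find_slot_scan(page, scanned);
        if (slot == page->nslots)
            page->has_free_lp = false;  /* scan found no hole: clear hint */
    }

    if (slot >= TOY_MAX_SLOTS)
        return TOY_MAX_SLOTS;

    page->in_use[slot] = true;
    if (slot == page->nslots)
        page->nslots++;
    return slot;
}

/* Insert n items and return the total number of line pointers examined. */
size_t toy_fill(ToyPage *page, size_t n, bool use_flag)
{
    size_t total = 0;
    size_t scanned;
    size_t i;

    for (i = 0; i < n; i++)
    {
        toy_add_item(page, use_flag, &scanned);
        total += scanned;
    }
    return total;
}
```

In this model, filling an empty page with n items examines n(n-1)/2 line pointers without the flag and none with it. An offset to the first free line pointer would additionally shorten the scan when holes do exist, at the price of a wider field.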
I think using the special space is a cleaner approach; the field is only
meaningful in heap pages. However, now that I think of it, if we could
squeeze the flag into one of the existing fields in the page header, we
could put it there without decreasing the amount of space available for
tuples. We could use the unused pd_tli field, as you suggested later in
that thread.
At the end of the thread, Bruce added the patch to his hold-queue, but I
couldn't find a trace of it after that so I'm not clear why it was
rejected in the end. This comment (by you) seems most relevant:
> I tried making a million-row table with just two int4 columns and then
> duplicating it with CREATE TABLE AS SELECT. In this context gprof
> shows PageAddItem as taking 7% of the runtime, which your patch knocks
> down to 1.5%. This seems to be about the best possible real-world case,
> though (the wider the rows, the fewer times PageAddItem can loop), and
> so I'm still unconvinced that there's a generic gain here. Adding an
> additional word to page headers has a very definite cost --- we can
> assume about a .05% increase in net I/O demands across *every*
> application, whether they do a lot of inserts or not --- and so a
> patch that provides a noticeable improvement in only a very small set
> of circumstances is going to have to be rejected.
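
For reference, the .05% figure is consistent with one extra 4-byte word on a default 8192-byte block. The sketch below just does that arithmetic; the 4-byte word size and the default block size are my assumptions about what the estimate was based on, and header_overhead_percent is a hypothetical helper name.

```c
#include <stddef.h>

/*
 * Percentage of each block consumed by extra_bytes of additional header.
 * Assumption: every page pays the cost, so this approximates the
 * relative increase in I/O for the same amount of tuple data.
 */
double header_overhead_percent(size_t extra_bytes, size_t block_size)
{
    return 100.0 * (double) extra_bytes / (double) block_size;
}
```

With extra_bytes = 4 and block_size = 8192 this gives a little under 0.049%, which rounds to the quoted .05%.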
I believe the PageAddItem overhead has become more noticeable since then
because of other improvements to COPY. In 8.3, we're also going to
reduce the tuple length (combocids and the varvarlen thing), so we can
fit more tuples per page, again making it slightly more significant.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com