Tom Lane wrote:
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
On every row, PageAddItem will scan all the line pointers on the target page, just to see that they're all in use, and create a new line pointer. That adds up, especially with narrow tuples like what I used in the test.
Attached is a fix for that.

This has been proposed before, and rejected before.  IIRC the previous
patch was quite a lot less invasive than this one (it didn't require
making special space on heap pages).  I don't recall why it wasn't
accepted.

Ahh, found that thread: http://archives.postgresql.org/pgsql-hackers/2005-07/msg00609.php

The main differences between that patch and mine are:
- the previous patch used an offset to the first free line pointer, and I used just a flag.
- the previous patch stored the offset in the page header, and I used the special space.

I think using the special space is a cleaner approach; the field is only meaningful in heap pages. However, now that I think of it, if we could squeeze the flag into one of the existing fields in the page header, we could put it there without decreasing the amount of space available for tuples. We could use the unused pd_tli field, as you suggested later in that thread.
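
To make the flag idea concrete, here is a rough sketch in C. It is a toy model, not the attached patch and not the real page layout: ToyPage, has_free_lines, toy_add_item, toy_free_item and toy_fill_page are all made-up names, and the "page" is just an array of in-use markers. It only illustrates why a one-bit hint is enough to skip the scan when every line pointer is known to be in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LINE_POINTERS 256   /* arbitrary capacity for the toy page */

/* Toy stand-in for a heap page; NOT the real PageHeaderData layout. */
typedef struct ToyPage
{
    bool   has_free_lines;                /* hypothetical hint flag */
    size_t nline;                         /* line pointers allocated so far */
    bool   used[TOY_MAX_LINE_POINTERS];   /* true = line pointer in use */
} ToyPage;

/*
 * Pick a line pointer for a new item. Returns its index, or -1 if the
 * page is full. *probes counts how many line pointers were examined.
 * With use_flag == false this always scans, like the behaviour
 * described above; with use_flag == true it scans only when the hint
 * says a free line pointer may exist.
 */
int toy_add_item(ToyPage *page, bool use_flag, size_t *probes)
{
    assert(page != NULL && probes != NULL);

    if (!use_flag || page->has_free_lines)
    {
        for (size_t i = 0; i < page->nline; i++)
        {
            (*probes)++;
            if (!page->used[i])
            {
                page->used[i] = true;
                return (int) i;
            }
        }
        /* Scan found nothing free: clear the hint until something is freed. */
        page->has_free_lines = false;
    }

    if (page->nline >= TOY_MAX_LINE_POINTERS)
        return -1;

    page->used[page->nline] = true;
    return (int) page->nline++;
}

/* Freeing a line pointer sets the hint so the next insert rescans. */
void toy_free_item(ToyPage *page, size_t slot)
{
    assert(page != NULL && slot < page->nline);
    page->used[slot] = false;
    page->has_free_lines = true;
}

/* Fill an empty toy page and return the number of line pointers examined. */
size_t toy_fill_page(bool use_flag)
{
    ToyPage page = {0};
    size_t  probes = 0;

    while (toy_add_item(&page, use_flag, &probes) >= 0)
        ;
    return probes;
}
```

Calling toy_fill_page(false) and toy_fill_page(true) shows the difference for a bulk load into an empty page: the scan cost grows quadratically with the number of line pointers in the first case and disappears in the second. The offset approach from the earlier patch would avoid the rescan after a free as well, at the cost of a wider field.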

At the end of the thread, Bruce added the patch to his hold-queue, but I couldn't find a trace of it after that, so I'm not clear why it was rejected in the end. This comment (by you) seems most relevant:

I tried making a million-row table with just two int4 columns and then
duplicating it with CREATE TABLE AS SELECT.  In this context gprof
shows PageAddItem as taking 7% of the runtime, which your patch knocks
down to 1.5%.  This seems to be about the best possible real-world case,
though (the wider the rows, the fewer times PageAddItem can loop), and
so I'm still unconvinced that there's a generic gain here.  Adding an
additional word to page headers has a very definite cost --- we can
assume about a .05% increase in net I/O demands across *every*
application, whether they do a lot of inserts or not --- and so a
patch that provides a noticeable improvement in only a very small set
of circumstances is going to have to be rejected.
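
For what it's worth, the .05% figure is consistent with one extra 4-byte word on a default 8192-byte page. A quick check in C, where the 4-byte word and the 8192-byte page are my assumptions about what the estimate was based on, and header_overhead_percent is just an illustrative helper:

```c
#include <stddef.h>

/*
 * Percentage of each page consumed by extra_bytes of additional header.
 * Assumes the overhead translates directly into extra I/O, i.e. that
 * pages are otherwise full of tuples.
 */
double header_overhead_percent(size_t extra_bytes, size_t page_size)
{
    return 100.0 * (double) extra_bytes / (double) page_size;
}
```

header_overhead_percent(4, 8192) gives roughly 0.049, which rounds to the .05% quoted above. A field squeezed into already-unused header space would make this zero.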

I believe the PageAddItem overhead has become more noticeable since then because of other improvements to COPY. In 8.3, we're also going to reduce the tuple length (combocids and the varvarlen thing), so we can fit more tuples per page, again making it slightly more significant.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
