Re: [HACKERS] Inserting heap tuples in bulk in COPY

Heikki Linnakangas Thu, 06 Oct 2011 04:34:13 -0700

On 25.09.2011 19:01, Robert Haas wrote:

On Wed, Sep 14, 2011 at 6:52 AM, Heikki Linnakangas
<[email protected]>  wrote:

Why do you need new WAL replay routines?  Can't you just use the existing
XLOG_HEAP_NEWPAGE support?


By and large, I think we should be avoiding special-purpose WAL entries
as much as possible.


I tried that, but most of the reduction in WAL-size melts away with that.
And if the page you're copying to is not empty, logging the whole page is
even more expensive. You'd need to fall back to retail inserts in that case
which complicates the logic.


Where does it go?  I understand why it'd be a problem for partially
filled pages, but it seems like it ought to be efficient for pages
that are initially empty.

A regular heap_insert record leaves out a lot of information that can be deduced at replay time. It can leave out all the headers, including just the null bitmap + data. In addition to that, there's just the location of the tuple (RelFileNode+ItemPointer). At replay, xmin is taken from the WAL record header.

For a multi-insert record, you don't even need to store the RelFileNode and the block number for every tuple, just the offsets.
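
To put rough numbers on the addressing part alone, here is a minimal sketch. It assumes the usual on-disk sizes (RelFileNode as three 4-byte OIDs, a 6-byte ItemPointer made of a 4-byte block number plus a 2-byte offset) and ignores WAL record headers and tuple data entirely. The function names are made up for illustration and are not from the patch.

```python
# Assumed sizes; not taken from the patch under discussion.
REL_FILE_NODE_BYTES = 12   # three 4-byte OIDs
ITEM_POINTER_BYTES = 6     # 4-byte block number + 2-byte offset number
BLOCK_NUMBER_BYTES = 4
OFFSET_NUMBER_BYTES = 2


def retail_location_bytes(ntuples):
    """Location bytes when every tuple gets its own heap_insert record."""
    return ntuples * (REL_FILE_NODE_BYTES + ITEM_POINTER_BYTES)


def multi_insert_location_bytes(ntuples):
    """Location bytes when one record covers all tuples on a single page:
    the relation and block are stored once, then one offset per tuple."""
    return (REL_FILE_NODE_BYTES + BLOCK_NUMBER_BYTES
            + ntuples * OFFSET_NUMBER_BYTES)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, retail_location_bytes(n), multi_insert_location_bytes(n))
```

Under these assumptions the location cost per tuple drops from 18 bytes towards 2 bytes as more tuples share a page.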

In comparison, a full-page image will include the full tuple header, and also the line pointers. If I'm doing my math right, a full-page image takes 25 bytes more data per tuple than the special-purpose multi-insert record.
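
The breakdown behind that figure is not spelled out above, so the following is only a sketch of how such a comparison could be set up. It assumes a 23-byte fixed tuple header padded to an 8-byte boundary and a 4-byte line pointer per tuple in the page image. The per-tuple metadata size of the multi-insert record is left as a parameter, because the exact record layout is not given here; the value 3 used in the example is purely illustrative.

```python
# Assumed sizes for a typical 64-bit build; not taken from the patch.
HEAP_TUPLE_HEADER_BYTES = 23   # fixed part of the heap tuple header
LINE_POINTER_BYTES = 4         # one line pointer per tuple on the page
MAXALIGN = 8


def maxalign(nbytes):
    """Round nbytes up to the next multiple of MAXALIGN."""
    return (nbytes + MAXALIGN - 1) // MAXALIGN * MAXALIGN


def full_page_overhead_per_tuple():
    """Non-data bytes a full-page image carries per tuple (no null bitmap)."""
    return maxalign(HEAP_TUPLE_HEADER_BYTES) + LINE_POINTER_BYTES


def extra_bytes_per_tuple(multi_insert_meta_bytes):
    """How much more the page image costs per tuple than a multi-insert
    record that spends multi_insert_meta_bytes on per-tuple metadata."""
    return full_page_overhead_per_tuple() - multi_insert_meta_bytes


if __name__ == "__main__":
    # 3 is a hypothetical per-tuple metadata size, chosen for illustration.
    print(full_page_overhead_per_tuple(), extra_bytes_per_tuple(3))
```

With these assumed sizes the page image spends 28 bytes per tuple on header and line pointer, so the difference depends directly on how compact the multi-insert record's per-tuple metadata is.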


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
