On Aug 7, 2007, at 6:13 PM, Mark Makarowsky wrote:

Can you provide more detail on what you mean by your
two suggestions below:

Yeah, I've used "vertical partitioning" very
successfully in the past, though I've never done it
for just a single field. I'll typically leave the few
most common fields in the "main" table and pull
everything else into a second table.

Vertical partitioning is where you split up your table on disk by columns, i.e on the vertical lines. He quoted it because Postgres doesn't actually support it transparently but you can always fake it by splitting up your table. For example, given the following table wherein column bar gets updated a lot but the others don't:

create table foo (
id      int     not null,
bar     int,
baz     int,

primary key (id)
);

You could split it up like so:

create table foo_a (
id      int,
baz     int,

primary key (id)
);

create table foo_b (
foo_id  int,
bar             int,

foreign key foo_a_id (foo_id) references foo_a (id)
);

The reason you'd ever want to do this is that when Postgres goes to update a row what it actually does is inserts a new row with the new value(s) that you changed and marks the old one as deleted. So, if you have a wide table and frequently update only certain columns, you'll take a performance hit as you're having to re-write a lot of static values.


I should mention that if you can handle splitting the
update into multiple transactions, that will help a
lot since it means you won't be doubling the size of
the table.

As I mentioned above, when you do an update you're actually inserting a new row and deleting the old one. That deleted row is still considered part of the table (for reasons of concurrency, read up on the concurrency chapter in the manual for the details) and once it is no longer visible by any live transactions can be re-used by future inserts. So, if you update one column on every row of a one million row table all at once, you have to allocate and write out one million new rows. But, if you do the update a quarter million at a time, the last three updates would be able to re-use many of the rows deleted in earlier updates.

Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Reply via email to