Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

Tom Lane Sun, 23 Jan 2005 12:45:36 -0800

Simon Riggs <[EMAIL PROTECTED]> writes:
> Changing the idea slightly might be better: if a row update would cause
> a block split, then if there is more than one row version then we vacuum
> the whole block first, then re-attempt the update.


"Block split"?  I think you are confusing tables with indexes.

Chasing down prior versions of the same row is not very practical
anyway, since there is no direct way to find them.

One possibility is, if you tried to insert a row on a given page but
there's not room, to look through the other rows on the same page to see
if any are deletable (xmax below the GlobalXmin event horizon).  This
strikes me as a fairly expensive operation though, especially when you
take into account the need to get rid of their index entries first.
Moreover, the check would often be unproductive.

The real issue with any such scheme is that you are putting maintenance
costs into the critical paths of foreground processes that are executing
user queries.  I think that one of the primary advantages of the
Postgres storage design is that we keep that work outside the critical
path and delegate it to maintenance processes that can run in the
background.  We shouldn't lightly toss away that advantage.

There was some discussion in Toronto this week about storing bitmaps
that would tell VACUUM whether or not there was any need to visit
individual pages of each table.  Getting rid of useless scans through
not-recently-changed areas of large tables would make for a significant
reduction in the cost of VACUUM.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

Reply via email to