Re: [HACKERS] CLUSTER and MVCC

Heikki Linnakangas Fri, 09 Mar 2007 08:31:06 -0800

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax with the xid of the cluster command.
The reason it's not trivial is that you also have to preserve the t_ctid
links of update chains.  If you look into VACUUM FULL, a very large part
of its complexity is that it moves update chains as a unit to make that
possible.  (BTW, I believe the problem Pavan Deolasee reported yesterday
is a bug somewhere in there --- it looks to me like sometimes the same
update chain is getting copied multiple times.)


Ah, that's it. Thanks.

The easiest solution I can think of is to skip newer versions of updated rows when scanning the old relation, and to fetch and copy all tuples in the update chain to the new relation whenever you encounter the first tuple in the chain.

To get a stable view of which tuple is the first in a chain, you need to get the oldest xmin once at the beginning, and use that throughout the operation. Since we take an exclusive lock on the table, no one can insert new updated tuples during the operation, and all updaters are finished before the lock is granted.

Those tuples wouldn't be in the cluster order, though, but that's not a big deal.
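
To make the idea concrete, here is a rough sketch of the copy loop as a toy model in Python, not backend code. Everything in it is an assumption made for illustration: the HeapTuple class, the "updated" flag (standing in for a tuple that was created by an UPDATE), next_tid (standing in for the t_ctid link, with None instead of a self-pointer on the latest version), and the simplification that every xid involved has committed.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass
class HeapTuple:
    """Hypothetical, heavily simplified stand-in for a heap tuple."""
    tid: int                        # position in the relation
    key: str                        # value of the clustering column
    xmin: int                       # inserting xid (assumed committed)
    xmax: Optional[int] = None      # updating/deleting xid, if any
    next_tid: Optional[int] = None  # toy t_ctid link to the newer version
    updated: bool = False           # True if created by an UPDATE


def copy_preserving_chains(old: Dict[int, HeapTuple],
                           scan_order: List[int],
                           oldest_xmin: int) -> List[HeapTuple]:
    """Copy tuples to a new relation, moving each update chain as a unit."""
    new: List[HeapTuple] = []
    for tid in scan_order:
        t = old[tid]
        # Dead to every snapshot: no need to carry it over.
        if t.xmax is not None and t.xmax < oldest_xmin:
            continue
        # A newer version whose predecessor is still kept: skip it here,
        # it is copied when the scan reaches the first tuple of its chain.
        if t.updated and t.xmin >= oldest_xmin:
            continue
        # t is the first tuple of a chain: copy the whole chain now,
        # keeping xmin/xmax and re-pointing the links at the new tids.
        prev = None
        cur: Optional[HeapTuple] = t
        while cur is not None:
            copied = replace(cur, tid=len(new), next_tid=None)
            new.append(copied)
            if prev is not None:
                prev.next_tid = copied.tid
            prev = copied
            cur = old[cur.next_tid] if cur.next_tid is not None else None
    return new


# Example: row 'a' was updated to 'z' by xid 12 (still visible to someone),
# row 'c' was updated by xid 4 (older than oldest_xmin, so tid 2 is dead).
old_rel = {
    0: HeapTuple(0, "b", xmin=5),
    1: HeapTuple(1, "a", xmin=3, xmax=12, next_tid=3),
    2: HeapTuple(2, "c", xmin=2, xmax=4, next_tid=4),
    3: HeapTuple(3, "z", xmin=12, updated=True),
    4: HeapTuple(4, "c", xmin=4, updated=True),
}
index_order = [1, 0, 2, 4, 3]   # tids sorted by key: a, b, c, c, z
new_rel = copy_preserving_chains(old_rel, index_order, oldest_xmin=10)
```

In this example the 'z' version lands directly after 'a', ahead of 'b' and 'c', which is the out-of-order placement mentioned above.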


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

