Re: [HACKERS] CLUSTER and MVCC

Heikki Linnakangas Mon, 12 Mar 2007 14:25:26 -0800

Heikki Linnakangas wrote:

Tom Lane wrote:
The reason it's not trivial is that you also have to preserve the t_ctid
links of update chains.  If you look into VACUUM FULL, a very large part
of its complexity is that it moves update chains as a unit to make that
possible.  (BTW, I believe the problem Pavan Deolasee reported yesterday
is a bug somewhere in there --- it looks to me like sometimes the same
update chain is getting copied multiple times.)
Ah, that's it. Thanks.
The easiest solution I can think of is to skip newer versions of updatedrows when scanning the old relation, and to fetch and copy all tuples inthe update chain to the new relation whenever you encounter the firsttuple in the chain.
To get a stable view of what's the first tuple in chain, you need to getthe oldest xmin once at the beginning, and use that throughout theoperation. Since we take an exclusive lock on the table, no-one caninsert new updated tuples during the operation, and all updaters arefinished before the lock is granted.

I've been thinking about this some more, and I think the above wouldwork. The tricky part is to recognize the first tuple in a chain, giventhe recent bug in vacuum full.

In each chain, there must be at least one non-dead tuple with xmin <Oldestxmin. Otherwise we're already missing tuples; there would be noversion visible to a transaction with xid between OldestXmin and the minxmin present in the chain. That tuple is the root of the chain.

There might be a dead tuple in the middle of the chain. Dead tuples havexmax < OldestXmin, which means there must be another non-dead tuple inthe chain with xmax >= OldestXmin. For cluster's purposes, that anothernon-dead tuple is also considered as a root. Dead tuples and chainsending in a dead tuples don't need to be stored in the new table.

This picture helped me a lot:http://community.enterprisedb.com/updatechain.gif

Arrows represent tuples, beginning at xmin and ending at xmax.OldestXmin is represented by the vertical bar, everything to the left issmaller than OldestXmin and everything to the right is larger thanOldestXmin. The chain must begin with a tuple with xmin on the left sideof OldestXmin, and the last tuple in the chain must end on the right side.


Does anyone see a problem with this? If not, I'm going to write a patch.

One potential issue I'm seeing is that if we rely on the unbroken chainstarting from < OldestXmin, and that tuple isn't there because of a bug,for example, the later version of the tuple is skipped and the row is lost.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq

Re: [HACKERS] CLUSTER and MVCC

Reply via email to