Tom Lane wrote:
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
I'm thinking of keeping an in-memory mapping of old and new tids of
updated tuples while clustering, instead. That means that cluster
requires a little bit of memory for each RECENTLY_DEAD updated tuple. In
the worst case that means that you run out of memory if there's too many
of those in the table, but I doubt that's going to be a problem in practice.
That is more or less isomorphic to what VACUUM FULL does. While people
have complained about VACUUM FULL's memory usage on occasion, just at
the moment I feel that the main problem with it is complexity. If we
still haven't gotten all the bugs out of VACUUM FULL after more than
eight years of work on it, what are the odds that we can make CLUSTER
do it right the first time?
Well, I can't guarantee that there's no bugs.
To copy a chain correctly, we need to correctly detect tuples that have
a t_ctid pointing to a non-dead tuple (non-dead meaning
HeapTupleSatisfiesVacuum(tuple) != DEAD), and tuples that are being
pointed to by a non-dead tuple. If we incorrectly detect that a tuple
belongs to either of those categories, when in fact it doesn't, we don't
corrupt anything, but we waste a little bit of memory memorizing the
To detect tuples in the first category, we need to check that xmax of
the tuple isn't invalid, and t_ctid doesn't point to itself.
To detect tuples in the second category, we need to check that xmin
isn't invalid, and is greater than OldestXmin.
With both categories correctly identified, it's just a matter of mapping
old ctids to corresponding tids in the new heap.
Unlike in my first proposal, if something nevertheless goes wrong in
detecting the chains, we only lose the chaining between the tuples, but
we don't otherwise lose any data. The latest version of each row is fine
anyway. I think this approach is pretty robust, and it fails in a good way.
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?