Re: [PATCHES] Latest README.HOT

2007-09-16 Thread Pavan Deolasee
On 9/16/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>
> Forgot to include this in the patch ...
>
> I'm still unhappy about too much complexity for the VACUUM FULL case,
> but some other issues have gone away.
>
>

The entire complexity in the VACUUM FULL code is for the book keeping
purposes so that at the end we can recheck the index and heap tuples
count. VACUUM FULL converts HOT tuples into non-HOT tuples when it
moves them around and that introduces the extra complexity. Otherwise
there is no significant change to the VACUUM FULL logic itself.

Good to hear that we are now comfortable with most of the other items.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


[PATCHES] Latest README.HOT

2007-09-15 Thread Tom Lane
Forgot to include this in the patch ...

I'm still unhappy about too much complexity for the VACUUM FULL case,
but some other issues have gone away.

regards, tom lane


$PostgreSQL$

   Heap Only Tuples (HOT)

Introduction


The Heap Only Tuple (HOT) feature eliminates redundant index entries and
allows the re-use of space taken by DELETEd or obsoleted UPDATEd tuples
without performing a table-wide vacuum.  It does this by allowing
single-page vacuuming, also called "defragmentation".

Note: there is a Glossary at the end of this document that may be helpful
for first-time readers.


Technical Challenges


Page-at-a-time vacuuming is normally impractical because of the costs of
finding and removing the index entries that link to the tuples to be
reclaimed.  Standard vacuuming scans the indexes to ensure all such index
entries are removed, amortizing the index scan cost across as many dead
tuples as possible; this approach does not scale down well to the case of
reclaiming just a few tuples.  In principle one could recompute the index
keys and do standard index searches to find the index entries, but this is
risky in the presence of possibly-buggy user-defined functions in
functional indexes.  An allegedly immutable function that in fact is not
immutable might prevent us from re-finding an index entry (and we cannot
throw an error for not finding it, in view of the fact that dead index
entries are sometimes reclaimed early).  That would lead to a seriously
corrupt index, in the form of entries pointing to tuple slots that by now
contain some unrelated content.  In any case we would prefer to be able
to do vacuuming without invoking any user-written code.

HOT solves this problem for a restricted but useful special case:
where a tuple is repeatedly updated in ways that do not change its
indexed columns.  (Here, "indexed column" means any column referenced
at all in an index definition, including for example columns that are
tested in a partial-index predicate but are not stored in the index.)

An additional property of HOT is that it reduces index size by avoiding
the creation of identically-keyed index entries.  This improves search
speeds.


Update Chains With a Single Index Entry
---

Without HOT, every version of a row in an update chain has its own index
entries, even if all indexed columns are the same.  With HOT, a new tuple
placed on the same page and with all indexed columns the same as its
parent row version does not get new index entries.  This means there is
only one index entry for the entire update chain on the heap page.
An index-entry-less tuple is marked with the HEAP_ONLY_TUPLE flag.
The prior row version is marked HEAP_HOT_UPDATED, and (as always in an
update chain) its t_ctid field links forward to the newer version.

For example:

Index points to 1
lp [1]  [2]

[1]->[22]

In the above diagram, the index points to line pointer 1, and tuple 1 is
marked as HEAP_HOT_UPDATED.  Tuple 2 is a HOT tuple, meaning it has
no index entry pointing to it, and is marked as HEAP_ONLY_TUPLE.
Although tuple 2 is not directly referenced by the index, it can still be
found by an index search: after traversing from the index to tuple 1,
the index search proceeds forward to child tuples as long as it sees the
HEAP_HOT_UPDATED flag set.  Since we restrict the HOT chain to lie within
a single page, this requires no additional page fetches and doesn't
introduce much performance penalty.

Eventually, tuple 1 will no longer be visible to any transaction.
At that point its space could be reclaimed, but its line pointer cannot,
since the index still links to that line pointer and we still need to
be able to find tuple 2 in an index search.  HOT handles this by turning
line pointer 1 into a "redirecting line pointer", which links to tuple 2
but has no actual tuple attached.  This state of affairs looks like

Index points to 1
lp [1]->[2]

[22]

If now the row is updated again, to version 3, the page looks like this:

Index points to 1
lp [1]->[2]  [3]

[22]->[33]

At some later time when no transaction can see tuple 2 in its snapshot,
tuple 2 and its line pointer can be pruned entirely:

Index points to 1
lp [1]-->[3]

[33]

This is safe because no index entry points to line pointer 2.  Subsequent
insertions into the page can now recycle both line pointer 2 and the
space formerly used by tuple 2.

If an update changes any indexed column, or there is not room on the
same page for the new tuple, then the HOT chain ends: the last member
has a regular t_ctid link to the next version and is not marked
HEAP_HOT_UPDATED.  (In principle we could continue a HOT chain across
pages, but this would destroy the desired property of being able to
reclaim space with just pa