"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> CREATE INDEX
> CREATE INDEX CONCURRENTLY
> I'm not very familiar with how these, so I'll just shut up..
Here is an addendum for how HOT handles CREATE INDEX and CREATE INDEX
There are two strategies for CREATE INDEX depending on whether we're doing it
concurrently or not. For a regular locking create index we take the two
1) When indexing a table we must take care that the index contains pointers
only to Root tuples. These are the non-HOT tuples which everyone else
expects to have index pointers.
However the key value we want to index is actually the key at the *end* of
the chain, ie, the most recent tuple on the index chain. Note that this
means we could create broken HOT chains because the index key of the new
index could have changed in the HOT chain.
If we find any HOT-updated tuple with is RECENTLY_DEAD or
DELETE_IN_PROGRESS we ignore it assuming that we'll also come across the
*end* of the update chain and index that instead.
2) If we ignore any HOT-updated tuples above then we mark the index as
unusable by older transactions for which those tuples might be visible.
This ensures that nobody for whom the older tuples in a broken HOT chain
are visible will use the index.
We do this by putting our transaction-id in pg_index.indcreatexid after
building the index. Queries check whether that indcreatexid is in their
serializable snapshot, if it isn't then the index is not usable for that
This means that transactions started before the create index commits will
not get the benefit of the new index even if they first scan the table
only after the index exists. However new transactions get the benefit of
the new index immediately but they will always follow the HOT update
chain since the old tuples with the possibly different keys will never be
visible to them.
The tricky case arises with queries executed in the same transaction as
the CREATE INDEX. In the case of a new table created within the same
transaction such as with pg_dump -1 the index will always be usable
because there will never be any HOT update chains so the indcreatexid
will never be set. Also in the case of a read-committed transaction new
queries will always be able to use the index. Only in the case of a
serializable transaction building an index on an existing table with HOT
updates in it will the index go unused.
CREATE INDEX CONCURRENTLY
In the concurrent case we take a different approach.
1) we create the pg_index entry immediately, before we scan the table. The
pg_index is marked as "not ready for inserts". Then we commit and wait for
any transactions which have the table open to finish. This ensures that no
*new* HOT updates will change the key value for our new index.
2) We wait out any transactions which had the table open. Then we build the
index with a snapshot. Because we waited out anyone who might have been
around before the index was created any tuples seen in the snapshot will
have only valid HOT chains following them. They may still have older HOT
updates behind them which are broken. As above we point the index pointers
at the Root of the HOT-update chain but we use the key from the end of the
4) We mark the index open for inserts (but still not ready for reads) then we
again wait out any transactions which had the table open. Then we take a
second reference snapshot and validate the index. This searches for tuples
missing from the index -- but it again has to look bup the root of the HOT
chains and search for those tuples in the index.
4) Then we wait until *every* transaction in progress in the validate_index
reference snapshot is finished. This ensures that nobody is alive any
longer who could possibly see any of the tuples in a broken HOT chain.
Broken HOT Chain
. A HOT chain in which the key value for an index has changed. This is
not allowed to occur normally but if a new index is created it can happen. In
that case various strategies are used to ensure that no transaction for which
the older tuples are visible can use the index.
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend