Sorry to start another thread while we are still discussing CREATE
INDEX design, but I need help/suggestions to finish the patch on
time for 8.3
We earlier thought that CREATE INDEX CONCURRENTLY (CIC)
would be simpler to do because of the existing waits in CIC.
But one major problem with CIC is that UPDATEs are allowed
while we are building the index and these UPDATEs can create
HOT-chains which has different values for attributes on which
we are building the new index. To keep the HOT-chain semantic
consistent across old and new indexes, we might be forced to
delete the old index entry and reinsert new one during the
validate_index() phase. This is of course not easy.
May I propose the following design which is less intrusive:
We do CIC in three phases:
In the first phase we just create the catalog entry for the new
index, mark the index read-only and commit the transaction.
By read-only, I mean that the index is not ready inserts, but
is consulted during UPDATEs to decide whether to do HOT
UPDATE or not (just like other existing indexes). We then
wait for all transactions conflicting on ShareLock to complete.
That would guarantee that all the existing transactions which
can not see the new index catalog entry are finished.
A new transaction is started. We then build the index just the
way we do today. While we are building the index, no new
HOT-chains are be created where the index keys do not
match because the new index is consulted while deciding
whether to do HOT UPDATE or not.
At the end of this step, the index is marked ready for
inserts, we once again wait for all transactions conflicting
on ShareLock to finish and commit the transaction.
In the third phase, we validate the index inserting any
missing index entries for tuples which are not HEAP_ONLY.
For HEAP_ONLY tuples we already have the index entry
though it points to the root tuple. Thats OK because we
guarantee that all tuples in the chain share the same key
with respect to old as well as new indexes.
We then mark the index "valid" and commit.
In summary, this design introduces one more transaction
and wait. But that should not be a problem because we
would anyways wait for those transactions to finish today
though a bit later in the process.