Gregory Stark wrote: > > "Bruce Momjian" <[EMAIL PROTECTED]> writes: > > > I have taken this, and Pavan's documentation about CREATE INDEX, and > > worked up an updated README. Comments? Corrections? > > Oh, you I think the CREATE INDEX documentation you refer to was actually the > one I suggested.
Sorry, I see now it was you who wrote that section, not Pavan. > A few tweaks: > > > (If we find any HOT-updated tuples with RECENTLY_DEAD or > > DELETE_IN_PROGRESS we ignore it assuming that we will also come across > > the _end_ of the update chain and index that instead.) > > There's more to this. We build a mapping telling us the Root tuple for each > tuple in the page. Then when we scan tuples looking for the Head of each HOT > chain (ie, a tuple that wasn't HOT updated) and index the corresponding Root > from the map using the key value from the Head tuple. > > We treat DELETE_IN_PROGRESS the same way we treat RECENTLY_DEAD (and > INSERT_IN_PROGRESS the same as LIVE) because we assume it's been deleted (or > inserted) by our own transaction. So while it's not actually committed yet we > can assume it is since if its transaction aborts the index creation itself > will be aborted. Other transactions cannot be deleting or inserting tuples > without having committed or aborted already because we have a lock on the > table and the other transactions normally keep their locks until they exit. Updated text: Because we have a lock on the table, any RECENTLY_DEAD, DELETE_IN_PROGRESS, and INSERT_IN_PROGRESS tuples belong to our own transction. Therefore we can consider them committed since if the CREATE INDEX commits, they will be committed, and if it aborts the index is discarded. > NOTE: This is something likely to change. Current discussions are leading > towards handling DELETE_IN_PROGRESS and INSERT_IN_PROGRESS from other > transactions. We would do this by waiting until the transactions owning those > tuples exit. This would allow us to index tables being used by transactions > which release their locks early to work. In particular this happens for system > tables. OK, we will just have to update the README then. > > The tricky case arises with queries executed in the same transaction as > > CREATE INDEX. In the case of a new table created within the same > > transaction (such as with pg_dump), the index will be usable because > > there will never be any HOT update chains so the indcreatexid will never > > be set. > > This is unclear and perhaps misleading. I think it needs to be more like "In > the case of a new table in which rows were inserted but none updated (such as > with pg_dump) the index will be usable because ..." Done. > > Also in the case of a read-committed transaction new queries will be able to > > use the index. A serializable transaction building an index on an existing > > table with HOT updates cannot not use the index. > > I don't think this is clear and I'm not sure it's right. > > Currently the transaction that actually did the CREATE INDEX has to follow the > same rules as other transactions. This means if there were any visible hot > updated tuples and the index is therefore marked with our xid in indcreatexid > we will *not* be able to use it in the same transaction as our xid is never in > our serializable snapshot. This is true even if we're not in serializable mode > as we cannot know what earlier snapshots are still in use and may be used with > the new plan. Updated paragraph: The CREATE INDEX transaction cannot not use the index if HOT rows exist in the table. Fortunately, many CREATE INDEX transactions use new tables in which rows are inserted but not updated (such as with pg_dump). Such tables have no HOT rows so the index will be usable because indcreatexid will never be set. > NOTE: This again is something likely to change. In many cases it ought to be > possible to have the transaction use the index it just built even if there > were visible HOT updated tuples in it. > > In particular in READ COMMITTED transactions which have no outstanding > commands using early snapshots then subsequent planning ought to be able to > use the index. Even if outstanding commands are using old snapshots if we can > be sure they can't use the new plan then it would still be safe to use the > index in the new plan. Also in SERIALIZABLE mode those same statements hold > for temporary tables. OK, we can update the README at that point. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq