Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)
On Mon, 2007-03-19 at 10:29 -0500, Merlin Moncure wrote: > On 3/17/07, Simon Riggs <[EMAIL PROTECTED]> wrote: > > I'm very comfortable with the idea that HOT can be turned on/off for a > > table. That gives us a workaround to bugs. Previously, changing things > > like WITHOUT OIDS was done over two releases, so I'd suggest the same > > thing here. Add the option now, disabled, then look to make it the > > default option in the next release. We can override that with the > > default_use_hot parameter for those that feel bold, at least initially. > > I know Bruce has been long opposed to the idea of a table-level switch, > > which is why we've been trying so hard to avoid it. So we should add his > > -1 to this idea from the start. > > Is fear of bugs a justification of guc setting? Probably not on its own, but the inspiration was that we currently have user-visible behaviour in the recent proposals, hence the GUC. > Or is there a trade-off involved with HOT? At the moment, there is no downside to HOT in normal operation that I'm aware of, but its a great question. The problem we have is with normal CREATE INDEX because there are two sources of race conditions that complicate this: concurrent index scans and crash safety. Currently there are no perfect solutions to this. We have two main options: 1. additional locking, either within CIDX or as a separate DDL 2. additional complexity and possible limitation in the number of indexes to just 3 before we stop doing HOT updates. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)
Hannu Krosing wrote: Ühel kenal päeval, E, 2007-03-19 kell 12:05, kirjutas Simon Riggs: On Mon, 2007-03-19 at 10:51 +, Heikki Linnakangas wrote: Pavan Deolasee wrote: We've already used three of those, two for tracking HEAP_ONLY and HOT_UPDATED tuples and one for tracking fragmented tuple. HEAP_ONLY_TUPLE would go away in favor of the per-index bits. So we have bits available for three indexes. But you probably have to do some kind of SUPERFULL VACUUM if you want to DROP and CREATE the third index. You will probably have to touch all tuples, regardless of weather they are live or not, or if will be moved or not, just to kclean ot bits for the just-deleted index. DROP INDEX wouldn't do anything extra. CREATE INDEX would have to clear the bit assigned to the new index, which would mean dirtying every heap page in the worst case. As a further optimization, CREATE INDEX could skip index inserts for HOT-updated tuples, if the key for new index wasn't changed, and leave the flag set. Vacuum isn't needed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: CREATE INDEX and HOT (was [HACKERS]Question:pg_classattributes and race conditions ?)
On Mon, 2007-03-19 at 16:06 +0530, Pavan Deolasee wrote: > Simon Riggs wrote: > > > > This problem is solved by moving the wait (for all transactions in > > reference snapshot to finish) so that it is now between the first and > > second scans, as described. > > > > During the second Vscan we would prune each block, so the only remaining > > tuple in the block when the second scan sees it would be (10,30) and it > > would no longer be a HOT tuple - the index would have a pointer to it, > > so no new index pointer would be added. The pointer to (10,30) is the > > same pointer that was added in the first phase for the tuple (10,20). > > > > The problem is that in the first phase, the pointer was inserted > with key=20 whereas now its changed to 30. So we need to delete the old > index entry and add a new one. So don't index HOT tuples in the first phase, wait until the second. That should be just a single if() test in IndexBuildHeapScan(). -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)
Simon Riggs wrote: > > This problem is solved by moving the wait (for all transactions in > reference snapshot to finish) so that it is now between the first and > second scans, as described. > > During the second Vscan we would prune each block, so the only remaining > tuple in the block when the second scan sees it would be (10,30) and it > would no longer be a HOT tuple - the index would have a pointer to it, > so no new index pointer would be added. The pointer to (10,30) is the > same pointer that was added in the first phase for the tuple (10,20). > The problem is that in the first phase, the pointer was inserted with key=20 whereas now its changed to 30. So we need to delete the old index entry and add a new one. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: CREATE INDEX and HOT (was [HACKERS]Question:pg_classattributes and race conditions ?)
On Mon, 2007-03-19 at 09:28 +, Simon Riggs wrote: > On Mon, 2007-03-19 at 14:33 +0530, Pavan Deolasee wrote: > > Simon Riggs wrote: > > > > > > > > > We *must* make CREATE INDEX CONCURRENTLY work with HOT. The good news is > > > I think we can without significant difficulty. > > > > > > > Yeah, I think CREATE INDEX CONCURRENTLY is much easier to solve. Though > > I am not completely convinced that we can do that without much changes > > to CREATE INDEX CONCURRENTLY logic. For example, I believe we still > > need to lock out HOT-updates before we start CREATE INDEX CONCURRENTLY. > > Otherwise we might end up creating two paths to the same tuple in > > the new index. > > > > Say, we have a table with two columns (int a, int b). We have an > > index on 'a' and building another index on 'b'. We got a tuple > > (10, 20) in the heap. In the first phase of CREATE INDEX CONCURRENTLY, > > this tuple would be indexed. If the tuple is HOT-updated to (10, 30) > > before the first phase ends, the updated tuple would again get > > indexed in the second phase. This would lead to two paths to the > > latest visible tuple from the new index. > > > > Am I missing something in your design that stops this from > > happening ? > > This problem is solved by moving the wait (for all transactions in > reference snapshot to finish) so that it is now between the first and > second scans, as described. > > During the second scan we would prune each block, so the only remaining > tuple in the block when the second scan sees it would be (10,30) and it > would no longer be a HOT tuple - the index would have a pointer to it, > so no new index pointer would be added. The pointer to (10,30) is the > same pointer that was added in the first phase for the tuple (10,20). > > The wait and subsequent prune ensures that all HOT tuples are now the > root of their HOT chain. The index created in the fist phase ensures > that the HOT chains are never added to. AFAICS this is all you need to make CREATE INDEX CONCURRENTLY work with HOT, which is even simpler than my original post. [This presumes that we do pruning automatically on a heap scan, not sure what the current state of that is, but it could be a scan option]. Index: src/backend/commands/indexcmds.c === RCS file: /projects/cvsroot/pgsql/src/backend/commands/indexcmds.c,v retrieving revision 1.157 diff -c -r1.157 indexcmds.c *** src/backend/commands/indexcmds.c13 Mar 2007 00:33:39 - 1.157 --- src/backend/commands/indexcmds.c19 Mar 2007 09:49:56 - *** *** 497,507 ActiveSnapshot = snapshot; /* -* Scan the index and the heap, insert any missing index entries. -*/ - validate_index(relationId, indexRelationId, snapshot); - - /* * The index is now valid in the sense that it contains all currently * interesting tuples. But since it might not contain tuples deleted just * before the reference snap was taken, we have to wait out any --- 497,502 *** *** 514,519 --- 509,519 for (ixcnt = 0; ixcnt < snapshot->xcnt; ixcnt++) XactLockTableWait(snapshot->xip[ixcnt]); + /* +* Scan the index and the heap, insert any missing index entries. +*/ + validate_index(relationId, indexRelationId, snapshot); + /* Index can now be marked valid -- update its pg_index entry */ pg_index = heap_open(IndexRelationId, RowExclusiveLock); -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)
On Mon, 2007-03-19 at 14:33 +0530, Pavan Deolasee wrote: > Simon Riggs wrote: > > > > > > We *must* make CREATE INDEX CONCURRENTLY work with HOT. The good news is > > I think we can without significant difficulty. > > > > Yeah, I think CREATE INDEX CONCURRENTLY is much easier to solve. Though > I am not completely convinced that we can do that without much changes > to CREATE INDEX CONCURRENTLY logic. For example, I believe we still > need to lock out HOT-updates before we start CREATE INDEX CONCURRENTLY. > Otherwise we might end up creating two paths to the same tuple in > the new index. > > Say, we have a table with two columns (int a, int b). We have an > index on 'a' and building another index on 'b'. We got a tuple > (10, 20) in the heap. In the first phase of CREATE INDEX CONCURRENTLY, > this tuple would be indexed. If the tuple is HOT-updated to (10, 30) > before the first phase ends, the updated tuple would again get > indexed in the second phase. This would lead to two paths to the > latest visible tuple from the new index. > > Am I missing something in your design that stops this from > happening ? This problem is solved by moving the wait (for all transactions in reference snapshot to finish) so that it is now between the first and second scans, as described. During the second scan we would prune each block, so the only remaining tuple in the block when the second scan sees it would be (10,30) and it would no longer be a HOT tuple - the index would have a pointer to it, so no new index pointer would be added. The pointer to (10,30) is the same pointer that was added in the first phase for the tuple (10,20). The wait and subsequent prune ensures that all HOT tuples are now the root of their HOT chain. The index created in the fist phase ensures that the HOT chains are never added to. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)
On Sun, 2007-03-18 at 00:44 +0530, Pavan Deolasee wrote: > Simon Riggs wrote: > > > > We need to be clear that we already have a solution to CREATE INDEX > > CONCURRENTLY. Do you agree that we do? Does anyone see a problem with > > the posted design for that? If we have solved CREATE INDEX CONCURRENTLY, then I would propose that this becomes the default option for creating an index, when the statement is issued outside of a statement block. That seems better than reminding everybody to run with the CONCURRENTLY option, or advise them of different performance characteristics or behaviour of the normal CREATE INDEX route. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly