Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_class attributes and race conditions ?)
Pavan Deolasee wrote: 2. Heikki suggested an approach where we add a byte to tuple header and track HOT-ness of different indexes. The idea looks good but had a downside of increasing tuple header and complexity. We would only need the extra byte in HOT-updated tuples. Alternatively, we could use the bits we have free in infomask2. There's currently 5 bits free, using just 2 or 3 of those would get us quite far. Or just one, which would be the Tom's suggestion of only using HOT for tables with a single index. Complexity is in the eye of the beholder. Chilling existing tuples isn't exactly trivial either, and neither is getting all the locking and waiting needed in the other proposals correct. The simplicity of the other proposals depend a lot on what kind of restrictions and changes to current semantics of CREATE INDEX [CONCURRENTLY] we accept. Which of the following restrictions are we OK with, if a table has HOT-updated tuples: 1. Throw an error 2. Require a vacuum after crash during CREATE INDEX 3. Do multiple heap-scan passes 4. Wait longer in CREATE INDEX CONCURRENTLY 5. Wait in CREATE INDEX, like we do in CREATE INDEX CONCURRENTLY 6. Lock the table exclusively 7. Disallow multiple CREATE INDEXes at the same time. I've lost track of which proposals lead to which restrictions. Maybe we should look at the restrictions first, and judge which ones are acceptable and which ones are not? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_class attributes and race conditions ?)
Heikki Linnakangas wrote: Pavan Deolasee wrote: 2. Heikki suggested an approach where we add a byte to tuple header and track HOT-ness of different indexes. The idea looks good but had a downside of increasing tuple header and complexity. We would only need the extra byte in HOT-updated tuples. Alternatively, we could use the bits we have free in infomask2. There's currently 5 bits free, using just 2 or 3 of those would get us quite far. Or just one, which would be the Tom's suggestion of only using HOT for tables with a single index. We've already used three of those, two for tracking HEAP_ONLY and HOT_UPDATED tuples and one for tracking fragmented tuple. Doing it for just one index seems too restrictive. Are we ok with adding another byte to the tuple header ? Complexity is in the eye of the beholder. Chilling existing tuples isn't exactly trivial either, and neither is getting all the locking and waiting needed in the other proposals correct. I agree. I am just worried about the short term and long term solution. Your proposal is certainly the better of all as it also gives us the ability to restrict bloats on a index whose key does not change during UPDATE. I would like to do something which is acceptable and is also feasible to complete by feature freeze. Do you want to give a shot to this approach while I try to build the ALTER TABLE and CHILL utilities ? The simplicity of the other proposals depend a lot on what kind of restrictions and changes to current semantics of CREATE INDEX [CONCURRENTLY] we accept. Which of the following restrictions are we OK with, if a table has HOT-updated tuples: 1. Throw an error 2. Require a vacuum after crash during CREATE INDEX 3. Do multiple heap-scan passes 4. Wait longer in CREATE INDEX CONCURRENTLY 5. Wait in CREATE INDEX, like we do in CREATE INDEX CONCURRENTLY 6. Lock the table exclusively 7. Disallow multiple CREATE INDEXes at the same time. I've lost track of which proposals lead to which restrictions. Maybe we should look at the restrictions first, and judge which ones are acceptable and which ones are not? This is a good summary. With the assumption that creating index is not very frequent operation, I would live with 1, 2, 3 and 4. But frankly I'm least knowledgable in this regard and would rely on others to decide. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_class attributes and race conditions ?)
Pavan Deolasee wrote: Heikki Linnakangas wrote: Pavan Deolasee wrote: We would only need the extra byte in HOT-updated tuples. Alternatively, we could use the bits we have free in infomask2. There's currently 5 bits free, using just 2 or 3 of those would get us quite far. Or just one, which would be the Tom's suggestion of only using HOT for tables with a single index. We've already used three of those, two for tracking HEAP_ONLY and HOT_UPDATED tuples and one for tracking fragmented tuple. HEAP_ONLY_TUPLE would go away in favor of the per-index bits. So we have bits available for three indexes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_class attributes and race conditions ?)
Pavan Deolasee [EMAIL PROTECTED] writes: While creating an index, if a HEAP_ONLY tuple is found, CREATE INDEX [CONCURRENTLY] fails with an error and the user needs to SET HOT OFF and then try again. While turning HOT off, the entire table is CHILLed, holding AccessExclusive lock on the table. Once the new index is created, user can turn HOT on again. It hardly seems acceptable to require exclusive lock to chill a table. In production situations, knowing that you'd have to do that to do index maintenance on a large table would probably scare you off of ever enabling the feature at all. Last year we were getting beaten up about how it wasn't acceptable for CREATE INDEX to lock out writes for a long time; how is it suddenly acceptable to need to lock out both reads and writes for a long time before you can even think about creating an index? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_class attributes and race conditions ?)
Tom Lane wrote: Pavan Deolasee [EMAIL PROTECTED] writes: While creating an index, if a HEAP_ONLY tuple is found, CREATE INDEX [CONCURRENTLY] fails with an error and the user needs to SET HOT OFF and then try again. While turning HOT off, the entire table is CHILLed, holding AccessExclusive lock on the table. Once the new index is created, user can turn HOT on again. It hardly seems acceptable to require exclusive lock to chill a table. In production situations, knowing that you'd have to do that to do index maintenance on a large table would probably scare you off of ever enabling the feature at all. Last year we were getting beaten up about how it wasn't acceptable for CREATE INDEX to lock out writes for a long time; how is it suddenly acceptable to need to lock out both reads and writes for a long time before you can even think about creating an index? Yeah, I agree. I was proposing this as a stop-gap solution though. Something which would help us solve the problem without changing the current behavior for non-HOT tables. So what do you suggest ? Do you feel that there is no way we can solve the problem ? ISTM that if we run CHILL as a seperate transaction (just like VACUUM), we should be able to CHILL the table with ShareUpdateExclusiveLock. Running it as a seperate transaction would reduce the risk of causing deadlocks. Is that a fair assessment ? If we need to CHILL with ShareUpdateExclusiveLock, IMHO we would again be back to something similar to the first approach. I know you said its fragile and full of race conditions, but do you think we can handle it better if we have a seperate DDL command, running within its own transaction ? The algorithm would look like: 1. Disable HOT-updates 2. CHILL the table by inserting appropriate index entries and marking tuples CHILL_IN_PROGRESS 3. Establish a point when there are no open index scans 4. Disable HOT-fetches 5. Scan the heap again, reset CHILL_IN_PROGRESS, HEAP_ONLY and HOT_UPDATED flags 6. Establish a point when there are no open index scans 7. Enable HOT-fetches 8. Enable HOT-updates I need help to do the step 1,3,4,6,7 and 8 (well all :-)) in a deadlock and race condition free mannner. Any suggestions ? Where do we keep the global state about HOT-updates/HOT-fetches ? If we keep it in pg_class, a crash of the CHILL command or the server may leave the pg_class row in a stale state. That does not look like a problem though. In the worst case, we might not be able to do HOT-updates without manual intervention. Again comments, suggestions ? I really appreciate everyone's time and patience. Help is what I need to solve this problem. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate