Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Mon, Mar 19, 2007 at 12:05:19PM +, Simon Riggs wrote: I was unwilling to compromise to have HOT if only one index existed, but IMHO allowing HOT with = 3 indexes is an acceptable compromise for this release. (We can always use vertical partitioning techniques to allow additional access paths to be added to the same table - I'd be very happy to document that with worked examples, if requried). I'm not sure where we're sitting with this, but I've got another idea I haven't seen (one that I think is better than an arbitrary limit on the number of indexes)... what if we just disallow non-concurrent index builds on hot tables? It sounds like the additional pain involved in chilling an entire table and keeping it chilled for the index build is even more overhead than just doing a concurrent index build. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
Jim C. Nasby wrote: On Mon, Mar 19, 2007 at 12:05:19PM +, Simon Riggs wrote: I was unwilling to compromise to have HOT if only one index existed, but IMHO allowing HOT with = 3 indexes is an acceptable compromise for this release. (We can always use vertical partitioning techniques to allow additional access paths to be added to the same table - I'd be very happy to document that with worked examples, if requried). I'm not sure where we're sitting with this, but I've got another idea I haven't seen (one that I think is better than an arbitrary limit on the number of indexes)... what if we just disallow non-concurrent index builds on hot tables? It sounds like the additional pain involved in chilling an entire table and keeping it chilled for the index build is even more overhead than just doing a concurrent index build. I thought about making it even simpler. Only allow CREATE INDEX builds on non HOT tables. However as I mentioned in another thread, this idea dies if you expect to be able to have HOT enabled by default in any future release. Chilling needs to be able to be done with a regular Vacuum style lock for chilling to be a usable reality. I'm sure there are use cases or this, but it seems unlikely that a high update table is going to have an index added to it. Am I a long way from reality when saying that? Regards Russell Smith ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
Simon Riggs wrote: We *must* make CREATE INDEX CONCURRENTLY work with HOT. The good news is I think we can without significant difficulty. Yeah, I think CREATE INDEX CONCURRENTLY is much easier to solve. Though I am not completely convinced that we can do that without much changes to CREATE INDEX CONCURRENTLY logic. For example, I believe we still need to lock out HOT-updates before we start CREATE INDEX CONCURRENTLY. Otherwise we might end up creating two paths to the same tuple in the new index. Say, we have a table with two columns (int a, int b). We have an index on 'a' and building another index on 'b'. We got a tuple (10, 20) in the heap. In the first phase of CREATE INDEX CONCURRENTLY, this tuple would be indexed. If the tuple is HOT-updated to (10, 30) before the first phase ends, the updated tuple would again get indexed in the second phase. This would lead to two paths to the latest visible tuple from the new index. Am I missing something in your design that stops this from happening ? Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Mon, 2007-03-19 at 10:51 +, Heikki Linnakangas wrote: Pavan Deolasee wrote: Heikki Linnakangas wrote: Pavan Deolasee wrote: We would only need the extra byte in HOT-updated tuples. Alternatively, we could use the bits we have free in infomask2. There's currently 5 bits free, using just 2 or 3 of those would get us quite far. Or just one, which would be the Tom's suggestion of only using HOT for tables with a single index. We've already used three of those, two for tracking HEAP_ONLY and HOT_UPDATED tuples and one for tracking fragmented tuple. HEAP_ONLY_TUPLE would go away in favor of the per-index bits. So we have bits available for three indexes. ISTM that we are getting very close to a great idea here. I was unwilling to compromise to have HOT if only one index existed, but IMHO allowing HOT with = 3 indexes is an acceptable compromise for this release. (We can always use vertical partitioning techniques to allow additional access paths to be added to the same table - I'd be very happy to document that with worked examples, if requried). I trust that we will think of ways of extending that limit in later releases. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
Ühel kenal päeval, E, 2007-03-19 kell 12:05, kirjutas Simon Riggs: On Mon, 2007-03-19 at 10:51 +, Heikki Linnakangas wrote: Pavan Deolasee wrote: Heikki Linnakangas wrote: Pavan Deolasee wrote: We would only need the extra byte in HOT-updated tuples. Alternatively, we could use the bits we have free in infomask2. There's currently 5 bits free, using just 2 or 3 of those would get us quite far. Or just one, which would be the Tom's suggestion of only using HOT for tables with a single index. We've already used three of those, two for tracking HEAP_ONLY and HOT_UPDATED tuples and one for tracking fragmented tuple. HEAP_ONLY_TUPLE would go away in favor of the per-index bits. So we have bits available for three indexes. But you probably have to do some kind of SUPERFULL VACUUM if you want to DROP and CREATE the third index. You will probably have to touch all tuples, regardless of weather they are live or not, or if will be moved or not, just to kclean ot bits for the just-deleted index. Maybe a CLUSTER would be an answer here. ISTM that we are getting very close to a great idea here. I was unwilling to compromise to have HOT if only one index existed, but IMHO allowing HOT with = 3 indexes is an acceptable compromise for this release. (We can always use vertical partitioning techniques to allow additional access paths to be added to the same table - I'd be very happy to document that with worked examples, if requried). Maybe using more than one TOAST table as means of vertical partitioning ? I trust that we will think of ways of extending that limit in later releases. -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On 3/17/07, Simon Riggs [EMAIL PROTECTED] wrote: I'm very comfortable with the idea that HOT can be turned on/off for a table. That gives us a workaround to bugs. Previously, changing things like WITHOUT OIDS was done over two releases, so I'd suggest the same thing here. Add the option now, disabled, then look to make it the default option in the next release. We can override that with the default_use_hot parameter for those that feel bold, at least initially. I know Bruce has been long opposed to the idea of a table-level switch, which is why we've been trying so hard to avoid it. So we should add his -1 to this idea from the start. Is fear of bugs a justification of guc setting? Or is there a trade-off involved with HOT? merlin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On 3/19/07, Pavan Deolasee [EMAIL PROTECTED] wrote: Yeah, I think CREATE INDEX CONCURRENTLY is much easier to solve. Though I am not completely convinced that we can do that without much changes to CREATE INDEX CONCURRENTLY logic. For example, I believe we still need to lock out HOT-updates before we start CREATE INDEX CONCURRENTLY. Otherwise we might end up creating two paths to the same tuple in the new index. Say, we have a table with two columns (int a, int b). We have an index on 'a' and building another index on 'b'. We got a tuple (10, 20) in the heap. In the first phase of CREATE INDEX CONCURRENTLY, this tuple would be indexed. If the tuple is HOT-updated to (10, 30) before the first phase ends, the updated tuple would again get indexed in the second phase. This would lead to two paths to the latest visible tuple from the new index. just a thought...can you disable HOT on the fly? why not disable hot updates completely during these types of operations?. merlin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Sat, 2007-03-17 at 12:43 +0530, Pavan Deolasee wrote: How do we move forward with the CREATE INDEX issue with HOT ? There are quite a few suggestions and objections. Can we please discuss and decide on the plan ? I am very comfortable with the current state of HOT, the results are encouraging and I hope this issue does not become a showstopper. My suggested plan would be: First, we should make CREATE INDEX CONCURRENTLY work, since we have a design to do that. Many other things are acceptable if that works fully and correctly with HOT, if it cannot, we have problems and that is a showstopper. Second, I can't put my finger on it exactly, but I think having a DDL command to chill a table, as Alvaro suggests, sounds like a very wise utility to include, whatever we do later. That does have the useful by-product of making CREATE INDEX work without change, but I would not want to rely on it in the longer term. While those are happening we can explore the other possibilities in more detail, but they do seem like safe bets to me. Here is what different people have suggested: 1. Simon and I posted couple of designs which include prior suggestions from Heikki, Tom, Hannu and others. I believe these designs are not very distinct and if done correctly would help us keep the current behavior of CREATE INDEX and CREATE INDEX CONCURRENTLY same. Agreed. Apologies for confusing things by posting the second design, that wasn't very helpful or polite. There are concerns about race conditions and deadlock issues though. What is a general feeling at this point ? Are these issues very hard to address ? At a high level, IMO we need some or all of these things to make either of these designs work: - ability to enable/disable HOT-updates on the table - ability to enable/disable HOT-fetches on the table - ability to wait for concurrent index scans to finish - ability to allow only one backend to CHILL the table That seems like a good summary of what we need. I would add only that these need not be explicit user-level commands, just internal capabilities. How hard would these be without causing any race conditions and deadlocks ? We should be able to take advantage of two things: - We only need to chill some of the tuples, in some cases. - Chilling only requires us to lock the indexes, which would be more practical if the index locking were more transitory. I note that IndexScans hold the locks they have on an index until transaction end, even after the IndexScans and BitmapIndexScans complete. ISTM we could relax that locking; that seems OK since we earlier discussed removing locks completely on indexes. The lock on the heap would not be upgraded or changed. 2. Heikki suggested an approach where we add a byte to tuple header and track HOT-ness of different indexes. The idea looks good but had a downside of increasing tuple header and complexity. I'm very comfortable with the idea that HOT can be turned on/off for a table. That gives us a workaround to bugs. Previously, changing things like WITHOUT OIDS was done over two releases, so I'd suggest the same thing here. Add the option now, disabled, then look to make it the default option in the next release. We can override that with the default_use_hot parameter for those that feel bold, at least initially. I know Bruce has been long opposed to the idea of a table-level switch, which is why we've been trying so hard to avoid it. So we should add his -1 to this idea from the start. Right now, ideas around (2) sound like too much complexity. Maybe there's a good idea there waiting to break out, so we should pursue that also - but I'm not sure you can wait for that to happen. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Sat, 2007-03-17 at 11:45 -0400, Tom Lane wrote: Pavan Deolasee [EMAIL PROTECTED] writes: While creating an index, if a HEAP_ONLY tuple is found, CREATE INDEX [CONCURRENTLY] fails with an error and the user needs to SET HOT OFF and then try again. While turning HOT off, the entire table is CHILLed, holding AccessExclusive lock on the table. Once the new index is created, user can turn HOT on again. It hardly seems acceptable to require exclusive lock to chill a table. In production situations, knowing that you'd have to do that to do index maintenance on a large table would probably scare you off of ever enabling the feature at all. Last year we were getting beaten up about how it wasn't acceptable for CREATE INDEX to lock out writes for a long time; how is it suddenly acceptable to need to lock out both reads and writes for a long time before you can even think about creating an index? I agree with you: It isn't acceptable for us to contemplate an AccessExclusiveLock before we can build any index. We *must* make CREATE INDEX CONCURRENTLY work with HOT. The good news is I think we can without significant difficulty. The problems are with CREATE INDEX, in some cases. I regret that I did not see those difficulties until recently, which is why I was concerned that we spent time on VACUUM FULL rather than this issue. I'm glad now that you both pressed ahead and solved that though. As a result of the issues, I think Pavan is playing safe, to make sure there is *an* option, so that we can build upwards from there. The proposal is pragmatism only, while we discuss other approaches. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
Simon Riggs wrote: As a result of the issues, I think Pavan is playing safe, to make sure there is *an* option, so that we can build upwards from there. The proposal is pragmatism only, while we discuss other approaches. Absolutely true. I agree that CHILLing the table with AccessExclusive lock is not a solution that I like. I was looking at it just as a stop-gap solution, given the time left for feature freeze. But if there is something better that we can do, I am for it. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Sat, 2007-03-17 at 23:11 +0530, Pavan Deolasee wrote: The algorithm would look like: 1. Disable HOT-updates The lock held by CREATE INDEX already prevents HOT updates. So steps 1 and 8 aren't needed. We need to be clear that we already have a solution to CREATE INDEX CONCURRENTLY. Do you agree that we do? Does anyone see a problem with the posted design for that? Hopefully it is only CREATE INDEX that we need to think about. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
Simon Riggs wrote: We need to be clear that we already have a solution to CREATE INDEX CONCURRENTLY. Do you agree that we do? Does anyone see a problem with the posted design for that? Hopefully it is only CREATE INDEX that we need to think about. I agree. Lets first decide whether its only CREATE INDEX that needs solution or its both. Lets also decide whether we want to fix CREATE INDEX [CONCURRENTLY] or we want to provide a seperate DLL to CHILL the table and then build index normally. Tom has already rejected the idea of holding exclusive lock while chilling, but if we get around that, is rest of the approach acceptable ? If its going to be a seperate DLL, lets decide whether its acceptable to run it as a seperate transaction. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings