Re: [HACKERS] string_to_array eats too much memory?
Tom Lane writes: Tatsuo Ishii [EMAIL PROTECTED] writes: string_to_array() consumes too much memory. For example, to make ~70k array elements, string_to_array seems to eat several gigabytes of memory. I'd argue that the problem comes from enlarging the work arrays only 64 elements at a time in accumArrayResult(). Most of the rest of the code deals with resizing arrays using a double-it-each-time-it-has-to-grow approach; I wonder why this is different? Without reading the code, I guess that simply means O(n^2) runtime. This should be fixed, then, right? Best Regards, Michael Paesold ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
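To make the O(n^2) point concrete, here is a toy model (not PostgreSQL's actual accumArrayResult() code) counting how many elements get copied under each growth policy when every reallocation copies the old contents:

```python
def elements_copied(n, grow):
    """Count elements copied while appending n items to an array that is
    reallocated (and its old contents copied) each time it fills up."""
    cap = copied = 0
    for i in range(n):
        if i == cap:        # array full: reallocate, copying i elements
            copied += i
            cap = grow(cap)
    return copied

# Fixed increment of 64 slots, as in accumArrayResult():
fixed = elements_copied(70000, lambda c: c + 64)
# Doubling, as most of the rest of the code does:
doubled = elements_copied(70000, lambda c: max(64, c * 2))
print(fixed, doubled)
```

For ~70k elements the fixed-increment policy copies tens of millions of elements versus roughly 131k for doubling, so the quadratic copy traffic (and the allocator churn it implies) is the plausible culprit, not the size of the final array.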
Re: [HACKERS] Introducing an advanced Frequent Update Optimization
On Mon, Nov 06, 2006 at 09:50:53PM +, Simon Riggs wrote: - There are specific issues with the optimizer's ability to understand dead row numbers, which can in some cases lead to SeqScan plans that are inappropriate when tables grow because of updates. This is a red herring that can lead to people thinking the situation is worse than it is; that needs fixing, but the core issues mentioned above remain. I don't disagree with much of what you say, but I'm slightly concerned about the wave-away answer that you give here. In my experience on high-update tables -- especially ones with few rows, but lots of open transactions over the lifetime of the row -- an accurate understanding of dead rows would be a _dramatic_ improvement (perhaps at least as significant as the improvement being discussed). That said, I'm not opposed to the line you're taking. I just don't want this problem to sink forever, because it's a big problem. A -- Andrew Sullivan | [EMAIL PROTECTED]
Re: [HACKERS] 8.2 Beta 3 Now Available for Download / Testing ...
Heikki Linnakangas wrote: Andrew Dunstan wrote: Marc G. Fournier wrote: As a result of there being two *known* outstanding bugs, we have just bundled up a Beta3, to allow for testing of the recent patch concerning WAL replay ... What are the bugs? AFAIK: 1. Tuple freezing and hint bits were not WAL-logged, which could lead to data corruption if the database crashed after VACUUM. See recent thread titled WAL logging freezing. 2. B-tree page deletion bug, which could lead to VACUUM erroring with a failed to re-find parent key message. See thread Nasty btree deletion bug: http://archives.postgresql.org/pgsql-hackers/2006-10/msg01373.php I assume that we also want to fix the plperl/plperlu interaction issue, especially as it will result in changed behaviour on some platforms, at least. I have asked a couple of people to look over and test the patch I posted on Sunday, but I have not had any reaction so far and feel slightly nervous about applying it. I will make some time to do more testing myself this weekend. cheers andrew
Re: [HACKERS] Introducing an advanced Frequent Update Optimization
On Thu, 2006-11-09 at 04:09 -0500, Andrew Sullivan wrote: On Mon, Nov 06, 2006 at 09:50:53PM +, Simon Riggs wrote: - There are specific issues with the optimizer's ability to understand dead row numbers, which can in some cases lead to SeqScan plans that are inappropriate when tables grow because of updates. This is a red herring that can lead to people thinking the situation is worse than it is; that needs fixing, but the core issues mentioned above remain. I don't disagree with much of what you say, but I'm slightly concerned about the wave-away answer that you give here. In my experience on high-update tables -- especially ones with few rows, but lots of open transactions over the lifetime of the row -- an accurate understanding of dead rows would be a _dramatic_ improvement (perhaps at least as significant as the improvement being discussed). Understood. I just wanted to make sure people understand that the underlying problem would still be there even if we fix that. That said, I'm not opposed to the line you're taking. I just don't want this problem to sink forever, because it's a big problem. So, yeah, we should still fix that. The current prototype has a different cost model for SeqScans as a result. Summary arriving anytime now. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
[HACKERS] Frequent Update Project: Design Overview of HOT Updates
Design Overview of HOT Updates -- The objective is to increase the speed of the UPDATE case, while minimizing the overall negative effects of the UPDATE. We refer to the general requirement as *Frequent Update Optimization*, though this design proposal is for Heap Overflow Tuple (HOT) Updates. It is similar in some ways to the design for SITC already proposed, though it has a number of additional features drawn from other designs to make it a practical and effective implementation. EnterpriseDB has a working, performant prototype of this design. There are still a number of issues to resolve and the intention is to follow an open community process to find the best way forward. All required detail will be provided for the work conducted so far. Current PGSQL behaviour is for UPDATEs to create a new tuple version within the heap, so it acts from many perspectives as if it were an INSERT. All of the tuple versions are chained together, so that whichever of the tuples is visible to your Snapshot, you can walk the chain to find the most recent tuple version to update. The HOT proposal splits the heap into two areas: the main heap and an overflow relation, both of which are regular, transactional relations. INSERT and DELETE affect only the main heap, exactly as they do now. UPDATE places new tuple versions into the overflow relation in most cases, maintaining all of the current capabilities of the MVCC model: all tuple versions remain accessible, plus the chain of updated tuple versions is maintained. UPDATEs do not insert into indexes at all - no index pointers exist into the overflow relation. So during a stream of UPDATEs the main heap and indexes stay the same size, while the indexes are used read-only, avoiding additional database writes and the need for eventual index rebuilding. The current tuple version within the main heap is referred to as the root tuple from now on.
When reading the main heap, if the root tuple is not visible we walk the chain into the overflow relation until we find a tuple version that is visible - we follow the t_ctid pointer from tuple to tuple, testing for MVCC visibility as we go. As more UPDATEs take place these tuple chains would grow, making locating the latest tuple take progressively longer. Intuitively this sounds like a bad design, though an additional feature turns that from bad to good performance: - when anybody walks the chain into the overflow relation, if we find that the root tuple is vacuumable we copy back the first visible tuple version over the now-dead root tuple. This allows the length of a typical tuple chain to be extremely short in practice. For a single connection issuing a stream of UPDATEs the chain length will be no more than 1 at any time. Even under heavy concurrent UPDATEs the modal chain length remains 1, with the chain length varying in an approximately Poisson distribution. The overflow relation is similar in concept to a TOAST table, so we might describe this approach as TOAST-for-updated-versions. The code implementation is similar also, with reasonably similar modularity. HOT does sound radical, but no more so than TOAST was when first discussed. We can locate any tuple version and thereby allow MVCC to work correctly, while at the same time preserving the crucial property of the Postgres non-overwriting storage manager. This works very well for indexed tuple access since the index and main heap never grow, thus maintaining access speeds. HOT supports both Read Committed and Serializable transaction isolation levels, with transactions of any length, just as with current MVCC. Performance results against the previously described test cases are approximately: 1. pgbench (TPC-B) - Around 200-300% better 2. DBT-2 (TPC-C) - Significant improvement - possibly 10% improvement, somewhat difficult to measure exactly because of the way the test itself works. 3.
truckin - Around 500% improvement The performance gain remains better than REL8_2 base even in the presence of longer running transactions. [There is also considerable synergy with changes to b-tree indexes that are both useful in their own right, and even better when HOT is added; more on that on a separate thread.] We expect that people will want to measure this for themselves, so we don't publish all of the detailed internal tests here since YMMV. Using HOT - HOT can only work in cases where an UPDATE does not modify one of the columns defined in an index on the table, and when it does not alter the row length of the tuple. [We'll be able to do that more efficiently when we have plan invalidation] If we perform an update that meets the HOT criteria then we put the new version into the overflow relation; we describe this as a HOT UPDATE. If we perform an update that does not meet the criteria, then we carry on with the existing/old MVCC behaviour; we describe this as a non-HOT UPDATE. So with HOT we must support both HOT and non-HOT UPDATEs.
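The walk-and-copy-back behaviour described above can be sketched roughly as follows (a deliberately simplified toy model with hypothetical names; the real prototype is C and must deal with buffer locking and commit status, which this ignores):

```python
# Each tuple version records its creating transaction (xmin) and a
# t_ctid-style link (nxt) to its successor version, if any.
class Version:
    def __init__(self, xmin, value, nxt=None):
        self.xmin, self.value, self.nxt = xmin, value, nxt

def visible(v, snap):
    # Toy MVCC rule: visible if created at-or-before the snapshot and
    # not superseded by a version the snapshot can also see.
    return v.xmin <= snap and (v.nxt is None or v.nxt.xmin > snap)

def fetch(root, snap, oldest_snap):
    """Walk from the root tuple into the overflow chain until a visible
    version is found; if the root is dead to every open snapshot
    (vacuumable), copy the visible version back over the root."""
    v = root
    while not visible(v, snap):
        v = v.nxt                      # follow the t_ctid pointer
    if root.nxt is not None and root.nxt.xmin <= oldest_snap:
        # copy-back: overwrite the dead root in place, shortening the chain
        root.xmin, root.value, root.nxt = v.xmin, v.value, v.nxt
    return v.value
```

With a single updater and no old snapshots, each fetch collapses the chain back toward length 1, which is the behaviour the design relies on.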
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
Nice idea, just one question: On Thu, Nov 09, 2006 at 05:13:16PM +, Simon Riggs wrote: Behavioural Characteristics --- With HOT, it is easily possible that the chain of prior versions spans many blocks. The chain always starts with the block of the root tuple but possibly includes many overflow relation blocks. A SeqScan of a HOT table will turn into a large number of random accesses to the overflow relation, which could be considerably more expensive than sequential I/O. Clearly very large tables would not be able to be HOT enabled without additional cost, so we make HOT a table-level option: WITH (freqUPDATE = on) [discuss...] It seems to me that bitmap index scans will get these same characteristics also, right? The bitmap scan will have to follow the chain of any possibly matching tuple in any of the blocks that are in the bitmap, right? Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
On Thu, 2006-11-09 at 18:49 +0100, Martijn van Oosterhout wrote: Nice idea, just one question: It seems to me that bitmap index scans will get these same characteristics also, right? The bitmap scan will have to follow the chain of any possibly matching tuple in any of the blocks that are in the bitmap, right? Yes, they would identify the root tuples. The whole chain has matching index values, by definition, so the re-evaluation for lossy bitmaps will work just the same before the actual tuple is retrieved by walking the chain (if required). -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Introducing an advanced Frequent Update Optimization
On Tue, 2006-11-07 at 15:00 +1300, Mark Kirkwood wrote: Simon Riggs wrote: EnterpriseDB has been running a research project to improve the performance of heavily updated tables. We have a number of approaches prototyped and we'd like to discuss the best of these now on -hackers for community input and patch submission to PostgreSQL core. Excellent! It would certainly be good for use cases like: - session data from web (or similar) applications - real time summary tables maintained by triggers to just work, as (certainly in the case of the first one) quite a few folks have been bitten by exactly the issue you describe. ...and of course it would be good if LISTEN/NOTIFY were able to use this concept too, to help Slony along. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
Simon, If we perform an update that meets the HOT criteria then we put the new version into the overflow relation; we describe this as a HOT UPDATE. If we perform an update that does not meet the criteria, then we carry on with the existing/old MVCC behaviour; we describe this as a non-HOT UPDATE. Making the essential performance analysis question, Am I HOT or Not? Sorry, but someone had to say it. ;-) -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
Re: [HACKERS] Introducing an advanced Frequent Update Optimization
Quoth [EMAIL PROTECTED] (Simon Riggs): On Tue, 2006-11-07 at 15:00 +1300, Mark Kirkwood wrote: Simon Riggs wrote: EnterpriseDB has been running a research project to improve the performance of heavily updated tables. We have a number of approaches prototyped and we'd like to discuss the best of these now on -hackers for community input and patch submission to PostgreSQL core. Excellent! It would certainly be good for use cases like: - session data from web (or similar) applications - real time summary tables maintained by triggers to just work, as (certainly in the case of the first one) quite a few folks have been bitten by exactly the issue you describe. ...and of course it would be good if LISTEN/NOTIFY were able to use this concept also, to help Slony along also. That should be much less relevant as people migrate to version 1.2, as 1.2 uses LISTEN/NOTIFY a whole lot less than earlier versions: 1. Number of events generated is cut in 1/2 because we don't generate NOTIFIES for confirmations anymore 2. When a thread is busy, it shuts off LISTEN, and polls. That will cut pg_listener bloat further... -- let name=cbbrowne and tld=gmail.com in String.concat @ [name;tld];; http://cbbrowne.com/info/internet.html
Re: [HACKERS] plperl/plperlu interaction
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Anyway, it is probably not expected by many users that loading a module in plperlu makes it available to plperl - I was slightly surprised myself to see it work and I am probably more aware than most of perl and plperl subtleties. I think that is a bug and needs to be fixed. We have the precedent of pltcl, which uses separate interpreters for pltcl and pltclu for exactly this reason. If this is fixed, what becomes the mechanism for an administrator to make a perl module available to plperl functions? I didn't see any other way to do this documented. Thanks, mark
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
On Thu, 2006-11-09 at 13:21 -0800, Josh Berkus wrote: Simon, If we perform an update that meets the HOT criteria then we put the new version into the overflow relation; we describe this as a HOT UPDATE. If we perform an update that does not meet the criteria, then we carry on with the existing/old MVCC behaviour; we describe this as a non-HOT UPDATE. Making the essential performance analysis question, Am I HOT or Not? Very good. ;-) Well, we had Overflow Update CHaining as an alternative name... :-) The naming sounds silly, but we had a few alternate designs, so we needed to be able to tell them apart sensibly. We've had TVR, SITC, UIP and now HOT. Software research... -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Introducing an advanced Frequent Update Optimization
Simon Riggs [EMAIL PROTECTED] writes: ...and of course it would be good if LISTEN/NOTIFY were able to use this concept also, to help Slony along also. LISTEN/NOTIFY are overdue to be rewritten to not use a table at all, so I'm not particularly worried about whether this idea is applicable to them. regards, tom lane
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
Simon Riggs [EMAIL PROTECTED] writes: As more UPDATEs take place these tuple chains would grow, making locating the latest tuple take progressively longer. This is the part that bothers me --- particularly the random-access nature of the search. I wonder whether you couldn't do something involving an initial table fill-factor of less than 50%, and having the first updated version living on the same heap page as its parent. Only when the active chain length was more than one (which you hypothesize is rare) would it actually be necessary to do a random access into the overflow table. More generally, do we need an overflow table at all, rather than having these overflow tuples living in the same file as the root tuples? As long as there's a bit available to mark a tuple as being this special not-separately-indexed type, you don't need a special location to know what it is. This might break down in the presence of seqscans though. Actually, you omitted to mention the locking aspects of moving tuples around --- exactly how are you going to make that work without breaking concurrent scans? This allows the length of a typical tuple chain to be extremely short in practice. For a single connection issuing a stream of UPDATEs the chain length will be no more than 1 at any time. Only if there are no other transactions being held open, which makes this claim a lot weaker. HOT can only work in cases where a tuple does not modify one of the columns defined in an index on the table, and when we do not alter the row length of the tuple. Seems like altering the row length isn't the issue, it's just is there room on the page for the new version. Again, a generous fillfactor would give you more flexibility. [We'll be able to do that more efficiently when we have plan invalidation] Uh, what's that got to do with it? regards, tom lane
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
Tom, Actually, you omitted to mention the locking aspects of moving tuples around --- exactly how are you going to make that work without breaking concurrent scans? I believe that's the unsolved technical issue in the prototype, unless Pavan has solved it in the last two weeks. Pavan? -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
[HACKERS] Various breakages in new contrib/isn module
It occurred to me to run the regression tests type_sanity and opr_sanity over the contrib/isn code. (The easy way to do this is to copy the isn install script into the regress/sql directory and then add it to serial_schedule just before those two tests.) It turned up a couple of moderately serious problems: * There are a whole bunch of shell operators created; try select oid::regoperator from pg_operator where oprcode = 0; after loading isn. I didn't track it down in detail, but it looked like most or all of these come from dangling oprcom links, ie, there's an operator that claims to have a commutator but you never supplied one. This is very bad, because the planner *will* try to use those operators given the right kind of query. * There are hash opclasses for these datatypes but the corresponding equality operators are not marked hashable. This is not quite as bad, but should be fixed. Please submit a patch that fixes these. Note to hackers: it might be worth trying the same thing with the other contrib modules; I don't have time right now though. regards, tom lane
Re: [HACKERS] plperl/plperlu interaction
Mark Dilger wrote: Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Anyway, it is probably not expected by many users that loading a module in plperlu makes it available to plperl - I was slightly surprised myself to see it work and I am probably more aware than most of perl and plperl subtleties. I think that is a bug and needs to be fixed. We have the precedent of pltcl, which uses separate interpreters for pltcl and pltclu for exactly this reason. If this is fixed, what becomes the mechanism for an administrator to make a perl module available to plperl functions? I didn't see any other way to do this documented. Thanks, This isn't documented either :-) I discovered this when I was working on a way of doing this nicely and safely. I hope to have that for 8.3. cheers andrew
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
On 11/10/06, Josh Berkus josh@agliodbs.com wrote: Tom, Actually, you omitted to mention the locking aspects of moving tuples around --- exactly how are you going to make that work without breaking concurrent scans? I believe that's the unsolved technical issue in the prototype, unless Pavan has solved it in the last two weeks. Pavan? When an overflow tuple is copied back to the main heap, the overflow tuple is marked with a special flag (HEAP_OVERFLOW_MOVEDBACK). Subsequently, when a backend tries to lock the overflow version of the tuple, it checks the flag and jumps to the main heap if the flag is set. We use the same technique to update the correct version of a tuple when a tuple is moved back to the main heap. Regards, Pavan
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
Pavan Deolasee [EMAIL PROTECTED] writes: On 11/10/06, Josh Berkus josh@agliodbs.com wrote: I believe that's the unsolved technical issue in the prototype, unless Pavan has solved it in the last two weeks. Pavan? When an overflow tuple is copied back to the main heap, the overflow tuple is marked with a special flag (HEAP_OVERFLOW_MOVEDBACK). Subsequently, when a backend tries to lock the overflow version of the tuple, it checks the flag and jumps to the main heap if the flag is set. (1) How does it jump to the main heap? The links go the other direction. (2) Isn't this full of race conditions? (3) I thought you already used up the one remaining t_infomask bit. regards, tom lane
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
On 11/10/06, Tom Lane [EMAIL PROTECTED] wrote: Pavan Deolasee [EMAIL PROTECTED] writes: On 11/10/06, Josh Berkus josh@agliodbs.com wrote: I believe that's the unsolved technical issue in the prototype, unless Pavan has solved it in the last two weeks. Pavan? When an overflow tuple is copied back to the main heap, the overflow tuple is marked with a special flag (HEAP_OVERFLOW_MOVEDBACK). Subsequently, when a backend tries to lock the overflow version of the tuple, it checks the flag and jumps to the main heap if the flag is set. (1) How does it jump to the main heap? The links go the other direction. The overflow tuple has a special header to store the back pointer to the main heap. This increases the tuple header size by 6 bytes, but the overhead is restricted only to the overflow tuples. (2) Isn't this full of race conditions? I agree, there could be race conditions. But IMO we can handle those. When we follow the tuple chain, we hold a SHARE lock on the main heap buffer. Also, when the root tuple is vacuumable and needs to be overwritten, we acquire and keep an EXCLUSIVE lock on the main heap buffer. This reduces the race conditions to a great extent. (3) I thought you already used up the one remaining t_infomask bit. Yes. The last bit in the t_infomask is used up to mark the presence of the overflow tuple header. But I believe there are a few more bits that can be reused. There are three bits available in the t_ctid field as well (since ip_posid needs a maximum of 13 bits). One bit is used to identify whether a given tid points to the main heap or the overflow heap. This helps when tids are passed around in the code. Since the back pointer from the overflow tuple always points to the main heap, the same bit can be used to mark copied-back tuples (we are doing it in a slightly different way in the current prototype though). Regards, Pavan
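Pavan's bit-arithmetic claim can be illustrated like so (illustrative Python only; the flag names here are hypothetical, and as Tom notes in his reply, ip_posid cannot be assumed to fit in 13 bits once BLCKSZ exceeds 8K):

```python
# Toy sketch of stealing the high bits of a 16-bit offset-number field
# for flags; valid only if real offsets never need more than 13 bits.
OFFSET_BITS = 13
OFFSET_MASK = (1 << OFFSET_BITS) - 1
FLAG_OVERFLOW_HEAP = 1 << 15  # hypothetical: tid points into overflow heap
FLAG_MOVED_BACK = 1 << 14     # hypothetical: version copied back to main heap

def pack_posid(posid, flags=0):
    assert posid == posid & OFFSET_MASK, "offset needs more than 13 bits"
    return posid | flags

def unpack_posid(word):
    # Split the 16-bit word back into the offset and the stolen flag bits.
    return word & OFFSET_MASK, word & ~OFFSET_MASK
```

The fragility Tom points out is exactly the assert above: with larger block sizes the offset can legitimately collide with the stolen bits.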
Re: [HACKERS] Frequent Update Project: Design Overview of HOT
On Thu, 2006-11-09 at 18:28, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: As more UPDATEs take place these tuple chains would grow, making locating the latest tuple take progressively longer. This is the part that bothers me --- particularly the random-access nature of the search. I wonder whether you couldn't do something involving an initial table fill-factor of less than 50%, and having the first updated version living on the same heap page as its parent. Only when the active chain length was more than one (which you hypothesize is rare) would it actually be necessary to do a random access into the overflow table. More generally, do we need an overflow table at all, rather than having these overflow tuples living in the same file as the root tuples? As long as there's a bit available to mark a tuple as being this special not-separately-indexed type, you don't need a special location to know what it is. This might break down in the presence of seqscans though. And why do you need to mark it as not-separately-indexed at all? We already cope with missing index pointers in VACUUM and I can't see any other reason to have it. What are the advantages of HOT over SITC (other than the cool name)? Maybe just make HOT an extended SITC which can span pages. In case of HOT together with reusing index tuples with the DELETED bit set we don't actually need copyback, but the same index pointer will follow the head of live data automatically, maybe lagging only a small number of versions. Actually, you omitted to mention the locking aspects of moving tuples around --- exactly how are you going to make that work without breaking concurrent scans? This allows the length of a typical tuple chain to be extremely short in practice. For a single connection issuing a stream of UPDATEs the chain length will be no more than 1 at any time. Only if there are no other transactions being held open, which makes this claim a lot weaker.
HOT can only work in cases where a tuple does not modify one of the columns defined in an index on the table, and when we do not alter the row length of the tuple. Seems like altering the row length isn't the issue, it's just "is there room on the page for the new version". Again, a generous fillfactor would give you more flexibility. Maybe they hoped to take very light locks when the new chain head is copied over the old one in the same-length case. -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com NOTICE: This communication contains privileged or other confidential information. If you have received it in error, please advise the sender by reply email and immediately delete the message and any attachments without copying or disclosing the contents.
[HACKERS] beta3 CFLAGS issue on openbsd
I was trying to compile 8.2beta3 on openbsd, and ran into an interesting issue. My account on the particular openbsd box has some restrictive ulimit settings, so I don't have a lot of memory to work with. I was getting an out of memory issue linking postgres, while I did not before. I figured out that the -g flag was being surreptitiously added to my CFLAGS. It was like pulling teeth trying to get the -g flag out. I tried --disable-debug to configure, which did not work. I had to do CFLAGS=-O2 ./configure ... Is this a known feature in the betas to get people running with -g in case things break, or is this a configure bug, or expected? Here is the first bit from configure, note the -g in the using CFLAGS line at the end. [EMAIL PROTECTED](~/build/postgres/postgresql-8.2beta3)$ ./configure --prefix=/home/jeremyd/progs/pg82 --with-perl --with-openssl --with-pgport=54322 checking build system type... x86_64-unknown-openbsd3.9 checking host system type... x86_64-unknown-openbsd3.9 checking which template to use... openbsd checking whether to build with 64-bit integer date/time support... no checking whether NLS is wanted... no checking for default port number... 54322 checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether cc accepts -g... yes checking for cc option to accept ANSI C... none needed checking if cc supports -Wdeclaration-after-statement... no checking if cc supports -Wendif-labels... yes checking if cc supports -fno-strict-aliasing... 
yes configure: using CFLAGS=-O2 -g -pipe -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wendif-labels -fno-strict-aliasing
Re: [HACKERS] Frequent Update Project: Design Overview of HOT Updates
Pavan Deolasee [EMAIL PROTECTED] writes: On 11/10/06, Tom Lane [EMAIL PROTECTED] wrote: (2) Isn't this full of race conditions? I agree, there could be race conditions. But IMO we can handle those. Doubtless you can prevent races by introducing a bunch of additional locking. The question was really directed to how much concurrent performance is left, once you get done locking it down. (3) I thought you already used up the one remaining t_infomask bit. Yes. The last bit in the t_infomask is used up to mark presence of overflow tuple header. But I believe there are few more bits that can be reused. There are three bits available in the t_ctid field as well (since ip_posid needs maximum 13 bits). No, you cannot have those bits --- BLCKSZ is not limited to 8K, and even if it were, we will not hold still for sticking flag bits into an unrelated datatype. You can probably fix this by inventing multiple context-dependent interpretations of t_infomask bits, but that strikes me as a mighty ugly and fragile way to proceed. (Actually, the assumption that you can throw an additional back-pointer into overflow tuple headers is the worst feature of this proposal in that regard --- it's really not that easy to support multiple header formats.) regards, tom lane
Re: [HACKERS] Frequent Update Project: Design Overview of HOT
On Fri, 2006-11-10 at 09:06, Hannu Krosing wrote:
> On Thu, 2006-11-09 at 18:28, Tom Lane wrote:
>> Simon Riggs [EMAIL PROTECTED] writes:
>>> As more UPDATEs take place these tuple chains would grow, making locating the latest tuple take progressively longer.
>>
>> This is the part that bothers me --- particularly the random-access nature of the search. I wonder whether you couldn't do something involving an initial table fill-factor of less than 50%, and having the first updated version living on the same heap page as its parent. Only when the active chain length was more than one (which you hypothesize is rare) would it actually be necessary to do a random access into the overflow table.
>>
>> More generally, do we need an overflow table at all, rather than having these overflow tuples living in the same file as the root tuples? As long as there's a bit available to mark a tuple as being this special not-separately-indexed type, you don't need a special location to know what it is. This might break down in the presence of seqscans though.
>
> And why do you need to mark it as not-separately-indexed at all? We already cope with missing index pointers in VACUUM, and I can't see any other reason to have it.

Ok, now I see it - we can't VACUUM a tuple if later versions of it are accessible only via the t_ctid chain. That is, vacuum must not free tuples whose t_ctid points to a tuple that has the not-separately-indexed bit set. This seems to make vacuum quite complicated, as it has to examine t_ctid chains to detect whether it can free a tuple and, what's even worse, it has to examine these chains backwards.

> What are the advantages of HOT over SITC (other than the cool name)?

Still wondering this: is it just the ability to span multiple pages? Maybe just make HOT an extended SITC which can span pages.
In the case of HOT together with reusing index tuples that have the DELETED bit set, we don't actually need copy-back; the same index pointer will follow the head of the live data automatically, maybe lagging by only a small number of versions.

> Actually, you omitted to mention the locking aspects of moving tuples around --- exactly how are you going to make that work without breaking concurrent scans?
>
>> This allows the length of a typical tuple chain to be extremely short in practice. For a single connection issuing a stream of UPDATEs the chain length will be no more than 1 at any time.
>
> Only if there are no other transactions being held open, which makes this claim a lot weaker.
>
>> HOT can only work in cases where a tuple does not modify one of the columns defined in an index on the table, and when we do not alter the row length of the tuple.
>
> Seems like altering the row length isn't the issue, it's just "is there room on the page for the new version". Again, a generous fillfactor would give you more flexibility.

Maybe they hoped to take very light locks when the new chain head is copied over the old one in the same-length case.

--
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] beta3 CFLAGS issue on openbsd
Jeremy Drake [EMAIL PROTECTED] writes:
> I figured out that the -g flag was being surreptitiously added to my CFLAGS. It was like pulling teeth trying to get the -g flag out.

I believe that this is a default behavior of autoconf scripts. I remember having done some ugly hacks years ago to prevent an autoconf configure script from adding -g by default to libjpeg builds ... and the argument for not having -g has gotten ever weaker since then, so I really doubt you'll get far complaining to the autoconf maintainers about it.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly