Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)
I wrote: OK, I think I see it. The problem is that the code in slru.c is careful about not modifying state when it doesn't hold the proper lock, but not so careful about not *inspecting* state without the proper lock. ... I'm still thinking about how to make a real fix without introducing another lock/unlock cycle --- we can do that if we have to, but I think maybe it's possible to fix it without. Attached is a proposed patch to fix up slru.c so that it's not playing any games by either reading or writing shared state without holding the ControlLock for the SLRU set. The main problem with the existing code, as I now see it, was a poor choice of representation of page state: we had states CLEAN, DIRTY, and WRITE_IN_PROGRESS, which meant that re-dirtying a page while a write was in progress required setting the state back to DIRTY, which hid the fact that indeed a write was in progress. So the I/O code was designed to cope with not knowing whether another write was already in progress. We can make it a whole lot cleaner by changing the state representation so that we can tell the difference --- then, we can know before releasing the ControlLock whether we need to write the page or just wait for someone else to finish writing it. And that means we can do all the state-manipulation before releasing the lock. I could have just added an additional state WRITE_IN_PROGRESS_REDIRTIED, or some such, but it seemed notationally cleaner to create a separate page_dirty boolean, and reduce the number of states to four (empty, read-in-progress, valid, write-in-progress). This lets outside code such as clog.c just set page_dirty = true rather than doing a complex bit of logic to change the state value properly. The patch is a bit bulky because of the representation change, but the key changes are localized in SimpleLruReadPage and SimpleLruWritePage. I think this code is a whole lot cleaner than it was before, but it's a bit of a large change to be making so late in the 8.1 cycle (not to mention that we really ought to back-patch similar changes all the way back, because the bug exists as far back as 7.2). I am going to take another look to see if there is a less invasive change that will fix the problem at some performance cost; if so, we might want to do it that way in 8.1 and back branches. Any comments on the patch, or thoughts on how to proceed? regards, tom lane binYlP4HAkp8I.bin Description: slru-race-1.patch ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags
Good analysis. I guess the question is what patch would we put into a subrelease? If you go for a new state code, rather than a separate boolean, does it reduce the size of the patch? --- Tom Lane wrote: I wrote: OK, I think I see it. The problem is that the code in slru.c is careful about not modifying state when it doesn't hold the proper lock, but not so careful about not *inspecting* state without the proper lock. ... I'm still thinking about how to make a real fix without introducing another lock/unlock cycle --- we can do that if we have to, but I think maybe it's possible to fix it without. Attached is a proposed patch to fix up slru.c so that it's not playing any games by either reading or writing shared state without holding the ControlLock for the SLRU set. The main problem with the existing code, as I now see it, was a poor choice of representation of page state: we had states CLEAN, DIRTY, and WRITE_IN_PROGRESS, which meant that re-dirtying a page while a write was in progress required setting the state back to DIRTY, which hid the fact that indeed a write was in progress. So the I/O code was designed to cope with not knowing whether another write was already in progress. We can make it a whole lot cleaner by changing the state representation so that we can tell the difference --- then, we can know before releasing the ControlLock whether we need to write the page or just wait for someone else to finish writing it. And that means we can do all the state-manipulation before releasing the lock. I could have just added an additional state WRITE_IN_PROGRESS_REDIRTIED, or some such, but it seemed notationally cleaner to create a separate page_dirty boolean, and reduce the number of states to four (empty, read-in-progress, valid, write-in-progress). This lets outside code such as clog.c just set page_dirty = true rather than doing a complex bit of logic to change the state value properly. The patch is a bit bulky because of the representation change, but the key changes are localized in SimpleLruReadPage and SimpleLruWritePage. I think this code is a whole lot cleaner than it was before, but it's a bit of a large change to be making so late in the 8.1 cycle (not to mention that we really ought to back-patch similar changes all the way back, because the bug exists as far back as 7.2). I am going to take another look to see if there is a less invasive change that will fix the problem at some performance cost; if so, we might want to do it that way in 8.1 and back branches. Any comments on the patch, or thoughts on how to proceed? regards, tom lane Content-Description: slru-race-1.patch [ Type application/octet-stream treated as attachment, skipping... ] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)
Bruce Momjian pgman@candle.pha.pa.us writes: If you go for a new state code, rather than a separate boolean, does it reduce the size of the patch? No, and it certainly wouldn't improve my level of confidence in it ... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags
Tom Lane wrote: Bruce Momjian pgman@candle.pha.pa.us writes: If you go for a new state code, rather than a separate boolean, does it reduce the size of the patch? No, and it certainly wouldn't improve my level of confidence in it ... Well, then what real options do we have? It seems the patch is just required for all branches. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)
Bruce Momjian pgman@candle.pha.pa.us writes: Well, then what real options do we have? It seems the patch is just required for all branches. I think it would be possible to fix it in a less invasive way by taking and releasing the ControlLock an extra time in SimpleLruReadPage and SimpleLruWritePage. What's indeterminate about that is the performance cost. In situations where there's not a lot of SLRU I/O traffic it's presumably negligible, but in a case like Jim's where there's evidently a *whole* lot of traffic, it might be a killer. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] FKs on temp tables: hard, or just omitted?
I have applied a more limited patch that mentions this. I do not want to mention _why_ we do not implement it because it is partly performance and partly complexity, I think, and some combinations make no sense, like temporary primary and non-temp foreign. --- Jim C. Nasby wrote: On Sun, Oct 30, 2005 at 05:31:07PM -0800, Josh Berkus wrote: Folks, Thanks, all! Now, if only I could remember who asked me the question ... ISTM we should add a note about this to the docs... Here's a patch for create_table.sgml, though there's probably some other places this could go... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 [ Attachment, skipping... ] ---(end of broadcast)--- TIP 6: explain analyze is your friend -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 Index: doc/src/sgml/ref/create_table.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/ref/create_table.sgml,v retrieving revision 1.94 diff -c -c -r1.94 create_table.sgml *** doc/src/sgml/ref/create_table.sgml 13 Aug 2005 02:48:18 - 1.94 --- doc/src/sgml/ref/create_table.sgml 31 Oct 2005 18:11:27 - *** *** 421,427 primary key of the replaceable class=parameterreftable/replaceable is used. The referenced columns must be the columns of a unique or primary ! key constraint in the referenced table. /para para --- 421,429 primary key of the replaceable class=parameterreftable/replaceable is used. The referenced columns must be the columns of a unique or primary ! key constraint in the referenced table. Note that foreign key ! constraints may not be defined between temporary tables and ! permanent tables. /para para ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags
Tom Lane wrote: Bruce Momjian pgman@candle.pha.pa.us writes: Well, then what real options do we have? It seems the patch is just required for all branches. I think it would be possible to fix it in a less invasive way by taking and releasing the ControlLock an extra time in SimpleLruReadPage and SimpleLruWritePage. What's indeterminate about that is the performance cost. In situations where there's not a lot of SLRU I/O traffic it's presumably negligible, but in a case like Jim's where there's evidently a *whole* lot of traffic, it might be a killer. To me a performance problem is much harder get reports on and to locate than a real fix to the problem. I think if a few people eyeball the patch, it is OK for application. Are backpatches significantly different? -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)
Bruce Momjian pgman@candle.pha.pa.us writes: To me a performance problem is much harder get reports on and to locate than a real fix to the problem. I think if a few people eyeball the patch, it is OK for application. Are backpatches significantly different? Well, the logic is the same all the way back, but the code has changed textually quite a bit since 7.4 and even more since 7.3. I think the patch would apply reasonably cleanly to 8.0, but adjusting it for 7.x will take a bit of work, which would mean those versions would probably need to be reviewed separately. One possible compromise is to use this patch in 8.x and a simpler patch in 7.x --- people who are very concerned about performance ought to be running 8.x anyway ;-) regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)
I wrote: I think it would be possible to fix it in a less invasive way by taking and releasing the ControlLock an extra time in SimpleLruReadPage and SimpleLruWritePage. What's indeterminate about that is the performance cost. Attached is an alternative patch that does it this way. I realized that we could use LWLockConditionalAcquire to usually avoid any extra lock traffic, so the performance cost may be negligible except under the very heaviest of loads. I still like the bigger patch for 8.2 and forward, because it's a lot cleaner, but this seems like a credible alternative for 8.1 and back branches. Comments? regards, tom lane binjbJK9x6skx.bin Description: slru-race-2.patch ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags
OK, this is the way to fix for 8.0 and earlier. It is up to you about 8.1. I think we can handle the larger patch if we do another RC. --- Tom Lane wrote: I wrote: I think it would be possible to fix it in a less invasive way by taking and releasing the ControlLock an extra time in SimpleLruReadPage and SimpleLruWritePage. What's indeterminate about that is the performance cost. Attached is an alternative patch that does it this way. I realized that we could use LWLockConditionalAcquire to usually avoid any extra lock traffic, so the performance cost may be negligible except under the very heaviest of loads. I still like the bigger patch for 8.2 and forward, because it's a lot cleaner, but this seems like a credible alternative for 8.1 and back branches. Comments? regards, tom lane Content-Description: slru-race-2.patch [ Type application/octet-stream treated as attachment, skipping... ] -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags 0x01),)
Bruce Momjian pgman@candle.pha.pa.us writes: OK, this is the way to fix for 8.0 and earlier. It is up to you about 8.1. I think we can handle the larger patch if we do another RC. Well, I'd like not to do another RC, so I'll hold the larger patch for 8.2. We still need a test to confirm it fixes Jim's problem though. Jim, if you like you can test the second proposed patch instead of that off-the-cuff line swapping. However, either one will need to be run with Asserts on in order to have any confidence that the problem is fixed. If performance is an issue, most of the performance hit is probably coming from memory context checking, so what I'd suggest you do is comment out these two #defines in src/include/pg_config_manual.h: #define CLOBBER_FREED_MEMORY #define MEMORY_CONTEXT_CHECKING That should let you build with --enable-cassert and not take quite so much speed hit. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Partitioning docs
On Mon, 2005-10-31 at 02:46 +, Simon Riggs wrote: I've been working on some docs for Constraining Exclusion Partitioning for some time now. Deadlines seem to be looming, or may even have passed, so it seems sensible to submit what I have now. Many thanks to Josh Berkus for providing the numbered section on implementation process, which was the starting point I'd been looking for to describe everything else. I believe this is now complete and ready for application. - passes sgml make against cvstip - spellchecked - all code executed correctly against RC1 Comments please? Apart from the obvious, so why did it take you so long. Apologies to the translators. Best Regards, Simon Riggs Index: ddl.sgml === RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ddl.sgml,v retrieving revision 1.45 diff -c -c -r1.45 ddl.sgml *** ddl.sgml 23 Oct 2005 19:29:49 - 1.45 --- ddl.sgml 31 Oct 2005 22:36:26 - *** *** 398,403 --- 398,410 ensure that a column does not contain null values, the not-null constraint described in the next section can be used. /para + + para + Check constraints can also be used to enhance performance with + very large tables, when used in conjunction with the + xref linkend=guc-constraint-exclusion parameter. + This is discussed in more detail in xref linkend=ce-partitioning +/para /sect2 sect2 *** *** 1040,1052 sect1 id=ddl-inherit titleInheritance/title ! remarkThis section needs to be rethought. Some of the ! information should go into the following chapters./remark para !Let's create two tables. The capitals table contains !state capitals which are also cities. Naturally, the !capitals table should inherit from cities. programlisting CREATE TABLE cities ( --- 1047,1081 sect1 id=ddl-inherit titleInheritance/title !indexterm ! primarynot-null constraint/primary !/indexterm ! !indexterm ! primaryconstraint/primary ! secondaryNOT NULL/secondary !/indexterm ! ! para !productnamePostgreSQL/productname was the first DBMS to introduce !inheritance, one of its many object-relational features. !productnamePostgreSQL/productname implements table !inheritance which can be a useful tool for database designers. !The SQL:2003 standard optionally defines type inheritance which differs !in many respects from the features described here. ! /para para !Let's start with an example: !We're trying to build a data model for cities, but we have a problem. !Each state has many cities, but only one capital. We want to be able !to quickly retrieve the capital city for any particular state. We !can solve this problem by creating two tables. The capitals table contains !state capitals, then we have another table for cities that aren't capitals. !What happens when we want to ask for data about a city, regardless of !whether it is a capital or not? We can use the inheritance feature to !help resolve this problem for us. We define the capitals table so that !it inherits from cities. programlisting CREATE TABLE cities ( *** *** 1062,1077 In this case, a row of capitals firstterminherits/firstterm all attributes (name, population, and altitude) from its parent, cities. State !capitals have an extra attribute, state, that shows their state. In !productnamePostgreSQL/productname, a table can inherit from zero or more other tables, and a query can reference either all rows of a table or all rows of a table plus all of its descendants. - -note - para - The inheritance hierarchy is actually a directed acyclic graph. - /para -/note /para para --- 1091,1102 In this case, a row of capitals firstterminherits/firstterm all attributes (name, population, and altitude) from its parent, cities. State !capitals have an extra attribute, state, that shows their state. ! /para ! para !In productnamePostgreSQL/productname, a table can inherit from zero or more other tables, and a query can reference either all rows of a table or all rows of a table plus all of its descendants. /para para *** *** 1133,1163 /para /note ! note !titleDeprecated/title !para ! In previous versions of productnamePostgreSQL/productname, the ! default behavior was not to include child tables in queries. This was ! found to be error prone and is also in violation of the SQL:2003 ! standard. Under the old syntax, to get the sub-tables you append ! literal*/literal to the table name. ! For example ! programlisting ! SELECT * from cities*; ! /programlisting ! You can still explicitly specify scanning child tables by
Re: [PATCHES] Partitioning docs
On Mon, 2005-31-10 at 22:41 +, Simon Riggs wrote: I believe this is now complete and ready for application. The changes need a fair bit of copy editing and SGML policy work, but that is probably easier to do once it has been applied. Barring any objections I'll apply the patch within 24 hours. -Neil ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Partitioning docs
Neil Conway [EMAIL PROTECTED] writes: On Mon, 2005-31-10 at 22:41 +, Simon Riggs wrote: I believe this is now complete and ready for application. The changes need a fair bit of copy editing and SGML policy work, but that is probably easier to do once it has been applied. Barring any objections I'll apply the patch within 24 hours. I'd argue for editing first and then applying. I'll take up the job if you don't have time for the editing part... I'm hoping to spend most of this week on docs editing anyway, since anything else will raise Marc's hackles about whether we need another RC ;-) regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] Partitioning docs
On Mon, 2005-31-10 at 23:15 -0500, Tom Lane wrote: I'd argue for editing first and then applying. I'll take up the job if you don't have time for the editing part Okay. I'll do a round of copy editing and then commit to CVS -- there will likely be room for additional improvements, so once it's in CVS anyone else who's interested can have at it. -Neil ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Partitioning docs
On Mon, 2005-10-31 at 23:27 -0500, Neil Conway wrote: On Mon, 2005-31-10 at 23:15 -0500, Tom Lane wrote: I'd argue for editing first and then applying. I'll take up the job if you don't have time for the editing part Okay. I'll do a round of copy editing and then commit to CVS -- there will likely be room for additional improvements, so once it's in CVS anyone else who's interested can have at it. Thanks guys. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly