Re: [PATCHES] [HACKERS] 8.2 features?
Joshua, So now it's MySQL users' turn to say, Sure, but speed isn't everything :-)Sure, but speed isn't everything... We can accept 02/31/2006 as a validdate. Let's see PostgreSQL do that!I got the joke :)But: it is still a problem when converting. As accepting 2006-02-31 as a valid date would require brainwashing at least the entire core team, we should find a recommended path of date migration from different universes. My idea would be to:a) declare date fields as textb) load the dump of the other dbc) add another column for the date fields, type timestampe (w/wo tz)d) try to update the column of c) with the converted field from a) e) replace the failing ones manuallyis this really best practice? especially finding the invalid ones would be challenging :(idea: sort after the textual date fields; look at hot spots (-00-00, -02-31) Are there better ideas? shall we document the best practice somewhere?Harald-- GHUM Harald Massapersuadere et programmareHarald Armin MassaReinsburgstraße 202b70197 Stuttgart0173/9409607 -Let's set so double the killer delete select all.
Re: [PATCHES] [HACKERS] 8.2 features?
Joe Conway [EMAIL PROTECTED] writes: In case you can make use of it, here's my latest. I found that I was being too aggressive at freeing the input nodes to transformExpr() in transformRangeValues() after using them. In many cases the returned node is a new palloc'd node, but in some cases it is not. Great, I'll incorporate these updates and keep plugging --- should be done today barring problems. If you have some spare cycles today, want to work on regression tests and docs? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] [HACKERS] 8.2 features?)
Gavin Sherry [EMAIL PROTECTED] writes: Is this intentional: template1=# values(1), (2); column1 - 1 2 (2 rows) You bet. VALUES is parallel to SELECT in the SQL grammar, so AFAICS it should be legal anywhere you can write SELECT. The basic productions in the spec's grammar are respectively query specification ::= SELECT [ set quantifier ] select list table expression and table value constructor ::= VALUES row value expression list and both of them link into the rest of the grammar here: simple table ::= query specification | table value constructor | explicit table There is no construct I can find in the spec grammar that allows query specification but not table value constructor. QED. Try some stuff like DECLARE c CURSOR FOR VALUES ... WHERE foo IN (VALUES ... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] 8.2 features?
Tom Lane wrote: As far as avoiding overhead goes, here's what I'm thinking: * The Values RTE node should contain a list of lists of bare expressions, without TargetEntry decoration (you probably do not need ResTarget in the raw parse tree for VALUES, either). * The ValuesScan plan node will just reference this list-of-lists (avoiding making a copy). It will need to contain a targetlist because all plan nodes do, but the base version of that will just be a trivial Var 1, Var 2, etc. (The planner might replace that with a nontrivial targetlist in cases such as the example above.) I wanted to post an updated patch even though there are still things not working again after conversion to bare expressions. Note that I hacked enough of the executor stuff so I could test my changes on the parser area. The basic INSERT ... VALUES (...), (...), ... does work, but without DEFAULT again :-(. The good news is that from a memory and perfomance standpoint, my simple test now shows us outperforming mysql: $loopcount = 100; Postgres: multi-INSERT-at-once Elapsed time is 12 seconds ~420MB MySQL: multi-INSERT-at-once Elapsed time is 17 seconds ~600MB $loopcount = 200; Postgres: multi-INSERT-at-once Elapsed time is 29 seconds ~730MB MySQL: multi-INSERT-at-once Elapsed time is 37 seconds ~1.2GB (this one is from memory -- I didn't write it in my notes) Joe multi-insert-r18.diff.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] 8.2 features?
Joe Conway [EMAIL PROTECTED] writes: The good news is that from a memory and perfomance standpoint, my simple test now shows us outperforming mysql: Sweet ;-) I'm up to my *ss in fixing relation locking, but will get back to your thing as soon as that's done. I think you're close enough to qualify as having made the feature freeze deadline, in any case. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 8.2 features?
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: The good news is that from a memory and perfomance standpoint, my simple test now shows us outperforming mysql: Sweet ;-) I love this team. Kudos! -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] 8.2 features?
On Mon, Jul 31, 2006 at 04:19:43PM -0400, Alvaro Herrera wrote: Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: The good news is that from a memory and perfomance standpoint, my simple test now shows us outperforming mysql: Sweet ;-) I love this team. Kudos! So now it's MySQL users' turn to say, Sure, but speed isn't everything :-) -- Michael Fuhr ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] 8.2 features?
Michael Fuhr wrote: On Mon, Jul 31, 2006 at 04:19:43PM -0400, Alvaro Herrera wrote: Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: The good news is that from a memory and perfomance standpoint, my simple test now shows us outperforming mysql: Sweet ;-) I love this team. Kudos! So now it's MySQL users' turn to say, Sure, but speed isn't everything :-) Sure, but speed isn't everything... We can accept 02/31/2006 as a valid date. Let's see PostgreSQL do that! Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] 8.2 features?
Joe Conway [EMAIL PROTECTED] writes: I wanted to post an updated patch even though there are still things not working again after conversion to bare expressions. I've been through the planner part of this and it looks OK (one or two small errors). I'm currently messing with a revised version of the grammar that supports putting VALUES everyplace that the spec allows, and is a bit simpler than the old one to boot: it folds VALUES and SELECT together, so we need fewer cases in the INSERT production. Of course this breaks most of what you did in the parser :-( ... I'm working on fixing that. I'm about to go out to dinner but thought I'd post the gram.y and parsenodes.h files so you could see where I'm headed. These are diffs from CVS tip, not from your patch. regards, tom lane binqHXkiTAmCE.bin Description: values-incomplete-patch.diff ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] 8.2 features?
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: I wanted to post an updated patch even though there are still things not working again after conversion to bare expressions. I've been through the planner part of this and it looks OK (one or two small errors). I'm currently messing with a revised version of the grammar that supports putting VALUES everyplace that the spec allows, and is a bit simpler than the old one to boot: it folds VALUES and SELECT together, so we need fewer cases in the INSERT production. Of course this breaks most of what you did in the parser :-( ... I'm working on fixing that. I'm about to go out to dinner but thought I'd post the gram.y and parsenodes.h files so you could see where I'm headed. These are diffs from CVS tip, not from your patch. Yup, I can see where you're headed. Looks nice! In case you can make use of it, here's my latest. I found that I was being too aggressive at freeing the input nodes to transformExpr() in transformRangeValues() after using them. In many cases the returned node is a new palloc'd node, but in some cases it is not. The other issue I found was that I had neglected to fixup/coerce the raw expressions ala updateTargetListEntry(). I ended up creating a somewhat simpler updateValuesExprListEntry() to use on values expression lists. I have yet to get to the similar/more general issue of coercing values expression lists to common datatypes (i.e. using select_common_type()). FWIW, here's a list of non-working cases at the moment: 8- create table inserttest (col1 int4, col2 int4 NOT NULL, col3 text default 'testing'); --doesn't work --- --wrong result insert into inserttest (col2, col3) values (23, DEFAULT), (24, DEFAULT), (25, 'hello'), (26, DEFAULT); select * from (values (3,4),(2,3)) as t1(f1,f2) join (values (3,8),(2,6)) as t2(f1,f2) using (f1); select * from (values (3,4),(2,3)) as t1(f1,f2) join (values (3,8),(2,6)) as t2(f1,f2) using (f1) where t2.f2 = 8; select * from (values (3,4),(2,3)) as t1(f1,f2) join (values (3,8),(2,6)) as t2(f1,f2) on t1.f1 = t2.f2 where t1.f1 = 3; --corrupt result but no crash select f1,f2 from (values (11,2),(26,'a'),(6,4)) as t(f1,f2) order by 1 desc; --crash select f1 from (values (1,2),(2,3)) as t(f1,f2) order by 1 desc; select f1,f2 from (values (11,'a'),(26,13),(6,'c')) as t(f1,f2) order by 1 desc; 8- Joe multi-insert-r19.diff.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] 8.2 features?
Joe Conway wrote: Tom Lane wrote: I thought Joe was off in a corner doing a whole new version. (I'm willing to help if he needs help...) Yeah, I was going to post the latest tonight. Sorry for the delay. Ever see the movie The Money Pit? This afternoon I started to think I lived in that house :-( Anyway, as mentioned below, I think the attached works well for the INSERT ... VALUES (...), (...), ... and related cases. There are still things wrong that I have not even tried to fix with respect to FROM clause VALUES lists. Namely column aliases have no effect, and neither does ORDER BY clause (I'm pretty sure addRangeTableEntryForValues needs work among other places). From a memory usage standpoint, I got the following using 1,000,000 values targetlists: sql length = 632 NOTICE: enter transformInsertStmt MessageContext: 478142520 total in 66 blocks; 5750400 free (3 chunks); 472392120 used NOTICE: enter transformRangeValues MessageContext: 478142520 total in 66 blocks; 5749480 free (6 chunks); 472393040 used NOTICE: enter updateTargetListEntry MessageContext: 629137464 total in 84 blocks; 44742464 free (91 chunks); 584395000 used NOTICE: exit transformInsertStmt MessageContext: 629137464 total in 84 blocks; 44742408 free (91 chunks); 584395056 used NOTICE: start ExecInitValuesScan MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks); 1008399424 used NOTICE: end ExecInitValuesScan MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks); 1008399424 used ExecutorState: 8024632 total in 3 blocks; 21256 free (8 chunks); 8003376 used This shows original SQL statement is about 6MB, by the time we get to parse analysis we're at almost 500 MB, and that memory is never recovered. Transforming from ResTarget to TargetEntry chews up about 100MB. Then between exiting transformInsertStmt and entering ExecInitValuesScan we double in memory usage to about 1 GB. It isn't shown here, but we add another 200 MB or so during tuple projection. So we top out at about 1.2 GB. Note that mysql tops out at about 600 MB for this same SQL. I'm not sure what if anything can be done to improve the above -- I'm open to suggestions. Please note that this patch requires an initdb, although I have not yet bothered to bump CATVERSION. Thanks for help, comments, suggestions, etc... Joe I'm afraid though that after 2 or so days heading down the last path you suggested (namely making a new jointree leaf node) I was having trouble, and at the same time came to the conclusion that adding a new RTE was alot cleaner and made more sense to me. So I'm hoping you won't want to send me back to the drawing board again. I believe I have cleaned up the things you objected to: 1. Now I'm not doing both alternative -- the targetlists are only attached to the RTE from the point of parse analysis onward. 2. I've eliminated the tuplestore in favor of runtime evaluation of the targetlists which are in an array (allowing forward or backward scanning -- although I haven't tested the latter yet). I've also solved the INSERT related issues that I had earlier: 1. Fixed the rules regression test -- now all regression tests pass 2. Fixed evaluation of DEFAULT values 3. Improved memory consumption and speed some more -- basically we are approximately equal to mysql as long as we don't swap, and we consume about twice the RAM as mysql instead of several times as much. I have more analysis of memory use I'd also like to share later. 4. I think the INSERT part of this is ready to go basically, but I need a bit more time to test corner cases. I've made some progress on SELECT ... FROM (VALUES ...) AS ... 1. No more shift/reduce issues 2. The ValuesScan work and memory improvements mentioned above applies here too. 3. This part still needs the most work though. I'll post a patch in a few hours -- there is some debug code in there currently that I should clean up before I send it to the list. BTW, I'm reserving Saturday, Sunday, and Monday (taking Monday off from my day job) to work on outstanding issues. I can continue to work through the end of next Friday, 4 August. After that I'm heading to Germany on a business trip and my spare time will evaporate for a few weeks. multi-insert-r17.diff.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] 8.2 features?
Joe Conway [EMAIL PROTECTED] writes: I'm afraid though that after 2 or so days heading down the last path you suggested (namely making a new jointree leaf node) I was having trouble, and at the same time came to the conclusion that adding a new RTE was alot cleaner and made more sense to me. So I'm hoping you won't want to send me back to the drawing board again. I believe I have cleaned up the things you objected to: I was just objecting to having both a new RTE type and a new jointree node type --- you only need one or the other. Opting for the new RTE type is fine with me, and it probably is a bit cleaner at the end of the day. I still dislike the way you're doing things in the executor though. I don't see the point of using the execScan.c machinery; most of the time that'll be useless overhead. As I said before, I think the right direction here is to split Result into two single-purpose node types and make the non-filter version capable of taking a list of targetlists. As far as reducing memory use goes, it seems to me that there's no need for the individual targetlists to have ResTarget/TargetEntry decoration. For the simple case where the expressions are just Const nodes, this could save something like a third of the space (there's also a List node per item, which we can't do much about). I think we'd have to gin up a fake targetlist to attach to the Plan node, but there'd be only one. Since the result-node split is my hot button, I'm willing to volunteer to make it happen. Do you want to concentrate on the remaining parser-area issues and leave the executor part to me? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] [HACKERS] 8.2 features?
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: I'm afraid though that after 2 or so days heading down the last path you suggested (namely making a new jointree leaf node) I was having trouble, and at the same time came to the conclusion that adding a new RTE was alot cleaner and made more sense to me. So I'm hoping you won't want to send me back to the drawing board again. I believe I have cleaned up the things you objected to: I was just objecting to having both a new RTE type and a new jointree node type --- you only need one or the other. Opting for the new RTE type is fine with me, and it probably is a bit cleaner at the end of the day. Great! I still dislike the way you're doing things in the executor though. I don't see the point of using the execScan.c machinery; most of the time that'll be useless overhead. As I said before, I think the right direction here is to split Result into two single-purpose node types and make the non-filter version capable of taking a list of targetlists. OK. As far as reducing memory use goes, it seems to me that there's no need for the individual targetlists to have ResTarget/TargetEntry decoration. For the simple case where the expressions are just Const nodes, this could save something like a third of the space (there's also a List node per item, which we can't do much about). I think we'd have to gin up a fake targetlist to attach to the Plan node, but there'd be only one. OK, I'll take a look at that (actually I was just in that general vicinity anyway). Since the result-node split is my hot button, I'm willing to volunteer to make it happen. Do you want to concentrate on the remaining parser-area issues and leave the executor part to me? Sure, sounds good to me. Joe ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 8.2 features?
Are you going to apply this? Seems it is ready. --- Joe Conway wrote: Tom Lane wrote: Christopher Kings-Lynne [EMAIL PROTECTED] writes: Strange. Last time I checked I thought MySQL dump used 'multivalue lists in inserts' for dumps, for the same reason that we use COPY I think Andrew identified the critical point upthread: they don't try to put an unlimited number of rows into one INSERT, only a megabyte or so's worth. Typical klugy-but-effective mysql design approach ... OK, so given that we don't need to be able to do 1 million multi-targetlist insert statements, here is rev 2 of the patch. It is just slightly more invasive, but performs *much* better. In fact, it can handle as many targetlists as you have memory to deal with. It also deals with DEFAULT values in the targetlist. I've attached a php script that I used to do crude testing. Basically I tested 3 cases in this order: single-INSERT-multi-statement: -- INSERT INTO foo2a (f1,f2) VALUES (1,2); -- repeat statement $loopcount times single-INSERT-at-once: -- INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)... -- build a single SQL string by looping $loopcount times, -- and execute it all at once multi-INSERT-at-once: - INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)... -- build a single SQL string by looping $loopcount times, -- and execute it all at once Here are the results: $loopcount = 10; single-INSERT-multi-statement Elapsed time is 34 seconds single-INSERT-at-once Elapsed time is 7 seconds multi-INSERT-at-once Elapsed time is 4 seconds about 370MB peak memory usage $loopcount = 20; single-INSERT-multi-statement Elapsed time is 67 seconds single-INSERT-at-once Elapsed time is 12 seconds multi-INSERT-at-once Elapsed time is 9 seconds about 750MB peak memory usage $loopcount = 30; single-INSERT-multi-statement Elapsed time is 101 seconds single-INSERT-at-once Elapsed time is 18 seconds multi-INSERT-at-once Elapsed time is 13 seconds about 1.1GB peak memory usage Somewhere beyond this, my machine goes into swap hell, and I didn't have the patience to wait for it to complete :-) It would be interesting to see a side-by-side comparison with MySQL since that seems to be our benchmark on this feature. I'll try to do that tomorrow if no one beats me to it. There is only one downside to the current approach that I'm aware of. The command-result tag is only set by the original query, meaning that even if you insert 300,000 rows using this method, the command-result tag looks like INSERT 0 1; e.g.: regression=# create table foo2(f1 int default 42,f2 int default 6); CREATE TABLE regression=# insert into foo2 (f1,f2) values (default,12),(default,10),(115,21); INSERT 0 1 regression=# select * from foo2; f1 | f2 -+ 42 | 12 42 | 10 115 | 21 (3 rows) Any thoughts on how to fix that? Thanks, Joe [ application/x-php is not supported, skipping... ] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] 8.2 features?
Bruce Momjian [EMAIL PROTECTED] writes: Are you going to apply this? Seems it is ready. I thought Joe was off in a corner doing a whole new version. (I'm willing to help if he needs help...) regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] 8.2 features?
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Are you going to apply this? Seems it is ready. I thought Joe was off in a corner doing a whole new version. (I'm willing to help if he needs help...) OK, just checking. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] [HACKERS] 8.2 features?)
Joe Conway [EMAIL PROTECTED] writes: Good feedback -- thanks! But without the RTE, how would VALUES in the FROM clause work? Is it different from INSERT? I'm just imagining a Values node in the jointree and nothing in the rangetable. If I'm reading the spec correctly, VALUES is exactly parallel to SELECT in the grammar, which means that to use it in FROM you would need parentheses and an alias: SELECT ... FROM (SELECT ...) AS foo SELECT ... FROM (VALUES ...) AS foo ISTM that this should be represented using an RTE_SUBQUERY node in the outer query; the alias attaches to that node, not to the VALUES itself. So I don't think you need that alias field in the jointree entry either. If we stick with the plan of representing VALUES as if it were SELECT * FROM (valuesnode), then this approach would make the second query above have a structure like Query .rtable -RTE_SUBQUERY .subquery - Query .jointree - Values (leaving out a ton of detail of course, but those are the key nodes). To get this to reverse-list in the expected form, we'd need a small kluge in ruleutils.c that short-circuits the display of SELECT ... FROM etc when it sees a Values node at the top of the jointree. This seems like a fairly small price to pay for keeping Query in approximately its present form, though. One thought is that we might allow Query.jointree to point to either a FromExpr or a Values node, and disallow Values from appearing further down in the jointree (except perhaps after flattening of subqueries in the planner). The alternative is that there's a FromExpr atop the Values node in the jointree even in the simple case; which seems uglier but it might avoid breaking some code that expects the top level to always be FromExpr. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Values list-of-targetlists patch for comments (was Re: [PATCHES] [HACKERS] 8.2 features?)
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: I'm liking this too. But when you say jointree node, are you saying to model the new node type after NestLoop/MergeJoin/HashJoin nodes? These are referred to as join nodes in ExecInitNode. Or as you mentioned a couple of times, should this look more like an Append node? No, I guess I confused you by talking about the executor representation at the same time. This is really unrelated to the executor. The join tree I'm thinking of here is the data structure that dangles off Query.jointree --- it's a representation of the query's FROM clause, and (at present) can contain RangeTblRef, FromExpr, and JoinExpr nodes. See the last hundred or so lines of primnodes.h for some details. The jointree is used by the planner to compute the plan node tree that the executor will run, but it's not the same thing. There are basically two ways you could go about this: 1. Make a new jointree leaf node type to represent a VALUES construct, and dangle the list of lists of expressions off that. 2. Make a new RangeTblEntry type to represent a VALUES construct, and just put a RangeTblRef to it into the jointree. The expressions dangle off the RangeTblEntry. Offhand I'm not certain which of these would be cleanest. The second way has some similarities to the way we handle set operation trees (UNION et al), so it might be worth looking at that stuff. However, being a RangeTblEntry has a lot of baggage (eg, various routines expect to find an RTE alias, column names, column types, etc) and maybe we don't need all that for VALUES. Since the feature freeze is only about a week off, I wanted to post this patch even though it is not yet ready to be applied. Executive summary: == 1. The patch is now large and invasive based on adding new node types and associated infrastructure. I modelled the nodes largely on RangeFunction and FunctionScan. 2. Performance is close enough to mysql to not be a big issue (I think, more data below) as long as the machine does not get into a memory swapping regime. Memory usage is now better, but not as good as mysql. 3. I specifically coded with the intent of preserving current insert statement behavior and code paths for current functionality. So there *should* be no performance degradation or subtle semantics changes for INSERT DEFAULT VALUES, INSERT ... VALUES (with one target list), INSERT ... SELECT Even Tom's recently discovered insert into foo values (tenk1.*) still works ;-) Performance: On my development machine (dual core amd64, 2GB RAM) I get the following results using the php script posted earlier: Postgres: - $loopcount = 10; multi-INSERT-at-once Elapsed time is 1 second $loopcount = 30; multi-INSERT-at-once Elapsed time is 5 seconds $loopcount = 50; multi-INSERT-at-once Elapsed time is 9 seconds $loopcount = 80; multi-INSERT-at-once Elapsed time is 14 seconds $loopcount = 90; multi-INSERT-at-once Elapsed time is 17 seconds $loopcount = 100; multi-INSERT-at-once Elapsed time is 42 seconds $loopcount = 200; killed after 5 minutes due to swapping MySQL: -- $loopcount = 10; multi-INSERT-at-once Elapsed time is 2 seconds $loopcount = 30; INSERT failed:Got a packet bigger than 'max_allowed_packet' bytes changed max_allowed_packet=64M multi-INSERT-at-once Elapsed time is 5 seconds $loopcount = 50; multi-INSERT-at-once Elapsed time is 8 seconds $loopcount = 80; multi-INSERT-at-once Elapsed time is 13 seconds $loopcount = 90; multi-INSERT-at-once Elapsed time is 15 seconds $loopcount = 100; multi-INSERT-at-once Elapsed time is 17 seconds $loopcount = 200; multi-INSERT-at-once Elapsed time is 36 seconds $loopcount = 300; multi-INSERT-at-once Elapsed time is 54 seconds $loopcount = 400; multi-INSERT-at-once Elapsed time is 134 seconds table value constructor: == Included in this patch is support for table value constructor in the FROM clause, e.g.: regression=# select * from {values (1,array[1,2]),(2,array[3,4])}; ?column? | array --+--- 1 | {1,2} 2 | {3,4} (2 rows) The strange syntax is a temporary hack to eliminate shift/reduce conflicts. I'm not entirely sure we want to try to support this (or something like it) for 8.2, but much of what is needed is now readily available. More on known issues next. Known Issues: = General: 1. Several comments in the patch are marked FIXME. These are areas where I was uncertain what was the right thing to do. Any advice on these specific spots would be very much appreciated. 2. I broke the rules regression test -- still need to look at what I did to mess that up. Somewhere in the reconstruction of VALUES ... according to the diff. VALUES multi-targetlist INSERTS: 3. Not yet quite sure how to get DEFAULT
Re: [PATCHES] [HACKERS] 8.2 features?
The major downside is that somewhere between 9000 and 1 VALUES-targetlists produces ERROR: stack depth limit exceeded. Perhaps for the typical use-case this is sufficient though. I'm open to better ideas, comments, objections... If the use case is people running MySQL dumps, then there will be millions of values-targetlists in MySQL dumps. Chris ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] 8.2 features?
Christopher Kings-Lynne wrote: The major downside is that somewhere between 9000 and 1 VALUES-targetlists produces ERROR: stack depth limit exceeded. Perhaps for the typical use-case this is sufficient though. I'm open to better ideas, comments, objections... If the use case is people running MySQL dumps, then there will be millions of values-targetlists in MySQL dumps. Yeah. The fabricated select hack does feel wrong to me. Taking a quick 2 minute look at the grammar it looks like a better bet would be to make InsertStmt.targetList a list of lists of values rather than just a list of values. Of course, that would make the changes more invasive. Even with that we'd still be reading the whole thing into memory ... is there a sane way to cache the inline data before statement execution? I guess we can just say that for true bulk load our supported mechanism is still just COPY, but it would be a pity to restrict a feature that is in the standard that way. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 8.2 features?
If the use case is people running MySQL dumps, then there will be millions of values-targetlists in MySQL dumps. I did some experimentation just now, and could not get mysql to accept a command longer than about 1 million bytes. It complains about Got a packet bigger than 'max_allowed_packet' bytes which seems a bit odd because max_allowed_packet is allegedly set to 16 million, but anyway I don't think people are going to be loading any million-row tables using single INSERT commands in mysql either. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] 8.2 features?
[EMAIL PROTECTED] (Christopher Kings-Lynne) writes: The major downside is that somewhere between 9000 and 1 VALUES-targetlists produces ERROR: stack depth limit exceeded. Perhaps for the typical use-case this is sufficient though. I'm open to better ideas, comments, objections... If the use case is people running MySQL dumps, then there will be millions of values-targetlists in MySQL dumps. Curiosity: How do *does* TheirSQL parse that, and not have the One Gigantic Query blow up their query parser? -- output = reverse(gro.gultn @ enworbbc) http://www.ntlug.org/~cbbrowne/unix.html JOHN CAGE (strapped to table): Do you really expect me to conduct this antiquated tonal system? LEONARD BERNSTEIN: No, Mr. Cage, I expect you to die! [With apologies to music and James Bond fans the world over...] ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] 8.2 features?
Chris Browne wrote: [EMAIL PROTECTED] (Christopher Kings-Lynne) writes: The major downside is that somewhere between 9000 and 1 VALUES-targetlists produces ERROR: stack depth limit exceeded. Perhaps for the typical use-case this is sufficient though. I'm open to better ideas, comments, objections... If the use case is people running MySQL dumps, then there will be millions of values-targetlists in MySQL dumps. Curiosity: How do *does* TheirSQL parse that, and not have the One Gigantic Query blow up their query parser? Experimentation shows that mysqldump breaks up the insert into chunks. Example with 10m rows: [EMAIL PROTECTED] ~]# perl -e 'print drop table if exists foo; create table foo (x int);\n; foreach my $i (0..9_) { print insert into foo values \n; foreach my $j (0..99) { print , if $j; print (,100*$i+$j+1,); } print ;\n; } ' gg [EMAIL PROTECTED] ~]# mysql test gg [EMAIL PROTECTED] ~]# mysqldump test foo aa [EMAIL PROTECTED] ~]# mysql test aa [EMAIL PROTECTED] ~]# grep INSERT aa | wc -l 104 cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] 8.2 features?
I did some experimentation just now, and could not get mysql to accept a command longer than about 1 million bytes. It complains about Got a packet bigger than 'max_allowed_packet' bytes which seems a bit odd because max_allowed_packet is allegedly set to 16 million, but anyway I don't think people are going to be loading any million-row tables using single INSERT commands in mysql either. Strange. Last time I checked I thought MySQL dump used 'multivalue lists in inserts' for dumps, for the same reason that we use COPY ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] 8.2 features?
I did some experimentation just now, and could not get mysql to accept a command longer than about 1 million bytes. It complains about Got a packet bigger than 'max_allowed_packet' bytes which seems a bit odd because max_allowed_packet is allegedly set to 16 million, but anyway I don't think people are going to be loading any million-row tables using single INSERT commands in mysql either. Ah no, I'm mistaken. It's not by default in mysqldump, but it does seem recommended. This is from man mysqldump: -e|--extended-insert Allows utilization of the new, much faster INSERT syntax. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] 8.2 features?
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Strange. Last time I checked I thought MySQL dump used 'multivalue lists in inserts' for dumps, for the same reason that we use COPY I think Andrew identified the critical point upthread: they don't try to put an unlimited number of rows into one INSERT, only a megabyte or so's worth. Typical klugy-but-effective mysql design approach ... regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] 8.2 features?
Tom Lane wrote: Christopher Kings-Lynne [EMAIL PROTECTED] writes: Strange. Last time I checked I thought MySQL dump used 'multivalue lists in inserts' for dumps, for the same reason that we use COPY I think Andrew identified the critical point upthread: they don't try to put an unlimited number of rows into one INSERT, only a megabyte or so's worth. Typical klugy-but-effective mysql design approach ... OK, so given that we don't need to be able to do 1 million multi-targetlist insert statements, here is rev 2 of the patch. It is just slightly more invasive, but performs *much* better. In fact, it can handle as many targetlists as you have memory to deal with. It also deals with DEFAULT values in the targetlist. I've attached a php script that I used to do crude testing. Basically I tested 3 cases in this order: single-INSERT-multi-statement: -- INSERT INTO foo2a (f1,f2) VALUES (1,2); -- repeat statement $loopcount times single-INSERT-at-once: -- INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)... -- build a single SQL string by looping $loopcount times, -- and execute it all at once multi-INSERT-at-once: - INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)... -- build a single SQL string by looping $loopcount times, -- and execute it all at once Here are the results: $loopcount = 10; single-INSERT-multi-statement Elapsed time is 34 seconds single-INSERT-at-once Elapsed time is 7 seconds multi-INSERT-at-once Elapsed time is 4 seconds about 370MB peak memory usage $loopcount = 20; single-INSERT-multi-statement Elapsed time is 67 seconds single-INSERT-at-once Elapsed time is 12 seconds multi-INSERT-at-once Elapsed time is 9 seconds about 750MB peak memory usage $loopcount = 30; single-INSERT-multi-statement Elapsed time is 101 seconds single-INSERT-at-once Elapsed time is 18 seconds multi-INSERT-at-once Elapsed time is 13 seconds about 1.1GB peak memory usage Somewhere beyond this, my machine goes into swap hell, and I didn't have the patience to wait for it to complete :-) It would be interesting to see a side-by-side comparison with MySQL since that seems to be our benchmark on this feature. I'll try to do that tomorrow if no one beats me to it. There is only one downside to the current approach that I'm aware of. The command-result tag is only set by the original query, meaning that even if you insert 300,000 rows using this method, the command-result tag looks like INSERT 0 1; e.g.: regression=# create table foo2(f1 int default 42,f2 int default 6); CREATE TABLE regression=# insert into foo2 (f1,f2) values (default,12),(default,10),(115,21); INSERT 0 1 regression=# select * from foo2; f1 | f2 -+ 42 | 12 42 | 10 115 | 21 (3 rows) Any thoughts on how to fix that? Thanks, Joe Index: src/backend/parser/analyze.c === RCS file: /cvsroot/pgsql/src/backend/parser/analyze.c,v retrieving revision 1.340 diff -c -r1.340 analyze.c *** src/backend/parser/analyze.c 14 Jul 2006 14:52:21 - 1.340 --- src/backend/parser/analyze.c 19 Jul 2006 03:53:35 - *** *** 657,667 } else { /* * For INSERT ... VALUES, transform the given list of values to form a ! * targetlist for the INSERT. */ ! qry-targetList = transformTargetList(pstate, stmt-targetList); } /* --- 657,699 } else { + ListCell *tlr; + /* * For INSERT ... VALUES, transform the given list of values to form a ! * targetlist for the INSERT. In a multi-targetlist INSERT, append all ! * but the first targetlist to extras_after to be processed later by ! * do_parse_analyze */ ! qry-targetList = NIL; ! foreach(tlr, stmt-targetList) ! { ! List *tgtlist = (List *) lfirst(tlr); ! ! if (qry-targetList == NIL) ! { ! /* transform the first targetlist */ ! qry-targetList = transformTargetList(pstate, tgtlist); ! } ! else ! { ! /* ! * Create an InsertStmt node for each additional targetlist ! * and append to extras_after ! */ ! InsertStmt *insnode = makeNode(InsertStmt); ! ! insnode-cols = NIL; ! insnode-targetList = list_make1(tgtlist); ! insnode-selectStmt = NULL; ! insnode-relation = stmt-relation; ! ! if (*extras_after == NIL) ! *extras_after = list_make1(insnode); ! else ! *extras_after = lappend(*extras_after, insnode); ! } ! } } /* Index: src/backend/parser/gram.y === RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v retrieving revision 2.551 diff -c -r2.551 gram.y *** src/backend/parser/gram.y 3 Jul 2006 22:45:39 - 2.551 --- src/backend/parser/gram.y 19 Jul 2006 03:53:40 - *** *** 238,247
Re: [PATCHES] [HACKERS] 8.2 features?
Joe Conway wrote: . multiple values clauses for INSERT The best way might be to fabricate a selectStmt equiv to SELECT targetlist UNION ALL SELECT targetlist..., but that still feels like a hack. Here is a patch pursuant to my earlier post. It has the advantage of being fairly simple and noninvasive. The major downside is that somewhere between 9000 and 1 VALUES-targetlists produces ERROR: stack depth limit exceeded. Perhaps for the typical use-case this is sufficient though. I'm open to better ideas, comments, objections... Thanks, Joe Index: src/backend/parser/gram.y === RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v retrieving revision 2.551 diff -c -r2.551 gram.y *** src/backend/parser/gram.y 3 Jul 2006 22:45:39 - 2.551 --- src/backend/parser/gram.y 18 Jul 2006 04:19:45 - *** *** 238,251 qualified_name_list any_name any_name_list any_operator expr_list attrs target_list update_target_list insert_column_list ! insert_target_list def_list indirection opt_indirection ! group_clause TriggerFuncArgs select_limit ! opt_select_limit opclass_item_list ! transaction_mode_list_or_empty TableFuncElementList prep_type_clause prep_type_list execute_param_clause using_clause %type range into_clause OptTempTableName %type defelt createfunc_opt_item common_func_opt_item --- 238,253 qualified_name_list any_name any_name_list any_operator expr_list attrs target_list update_target_list insert_column_list ! insert_target_els ! def_list indirection opt_indirection group_clause ! TriggerFuncArgs select_limit opt_select_limit ! opclass_item_list transaction_mode_list_or_empty TableFuncElementList prep_type_clause prep_type_list execute_param_clause using_clause + %type node insert_target_list insert_target_lists + %type range into_clause OptTempTableName %type defelt createfunc_opt_item common_func_opt_item *** *** 5349,5360 ; insert_rest: ! VALUES '(' insert_target_list ')' { $$ = makeNode(InsertStmt); $$-cols = NIL; ! $$-targetList = $3; ! $$-selectStmt = NULL; } | DEFAULT VALUES { --- 5351,5370 ; insert_rest: ! VALUES insert_target_lists { $$ = makeNode(InsertStmt); $$-cols = NIL; ! if (((SelectStmt *) $2)-op == SETOP_UNION) ! { ! $$-targetList = NIL; ! $$-selectStmt = $2; ! } ! else ! { ! $$-targetList = ((SelectStmt *) $2)-targetList; ! $$-selectStmt = NULL; ! } } | DEFAULT VALUES { *** *** 5370,5381 $$-targetList = NIL; $$-selectStmt = $1; } ! | '(' insert_column_list ')' VALUES '(' insert_target_list ')' { $$ = makeNode(InsertStmt); $$-cols = $2; ! $$-targetList = $6; ! $$-selectStmt = NULL; } | '(' insert_column_list ')' SelectStmt { --- 5380,5399 $$-targetList = NIL; $$-selectStmt = $1; } ! | '(' insert_column_list ')' VALUES insert_target_lists { $$ = makeNode(InsertStmt); $$-cols = $2; ! if (((SelectStmt *) $5)-op == SETOP_UNION) ! { ! $$-targetList = NIL; ! $$-selectStmt = $5; ! } ! else ! { ! $$-targetList = ((SelectStmt *) $5)-targetList; ! $$-selectStmt = NULL; ! } } | '(' insert_column_list ')' SelectStmt { *** *** 8189,8197 ; insert_target_list: ! insert_target_el { $$ = list_make1($1); } ! | insert_target_list ',' insert_target_el { $$ = lappend($1, $3); } ; insert_target_el: --- 8207,8235 ; + insert_target_lists: + insert_target_list + { + $$ = $1; + } + | insert_target_lists ',' insert_target_list + { + $$ = makeSetOp(SETOP_UNION, TRUE, $1, $3); + } + ; + insert_target_list: ! '(' insert_target_els ')' ! { ! SelectStmt *n = makeNode(SelectStmt); ! n-targetList = $2; ! $$ = (Node *) n; ! } ! ; ! ! insert_target_els: ! insert_target_el { $$ = list_make1($1); } ! | insert_target_els ',' insert_target_el { $$ = lappend($1, $3); } ; insert_target_el: ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org