Re: [HACKERS] COPY and Volatile default expressions
On 16 April 2013 14:37, Simon Riggs wrote: > On 16 April 2013 13:57, Heikki Linnakangas wrote: > >> You still need to check the args, if the function is nextval, otherwise you >> incorrectly perform the optimization for something like >> "nextval(myvolatilefunc())". > > Guess so. At least its an easy change. > > Thanks for checking. Rebased v3 -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services copy_tuning_with_default_nextval.v3.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 24 June 2013 10:21, Kohei KaiGai wrote: > Hi Simon, > > I checked this patch. One thing I could comment on is, do you think it is > a good > idea to have oid of exception function list on > contain_volatile_functions_walker()? > > The walker function is static thus here is no impact for other caller, and > its > "context" argument is unused. > My proposition is to enhance 2nd argument of > contain_volatile_functions_walker() > to deliver list of exceptional functions, then > contain_volatile_functions_not_nextval() > calls contain_volatile_functions_walker() with > list_make1_oid(F_NEXTVAL_OID) to > handle nextval() as exception. > Otherwise, all we need to do is put NIL as 2nd argument. > > It kills code duplication and reduces future maintenance burden. > How about your opinion? > That approach is more flexible than the one in the patch, I agree. Ultimately, I see this as a choice between a special purpose piece of code (as originally supplied) and a much more generic facility for labelling functions as to whether they contain SQL or not, per the SQL standard as Jaime suggests. There's not much mileage in something in between. So I'm mid way through updating the patch to implement the generic facility. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] COPY and Volatile default expressions
Hi Simon, I checked this patch. One thing I could comment on is, do you think it is a good idea to have oid of exception function list on contain_volatile_functions_walker()? The walker function is static thus here is no impact for other caller, and its "context" argument is unused. My proposition is to enhance 2nd argument of contain_volatile_functions_walker() to deliver list of exceptional functions, then contain_volatile_functions_not_nextval() calls contain_volatile_functions_walker() with list_make1_oid(F_NEXTVAL_OID) to handle nextval() as exception. Otherwise, all we need to do is put NIL as 2nd argument. It kills code duplication and reduces future maintenance burden. How about your opinion? Thanks, 2013/4/16 Simon Riggs : > On 16 April 2013 13:57, Heikki Linnakangas wrote: > >> You still need to check the args, if the function is nextval, otherwise you >> incorrectly perform the optimization for something like >> "nextval(myvolatilefunc())". > > Guess so. At least its an easy change. > > Thanks for checking. > > -- > Simon Riggs http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 16 April 2013 15:07, Tom Lane wrote: > Bruce Momjian writes: >> I found Simon's nextval()/COPY timings without this patch sobering. I >> assume he can apply this for 9.3, right? I believe it is a fix for a >> new 9.3 feature. > > It is not a "fix", it is not for a 9.3 feature (the multi-insert thing > went in in 9.2), and personally I'd vote against applying it now, > especially if Simon is expecting anybody to review it. I wrote this expecting it to be a 9.4 feature. I'm good with that. There are other somewhat related optimisations also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Tue, Apr 16, 2013 at 10:07:07AM -0400, Tom Lane wrote: > Bruce Momjian writes: > > I found Simon's nextval()/COPY timings without this patch sobering. I > > assume he can apply this for 9.3, right? I believe it is a fix for a > > new 9.3 feature. > > It is not a "fix", it is not for a 9.3 feature (the multi-insert thing > went in in 9.2), and personally I'd vote against applying it now, > especially if Simon is expecting anybody to review it. Oh, I thought multi-insert was 9.3. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
Bruce Momjian writes: > I found Simon's nextval()/COPY timings without this patch sobering. I > assume he can apply this for 9.3, right? I believe it is a fix for a > new 9.3 feature. It is not a "fix", it is not for a 9.3 feature (the multi-insert thing went in in 9.2), and personally I'd vote against applying it now, especially if Simon is expecting anybody to review it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Tue, Apr 16, 2013 at 02:37:33PM +0100, Simon Riggs wrote: > On 16 April 2013 13:57, Heikki Linnakangas wrote: > > > You still need to check the args, if the function is nextval, otherwise you > > incorrectly perform the optimization for something like > > "nextval(myvolatilefunc())". > > Guess so. At least its an easy change. I found Simon's nextval()/COPY timings without this patch sobering. I assume he can apply this for 9.3, right? I believe it is a fix for a new 9.3 feature. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 16 April 2013 13:57, Heikki Linnakangas wrote: > You still need to check the args, if the function is nextval, otherwise you > incorrectly perform the optimization for something like > "nextval(myvolatilefunc())". Guess so. At least its an easy change. Thanks for checking. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services copy_tuning_with_default_nextval.v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 16.04.2013 14:38, Simon Riggs wrote: On 15 April 2013 21:53, Simon Riggs wrote: So I'll treat this as two separate cases: * add special case for sequences Patch attached. + if (IsA(node, FuncExpr)) + { + FuncExpr *expr = (FuncExpr *) node; + + /* +* For this case only, we want to ignore the volatility of the +* nextval() function, since some callers want this. nextval() +* has no other args other than sequence name, so we can just +* return false immediately in this case. +*/ + if (expr->funcid == F_NEXTVAL_OID) + return false; + else if (func_volatile(expr->funcid) == PROVOLATILE_VOLATILE) + return true; + /* else fall through to check args */ + } ... You still need to check the args, if the function is nextval, otherwise you incorrectly perform the optimization for something like "nextval(myvolatilefunc())". - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 21:53, Simon Riggs wrote: > So I'll treat this as two separate cases: > * add special case for sequences Patch attached. > * use the NO SQL mechanism, as described, which implies no reads or > writes of database state. We could test that, but its somewhat harder > and we'd need to test for that on entry to any function, which I don't > much like. For another time. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services copy_tuning_with_default_nextval.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 21:32, Jaime Casanova wrote: > On Mon, Apr 15, 2013 at 3:21 PM, Tom Lane wrote: >> >> OTOH, the notion that a UUID generator doesn't touch *any* database >> state seems like it might be worth treating as a general function >> property: it's simple to understand and applies to a lot of other >> volatile functions such as random() and clock_timestamp(). >> > > Something like the NO SQL access indication mandated by sql standard? > > http://www.postgresql.org/message-id/1267473390.7837.9.ca...@vanquo.pezone.net That would work for UUIDs, random() etc but not for sequences. So I'll treat this as two separate cases: * add special case for sequences * use the NO SQL mechanism, as described, which implies no reads or writes of database state. We could test that, but its somewhat harder and we'd need to test for that on entry to any function, which I don't much like. Default to current timestamp is also a common use case - thanks for mentioning that. Doing it tha way Tatsuo would be able to parse functions more easily as requested in the linked post. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 4:21 PM, Tom Lane wrote: > I think plenty of people would be upset if row serial numbers assigned > with nextval() were not assigned in the order of the incoming rows. > The argument that you can get gaps in the sequence in some corner cases > (none of which apply within a single COPY operation, btw) doesn't > entitle us to violate the POLA to that extent. I agree. > After looking again at the code Simon is concerned about, though, > whether we are willing to allow volatile function calls to be reordered > has approximately nothing to do with this COPY optimization. Rather, > the thing that makes it safe is that nextval() isn't going to look at > the COPY target table, and thus whether or not the previous rows have > been physically inserted isn't important. The same goes for the UUID > example. So I think he's done himself a disservice by talking about > reordering and bringing up the question of parallel queries. What we > ought to be thinking about is how we can be certain that a function call > isn't going to look at the uncommitted table rows; no more and no less. Yep. > In this context, I think we could do a lot worse than to special-case > nextval(), because it's hard to see a really principled function > attribute definition that would admit it here. It does look at, and > even modify, uncommitted database state. We know it's safe because a > sequence relation couldn't be the target of COPY ... but that reasoning > doesn't fit nicely into anything I think we'd want to expose to users. > > OTOH, the notion that a UUID generator doesn't touch *any* database > state seems like it might be worth treating as a general function > property: it's simple to understand and applies to a lot of other > volatile functions such as random() and clock_timestamp(). I think that's right; and I also think that's something that could be useful in other contexts. It also has the advantage of being a *checkable* property. That is, if a function is marked as changing no database state, we can set a flag on entry into that context and clear the flag on exit. If in the middle any attempt is made to change the database state, then we can throw an error. This would be pretty good insurance against the possibility that future optimizations based around that flag would cause behavioral differences. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 20:52, Robert Haas wrote: > On Mon, Apr 15, 2013 at 11:49 AM, Tom Lane wrote: >>> I claim this is a common class, since sequence next_val functions and >>> uuid generators meet that criteria and most common forms of auditing >>> trigger, as well as any other form of data-reformatting trigger. >> >> I don't believe that it's a good idea to consider nextval() to be >> reorderable, so I'm not convinced by your argument here. > > Why not? > > I admit that I can't convince myself that it's safe. But I can't > think of a concrete example where it breaks anything, either. I'm not sure exactly if the corect label is "reorderable" or some other word that describes this specific situation. "Parallel independent" probably would work, since if we had a DML statement that worked in 2 separate processes we'd probably want to know that the rows in each could be divided up without changing the result. It looks straightforward to put in a special case check for sequences, which is the most important use case. Sequences are a recognised database object, so such a special case could be justified... shame about the other use cases though, e.g. UUIDs. I guess we can generalise it when we have a better idea of whether this does indeed make a useful generalisation. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 3:21 PM, Tom Lane wrote: > > OTOH, the notion that a UUID generator doesn't touch *any* database > state seems like it might be worth treating as a general function > property: it's simple to understand and applies to a lot of other > volatile functions such as random() and clock_timestamp(). > Something like the NO SQL access indication mandated by sql standard? http://www.postgresql.org/message-id/1267473390.7837.9.ca...@vanquo.pezone.net -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitaciĆ³n Phone: +593 4 5107566 Cell: +593 987171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
Robert Haas writes: > On Mon, Apr 15, 2013 at 11:49 AM, Tom Lane wrote: >> I don't believe that it's a good idea to consider nextval() to be >> reorderable, so I'm not convinced by your argument here. > Why not? > I admit that I can't convince myself that it's safe. But I can't > think of a concrete example where it breaks anything, either. I think plenty of people would be upset if row serial numbers assigned with nextval() were not assigned in the order of the incoming rows. The argument that you can get gaps in the sequence in some corner cases (none of which apply within a single COPY operation, btw) doesn't entitle us to violate the POLA to that extent. After looking again at the code Simon is concerned about, though, whether we are willing to allow volatile function calls to be reordered has approximately nothing to do with this COPY optimization. Rather, the thing that makes it safe is that nextval() isn't going to look at the COPY target table, and thus whether or not the previous rows have been physically inserted isn't important. The same goes for the UUID example. So I think he's done himself a disservice by talking about reordering and bringing up the question of parallel queries. What we ought to be thinking about is how we can be certain that a function call isn't going to look at the uncommitted table rows; no more and no less. In this context, I think we could do a lot worse than to special-case nextval(), because it's hard to see a really principled function attribute definition that would admit it here. It does look at, and even modify, uncommitted database state. We know it's safe because a sequence relation couldn't be the target of COPY ... but that reasoning doesn't fit nicely into anything I think we'd want to expose to users. OTOH, the notion that a UUID generator doesn't touch *any* database state seems like it might be worth treating as a general function property: it's simple to understand and applies to a lot of other volatile functions such as random() and clock_timestamp(). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 11:49 AM, Tom Lane wrote: >> I claim this is a common class, since sequence next_val functions and >> uuid generators meet that criteria and most common forms of auditing >> trigger, as well as any other form of data-reformatting trigger. > > I don't believe that it's a good idea to consider nextval() to be > reorderable, so I'm not convinced by your argument here. Why not? I admit that I can't convince myself that it's safe. But I can't think of a concrete example where it breaks anything, either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 07:04:55PM +0100, Simon Riggs wrote: > On 15 April 2013 18:41, David Fetter wrote: > > > The difference between HEAD and patch in the "COPY, with sequence" > > case is pretty remarkable. What's the patch? > > Attached. Thanks! :) > This is usable only for this test. It is not anywhere remotely close > to being applied. Of course. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 18:41, David Fetter wrote: > The difference between HEAD and patch in the "COPY, with sequence" > case is pretty remarkable. What's the patch? Attached. This is usable only for this test. It is not anywhere remotely close to being applied. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services copy_test_kluge.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 06:30:55PM +0100, Simon Riggs wrote: > On 15 April 2013 17:04, Simon Riggs wrote: > > > I will implement as a kluge, test and report the results. > > Test is COPY 1 million rows on a table with 2 columns, both bigint. > Verified no checkpoints triggered during load. > No other work active on database, tests condicted on laptop > Autovacuum disabled. > Results from multiple runs, outliers excluded, rough averages > > HEAD > COPY, with sequence ~5500ms > COPY, with sequence, cached ~5000ms > COPY, no sequence ~1600ms > > PATCH to allow sequences to use multi-insert optimisation (1 line change) > COPY, with sequence ~1850ms > COPY, with sequence, cached ~1750ms > COPY, no sequence ~1600ms > > This shows that > * cacheing the sequence gives a useful improvement currently > * use of multi-insert optimisaton is very important > > Proposals > * set CACHE 100 on automatically created SERIAL sequences > * allow some way to use multi-insert optimisation when default expr is > next_val on a sequence > > Tests performed without indexes since this is another area of known > performance issues that I hope to cover later. Zero indexes is not > real, but we're trying to measure the effect and benefit of an > isolated change, so in this case it is appropriate. The difference between HEAD and patch in the "COPY, with sequence" case is pretty remarkable. What's the patch? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 17:04, Simon Riggs wrote: > I will implement as a kluge, test and report the results. Test is COPY 1 million rows on a table with 2 columns, both bigint. Verified no checkpoints triggered during load. No other work active on database, tests condicted on laptop Autovacuum disabled. Results from multiple runs, outliers excluded, rough averages HEAD COPY, with sequence ~5500ms COPY, with sequence, cached ~5000ms COPY, no sequence ~1600ms PATCH to allow sequences to use multi-insert optimisation (1 line change) COPY, with sequence ~1850ms COPY, with sequence, cached ~1750ms COPY, no sequence ~1600ms This shows that * cacheing the sequence gives a useful improvement currently * use of multi-insert optimisaton is very important Proposals * set CACHE 100 on automatically created SERIAL sequences * allow some way to use multi-insert optimisation when default expr is next_val on a sequence Tests performed without indexes since this is another area of known performance issues that I hope to cover later. Zero indexes is not real, but we're trying to measure the effect and benefit of an isolated change, so in this case it is appropriate. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
David Fetter writes: > On Mon, Apr 15, 2013 at 05:04:16PM +0100, Simon Riggs wrote: >> Loading data into a table with a SERIAL or UUID column is the main >> use case, so I'll measure that. > The former is common enough a use case to optimize specifically, > should the numbers come out right. Yeah. TBH I would rather see a special-case hack in the COPY code to accept nextval() than expose anything as dirty and special-purpose as this proposed flag to users. But in any case, I don't believe that adequate evidence has been offered to show that we should do anything at all here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 04/15/2013 06:04 PM, Simon Riggs wrote: On 15 April 2013 16:55, Tom Lane wrote: Simon Riggs writes: On 15 April 2013 16:24, David Fetter wrote: Do you have numbers on this, or ways to gather same? In other words, how do we know what resources (time, CPU cycles, disk seeks, etc.) are being consumed here? The multi-insert optimisation for COPY is already there and works well enough to have been committed. You seem to not have answered the question. Exactly what sort of performance gain might be possible, bearing in mind that anything that invokes a trigger (for instance) is unlikely to be amazingly fast anyway? Forgive me, I assumed the list would be familiar with the optimization and so be excited by the need for this. I will implement as a kluge, test and report the results. Would just declaring nextval() to be a stable function be a good test ? Hannu Loading data into a table with a SERIAL or UUID column is the main use case, so I'll measure that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 17:08, David Fetter wrote: >> Loading data into a table with a SERIAL or UUID column is the main >> use case, so I'll measure that. > > The former is common enough a use case to optimize specifically, > should the numbers come out right. Do you suppose that an in-core > UUID generator would help the latter make more sense as a part of the > same use case? Only if some form of labelling becomes an issue, but I'm ever hopeful. Let's wait for the test results now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 05:04:16PM +0100, Simon Riggs wrote: > On 15 April 2013 16:55, Tom Lane wrote: > > Simon Riggs writes: > >> On 15 April 2013 16:24, David Fetter wrote: > >>> Do you have numbers on this, or ways to gather same? In other > >>> words, how do we know what resources (time, CPU cycles, disk > >>> seeks, etc.) are being consumed here? > > > >> The multi-insert optimisation for COPY is already there and works > >> well enough to have been committed. > > > > You seem to not have answered the question. Exactly what sort of > > performance gain might be possible, bearing in mind that anything > > that invokes a trigger (for instance) is unlikely to be amazingly > > fast anyway? > > Forgive me, I assumed the list would be familiar with the > optimization and so be excited by the need for this. > > I will implement as a kluge, test and report the results. > > Loading data into a table with a SERIAL or UUID column is the main > use case, so I'll measure that. The former is common enough a use case to optimize specifically, should the numbers come out right. Do you suppose that an in-core UUID generator would help the latter make more sense as a part of the same use case? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 11:49:42AM -0400, Tom Lane wrote: > Simon Riggs writes: > > COPY cannot be optimised correctly if we have before triggers or > > volatile default expressions. > > > The multi-insert code detects those cases and falls back to the single > > row mechanism in those cases. > > > There a common class of volatile functions that wouldn't cause > > problems: any volatile function that doesn't touch the table being > > loaded and still works correctly when called with alternately ordered > > data. > > > I claim this is a common class, since sequence next_val functions and > > uuid generators meet that criteria and most common forms of auditing > > trigger, as well as any other form of data-reformatting trigger. > > I don't believe that it's a good idea to consider nextval() to be > reorderable, so I'm not convinced by your argument here. We tell people very clearly in the docs and elsewhere that nextval() guarantees uniqueness and very specifically not ordering, so with greatest respect, I agree with Simon on its reorderability. > > What I'd like to do is to invent a new form of labelling that > > allows us to understand that COPY can still be optimised. > > And I don't want to invent impossible-to-verify function attributes > with such a tiny use-case as this. Are you referring to the Halting Problem? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 16:55, Tom Lane wrote: > Simon Riggs writes: >> On 15 April 2013 16:24, David Fetter wrote: >>> Do you have numbers on this, or ways to gather same? In other words, >>> how do we know what resources (time, CPU cycles, disk seeks, etc.) are >>> being consumed here? > >> The multi-insert optimisation for COPY is already there and works well >> enough to have been committed. > > You seem to not have answered the question. Exactly what sort of > performance gain might be possible, bearing in mind that anything that > invokes a trigger (for instance) is unlikely to be amazingly fast > anyway? Forgive me, I assumed the list would be familiar with the optimization and so be excited by the need for this. I will implement as a kluge, test and report the results. Loading data into a table with a SERIAL or UUID column is the main use case, so I'll measure that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
Simon Riggs writes: > On 15 April 2013 16:24, David Fetter wrote: >> Do you have numbers on this, or ways to gather same? In other words, >> how do we know what resources (time, CPU cycles, disk seeks, etc.) are >> being consumed here? > The multi-insert optimisation for COPY is already there and works well > enough to have been committed. You seem to not have answered the question. Exactly what sort of performance gain might be possible, bearing in mind that anything that invokes a trigger (for instance) is unlikely to be amazingly fast anyway? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 16:41, Heikki Linnakangas wrote: >> What I'd like to do is to invent a new form of labelling that allows >> us to understand that COPY can still be optimised. > > It would be even nicer to detect at runtime, when a default expression or > before trigger tries to access the same table. When that happens, we could > immediately flush all the tuples buffered that far to disk, so that they are > visible to the expression, and then proceed with it. Maybe, but do we really want an extra test every time we access a table? And if we did that, how would we pass control back to the COPY command to allow it flush the buffer before continuing with the function? How would we cope with times when a subtle problem makes a function unusable, even though it passes automatic detection? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
Simon Riggs writes: > COPY cannot be optimised correctly if we have before triggers or > volatile default expressions. > The multi-insert code detects those cases and falls back to the single > row mechanism in those cases. > There a common class of volatile functions that wouldn't cause > problems: any volatile function that doesn't touch the table being > loaded and still works correctly when called with alternately ordered > data. > I claim this is a common class, since sequence next_val functions and > uuid generators meet that criteria and most common forms of auditing > trigger, as well as any other form of data-reformatting trigger. I don't believe that it's a good idea to consider nextval() to be reorderable, so I'm not convinced by your argument here. > What I'd like to do is to invent a new form of labelling that allows > us to understand that COPY can still be optimised. And I don't want to invent impossible-to-verify function attributes with such a tiny use-case as this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15 April 2013 16:24, David Fetter wrote: >> I claim this is a common class, since sequence next_val functions and >> uuid generators meet that criteria and most common forms of auditing >> trigger, as well as any other form of data-reformatting trigger. Since >> this is a common case, it seems worth optimising. > > Do you have numbers on this, or ways to gather same? In other words, > how do we know what resources (time, CPU cycles, disk seeks, etc.) are > being consumed here? The multi-insert optimisation for COPY is already there and works well enough to have been committed. All we have to do to allow it to be used is to persuade COPY that come kinds of volatile function need not prevent the optimisation. So once we have a mechanism for appropriately labelling a function, it will be a one-line change in copy.c to enable the optimisation. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On 15.04.2013 17:00, Simon Riggs wrote: COPY cannot be optimised correctly if we have before triggers or volatile default expressions. The multi-insert code detects those cases and falls back to the single row mechanism in those cases. There a common class of volatile functions that wouldn't cause problems: any volatile function that doesn't touch the table being loaded and still works correctly when called with alternately ordered data. I claim this is a common class, since sequence next_val functions and uuid generators meet that criteria and most common forms of auditing trigger, as well as any other form of data-reformatting trigger. Since this is a common case, it seems worth optimising. What I'd like to do is to invent a new form of labelling that allows us to understand that COPY can still be optimised. It would be even nicer to detect at runtime, when a default expression or before trigger tries to access the same table. When that happens, we could immediately flush all the tuples buffered that far to disk, so that they are visible to the expression, and then proceed with it. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY and Volatile default expressions
On Mon, Apr 15, 2013 at 03:00:34PM +0100, Simon Riggs wrote: > COPY cannot be optimised correctly if we have before triggers or > volatile default expressions. > > The multi-insert code detects those cases and falls back to the > single row mechanism in those cases. > > There a common class of volatile functions that wouldn't cause > problems: any volatile function that doesn't touch the table being > loaded and still works correctly when called with alternately > ordered data. "Doesn't touch already-existing rows?" Makes a lot of sense :) > I claim this is a common class, since sequence next_val functions and > uuid generators meet that criteria and most common forms of auditing > trigger, as well as any other form of data-reformatting trigger. Since > this is a common case, it seems worth optimising. Do you have numbers on this, or ways to gather same? In other words, how do we know what resources (time, CPU cycles, disk seeks, etc.) are being consumed here? > What I'd like to do is to invent a new form of labelling that allows > us to understand that COPY can still be optimised. I'm thinking to add > a new function label, something like one of > * IDEMPOTENT > * ORDER INDEPENDENT > * BATCHABLE > * NON SELF REFERENCING > * GO FASTER DAMMIT > etc > > I'm sure many people will have a much more exact description and a > better name than I do. > > This becomes more important when we think about parallelising SQL, > since essentially the same problem exists with parallel SQL calling > volatile functions. Oracle handles that by having a pragma to allow a > function to be declared as parallel safe. What happens when you misinform Oracle about this? Does it attempt to check? More importantly, what *should* happen? > I was also thinking that the best way to do this would be to invent a > new flexible function labelling scheme, so use something like hstore > to store a list of function attributes. Something that would mean we > don't have to invent new keywords every time we have a new function > label. > > Suggestions please. JSON's in core. How about using that? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] COPY and Volatile default expressions
COPY cannot be optimised correctly if we have before triggers or volatile default expressions. The multi-insert code detects those cases and falls back to the single row mechanism in those cases. There a common class of volatile functions that wouldn't cause problems: any volatile function that doesn't touch the table being loaded and still works correctly when called with alternately ordered data. I claim this is a common class, since sequence next_val functions and uuid generators meet that criteria and most common forms of auditing trigger, as well as any other form of data-reformatting trigger. Since this is a common case, it seems worth optimising. What I'd like to do is to invent a new form of labelling that allows us to understand that COPY can still be optimised. I'm thinking to add a new function label, something like one of * IDEMPOTENT * ORDER INDEPENDENT * BATCHABLE * NON SELF REFERENCING * GO FASTER DAMMIT etc I'm sure many people will have a much more exact description and a better name than I do. This becomes more important when we think about parallelising SQL, since essentially the same problem exists with parallel SQL calling volatile functions. Oracle handles that by having a pragma to allow a function to be declared as parallel safe. I was also thinking that the best way to do this would be to invent a new flexible function labelling scheme, so use something like hstore to store a list of function attributes. Something that would mean we don't have to invent new keywords every time we have a new function label. Suggestions please. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers