Re: [PATCHES] posix advises ...
Greg Smith wrote:
> On Thu, 19 Jun 2008, Zoltan Boszormenyi wrote:
>> This patch (revisited and ported to current CVS HEAD) is indeed using Greg's original patch, and also adds another patch written by Mark Wong that helps evict closed XLOGs from memory faster.
> Great, that will save me some trouble. I've got a stack of Linux performance testing queued up (got stuck behind a kernel bug impacting pgbench) for the next couple of weeks, and I'll include this in that testing. I think I've got a similar class of hardware as you tested on for initial evaluation -- I'm getting around 200MB/s sequential I/O right now out of my small RAID setup. I added your patch to the queue for next month's CommitFest and listed myself as the initial reviewer, but a commit that soon is unlikely. Performance tests like this usually take a while to converge, and since this is using a less popular API I expect a round of portability concerns, too. Where did Mark's patch come from? I'd like to be able to separate out that change from the rest if necessary.

That patch was posted here:
http://archives.postgresql.org/pgsql-patches/2008-03/msg0.php

> Also, if you have any specific test cases you ran that I could start by trying to replicate a speedup on, those would be handy as well.

We experienced synchronized seqscans slowing down beyond some (10+) number of clients, which seems strange, as they should be a strong selling point of 8.3. With the posix_fadvise() patches, the dropoff point was pushed further out. The query involved multiple tables; it was not a trivial single-table seqscan case. Without the patch, both a simpler SATA system (each disk at ~63MB/sec) and a hardware RAID doing 400+ MB/sec showed the slowdown. The initial 60-63MB/sec with 1-3 clients on the single-SATA-disk system quickly dropped to 11-17MB/sec with more clients. With the patch, it only dropped to 40-47MB/sec. I cannot recall the RAID figures.
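The kernel hint being tested here can be sketched in Python, which exposes the same syscall; this is only an illustration of the idea (the real patch is C inside the backend's sequential-scan path), not the patch itself:

```python
import os

def sequential_read(path, chunk_size=8192):
    # Minimal sketch: tell the kernel the file will be read sequentially,
    # so it can read ahead more aggressively, then do a plain linear read.
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is advisory; if the platform lacks it or the call
        # fails, we simply fall back to an unhinted sequential read.
        if hasattr(os, "posix_fadvise"):
            try:
                size = os.fstat(fd).st_size
                os.posix_fadvise(fd, 0, size, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Because the advice is purely a hint, correctness is unchanged either way; only the read-ahead behaviour (and hence throughput under concurrent scans) differs.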
--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] posix advises ...
Greg Smith wrote:
> On Sun, 11 May 2008, Hans-Juergen Schoenig wrote:
>> we also made some simple autoconf hack to check for broken posix_fadvise.
> Because of how you did that, your patch is extremely difficult to even test. You really should at least scan the output from diff you're about to send before submitting a patch to make sure it makes sense -- yours is over 30,000 lines long just to implement a small improvement. The reason for that is that you regenerated configure using a later version of Autoconf than the official distribution, and the diff for the result is gigantic. You even turned off the check in configure.in that specifically prevents you from making that mistake, so it's not like you weren't warned.

Sorry, that old autoconf version was nowhere to be found for Fedora 8. Fortunately, PostgreSQL 8.4 has already switched to autoconf 2.61... :-)

> You should just show the necessary modifications to configure.in in your patch, certainly shouldn't submit a patch that subverts the checks there, and leave out the resulting configure file if you didn't use the same version of Autoconf. I find the concept behind this patch very useful and I'd like to see a usable one re-submitted. I'm in the middle of setting up some new hardware this month and was planning to test the existing fadvise patches Greg Stark sent out as part of that.

This patch (revisited and ported to current CVS HEAD) is indeed using Greg's original patch, and also adds another patch written by Mark Wong that helps evict closed XLOGs from memory faster. Our additions are:
- advise POSIX_FADV_SEQUENTIAL for seqscans
- a configure check
- small documentation for Greg's patch and its GUC
- adapting ginget.c, which recently started using tbm_iterate()

The configure check implicitly assumes that segfaults (which are reported as exit code 139 here) can be detected. IIRC, Tom Lane mentioned that mismatched glibc and Linux kernel versions may segfault when posix_fadvise() is called (the kernel lacked the feature, and glibc erroneously assumed it could use it).

> Having another one to test for accelerating multiple sequential scans would be extremely helpful to add to that, because then I think I can reuse some of the test cases Jeff Davis put together for the 8.3 improvements in that area to see how well it works. It wasn't as clear to me how to test the existing bitmap scan patch; yours seems a more straightforward patch to use as a testing ground for fadvise effectiveness.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

preread-seq-tunable-8.4-v4.diff.gz
Description: Unix tar archive
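The configure-time probe described above can be sketched like this; it is a hypothetical Python stand-in for the autoconf test, not the actual check in the patch. The key point is that the call must run in a child process, because on a broken glibc/kernel pair posix_fadvise() itself may segfault:

```python
import signal
import subprocess
import sys

def fadvise_is_usable():
    # Probe posix_fadvise() in a child process. If the call crashes,
    # the child dies from SIGSEGV (shell exit status 139 = 128 + 11)
    # instead of taking the probing process down with it.
    probe = (
        "import os, tempfile\n"
        "f = tempfile.TemporaryFile()\n"
        "os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)\n"
    )
    result = subprocess.run([sys.executable, "-c", probe])
    if result.returncode == -signal.SIGSEGV:
        return False  # the child crashed: posix_fadvise is unusable here
    return result.returncode == 0
```

A plain error (e.g. the function not existing at all) also reports the call as unusable, but via a normal nonzero exit rather than a crash.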
Re: [PATCHES] WITH RECURSIVE patch V0.1
Gregory Stark wrote:
> This is indeed really cool. I'm sorry I haven't gotten to doing what I promised in this area, but I'm glad it's happening anyway.
>
> Zoltan Boszormenyi [EMAIL PROTECTED] writes:
>> Can we get the rows in tree order, please? ... After all, I didn't specify any ORDER BY clause in the base, recursive or final queries.
> The standard has a clause to specify depth-first order. However, doing a depth-first traversal would necessitate quite a different-looking plan, and it's far less obvious (to me, anyway) how to do it.

That would be even cooler to have implemented as well.

>> Also, it seems there is no infinite recursion detection:
>>
>> # with recursive x(level, parent, child) as (
>>     select 1::integer, * from test_connect_by where parent is null
>>     union all
>>     select x.level + 1, base.* from test_connect_by as base, x where base.child = x.child
>>   ) select * from x;
>> ... it waits and waits and waits ...
> Well, psql might wait and wait, but it's actually receiving rows. A cleverer client should be able to deal with infinite streams of records.

I think it's the other way around: the server should not emit an infinite number of records.

> I think DB2 does produce a warning if there is no clause it can determine will bound the results. But that's not actually reliable. It's quite possible to have clauses which will limit the output, but not in a way the database can determine. Consider, for example, a tree traversal for a binary tree stored in a recursive table reference. The DBA might know that the data contains no loops, but the database doesn't.

Well, a maintenance resjunk column could be used, like the branch column in tablefunc's connectby().

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] WITH RECURSIVE patch V0.1
Yoshiyuki Asaba wrote:
> Hi,
>
> From: Zoltan Boszormenyi [EMAIL PROTECTED]
> Subject: Re: [PATCHES] WITH RECURSIVE patch V0.1
> Date: Mon, 19 May 2008 08:19:17 +0200
>>>> Also, it seems there is no infinite recursion detection:
>>>>
>>>> # with recursive x(level, parent, child) as (
>>>>     select 1::integer, * from test_connect_by where parent is null
>>>>     union all
>>>>     select x.level + 1, base.* from test_connect_by as base, x where base.child = x.child
>>>>   ) select * from x;
>>>> ... it waits and waits and waits ...
>>> Well, psql might wait and wait, but it's actually receiving rows. A cleverer client should be able to deal with infinite streams of records.
>> I think it's the other way around. The server should not emit an infinite number of records.
> How about adding a new GUC parameter, max_recursive_call?

Yes, why not? MSSQL has a similar MAXRECURSION hint for recursive WITH queries, according to their docs:
http://msdn.microsoft.com/en-us/library/ms186243.aspx

> Regards,
> --
> Yoshiyuki Asaba [EMAIL PROTECTED]

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
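The effect of such a cap can be sketched with a level-at-a-time fixpoint loop, which is how WITH RECURSIVE ... UNION ALL is evaluated. Note that max_recursive_call here is the *proposed* GUC name from this thread, not an existing parameter:

```python
def recursive_cte(base_rows, step, max_recursive_call=100):
    # Evaluate a recursive CTE: start with the base query's rows, then
    # repeatedly apply `step` (the recursive member, i.e. the join with
    # the previous level) until it yields no new rows -- or until the
    # recursion cap trips, which is what stops cyclic data from looping
    # forever, much like MS SQL Server's MAXRECURSION hint.
    result = list(base_rows)
    working = list(base_rows)
    level = 0
    while working:
        level += 1
        if level > max_recursive_call:
            raise RuntimeError("recursion limit exceeded")
        working = step(working)  # next level, computed from the last one
        result.extend(working)
    return result
```

A terminating query such as `SELECT 1 UNION ALL SELECT n+1 FROM t WHERE n < 5` finishes normally; the self-joining `base.child = x.child` query from the thread reproduces the same rows on every level, so only the cap stops it.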
Re: [PATCHES] [HACKERS] WITH RECURSIVE patch V0.1
Martijn van Oosterhout wrote:
> On Mon, May 19, 2008 at 08:19:17AM +0200, Zoltan Boszormenyi wrote:
>>> The standard has a clause to specify depth-first order. However, doing a depth-first traversal would necessitate quite a different-looking plan, and it's far less obvious (to me, anyway) how to do it.
>> That would be even cooler to have implemented as well.
> From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO. However, just looking at the plan I don't know whether it could support that kind of usage. At the very least I don't think the standard tuplestore code can handle it.
>>> Well, psql might wait and wait, but it's actually receiving rows. A cleverer client should be able to deal with infinite streams of records.
>> I think it's the other way around. The server should not emit an infinite number of records.
> The server won't, the universe will end first.

The universe is alive and well, thank you. :-) But you are right that the server won't emit an infinite number of records: given that the implementation uses a tuplestore rather than producing the tuple slots on the fly, the server goes OOM first, not the psql client; I watched them both in 'top'. It just takes a bit of time.

> This is a nice example of the halting problem:
> http://en.wikipedia.org/wiki/Halting_problem
> which was proved unsolvable a long time ago.

Hmpf, yes, I have forgotten too much about Turing machines since university. :-(

Have a nice day,

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] [HACKERS] WITH RECURSIVE patch V0.1
Martijn van Oosterhout wrote:
> On Mon, May 19, 2008 at 08:19:17AM +0200, Zoltan Boszormenyi wrote:
>>> The standard has a clause to specify depth-first order. However, doing a depth-first traversal would necessitate quite a different-looking plan, and it's far less obvious (to me, anyway) how to do it.
>> That would be even cooler to have implemented as well.
> From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO.

Are you sure? I think a LIFO tuplestore would simply return a reversed breadth-first order. Depth-first means that for every new record you first descend into another level of recursion, and only then continue with the next record to the right.

> However, just looking at the plan I don't know whether it could support that kind of usage. At the very least I don't think the standard tuplestore code can handle it.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] [HACKERS] WITH RECURSIVE patch V0.1
Martijn van Oosterhout wrote:
> On Mon, May 19, 2008 at 11:56:17AM +0200, Zoltan Boszormenyi wrote:
>>> From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO.
>> Are you sure? I think a LIFO tuplestore would simply return a reversed breadth-first order. Depth-first means that for every new record you first descend into another level of recursion, and only then continue with the next record to the right.
> Say your tree looks like:
>
> Root -> A, D
> A -> B, C
> D -> E, F
>
> LIFO pushes A and D. It then pops A and pushes B and C. B and C have no children and are returned. Then D is popped and E and F pushed. So the returned order is: A, B, C, D, E, F. You could also do B, C, A, E, F, D if you wanted.
>
> FIFO pushes A and D. It then pops A and puts B and C at *the end*. It then pops D and pushes E and F at the end. So you get the order A, D, B, C, E, F.
>
> Hope this helps,

Thanks, I didn't consider popping elements off while processing. However, if the toplevel query returns tuples in A, D order, you need a positioned insert into the tuplestore, because a plain LIFO would pop D first. Say, a "treestore" would work this way:

1. setup: treestore is empty, storage_position := 0
2. treestore_puttupleslot() adds a slot at the current position, storage_position++
3. treestore_gettupleslot() removes a slot from the beginning, storage_position := 0

This works easily for in-memory lists, but it's not obvious to me how it would work with disk-backed temporary storage inside PG.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
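The FIFO/LIFO discussion above can be made concrete with a small sketch, where a deque stands in for the tuplestore. Pushing each node's children right-to-left in the LIFO case is what makes the leftmost child pop first, yielding the pre-order walk Martijn describes rather than a reversed breadth-first order:

```python
from collections import deque

def traverse(children, root, depth_first):
    # children maps a node to its ordered child list. FIFO consumption
    # gives the breadth-first order the current patch produces; LIFO
    # consumption (with children pushed reversed) gives pre-order
    # depth-first traversal.
    store = deque()

    def push(nodes):
        store.extend(reversed(nodes) if depth_first else nodes)

    push(children.get(root, []))
    order = []
    while store:
        node = store.pop() if depth_first else store.popleft()
        order.append(node)
        push(children.get(node, []))
    return order
```

With the thread's tree (Root -> A, D; A -> B, C; D -> E, F), the FIFO walk yields A, D, B, C, E, F and the LIFO walk yields A, B, C, D, E, F, matching the orders worked out above.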
Re: [PATCHES] [HACKERS] WITH RECURSIVE patch V0.1
Gregory Stark wrote:
> Martijn van Oosterhout [EMAIL PROTECTED] writes:
>> From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO.
> I think it's not so simple. How do you reconcile that concept with join plans like merge join or hash join, which expect you to be able to process the records in a specific order? It sounds like you might have to keep around a stack of started executor nodes or something, but hopefully we can avoid anything like that because, well, ick.

If I understand the code right, the recursion from level N to level N+1 goes like this: collect all records from level N and JOIN them with the recursive query. This way we get all level 1 records from the base query, then all records at the second level, etc. This is how it gets the breadth-first ordering.

Depth-first ordering could go like this: fetch only one record from the current level, then descend into the recursion; repeat until there are no more records in the current level. The only difference would be more recursion steps: instead of one per level, there would be N per level if there are N tuples in the current level. Definitely slower than the current implementation, but comparable with the tablefunc.c connectby() code.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] WITH RECURSIVE patch V0.1
David Fetter wrote:
> On Sun, May 18, 2008 at 08:51:29PM +0900, Tatsuo Ishii wrote:
>> WITH RECURSIVE patch V0.1
>>
>> Here are patches to implement the WITH RECURSIVE clause. There are some limitations and TODO items (see the Current limitations section below). Comments are welcome.
>>
>> 1. Credit
>>
>> These patches were developed by Yoshiyuki Asaba ([EMAIL PROTECTED]) with some discussions with Tatsuo Ishii ([EMAIL PROTECTED]).
> This is really great! Kudos to all who made this happen :)
>
> I tried a bunch of different queries, and so far, only these two haven't worked. Any ideas what I'm doing wrong here?
>
> WITH RECURSIVE t(n) AS (
>     SELECT 1
>     UNION ALL
>     SELECT n+1 FROM t WHERE n < 100
> )
> SELECT * FROM t;
> ERROR: cannot extract attribute from empty tuple slot
>
> WITH RECURSIVE t(n) AS (
>     VALUES (1)
>     UNION ALL
>     SELECT n+1 FROM t WHERE n < 100
> )
> SELECT * FROM t;
> ERROR: cannot extract attribute from empty tuple slot
>
> Cheers,
> David.

Here's a test case attached, shamelessly stolen from
http://www.adp-gmbh.ch/ora/sql/connect_by.html

This query (without naming toplevel columns) works:

# with recursive x as
  (select * from test_connect_by where parent is null
   union all
   select base.* from test_connect_by as base, x where base.parent = x.child)
  select * from x;
 parent | child
--------+-------
        |    38
        |    26
        |    18
     18 |    11
     18 |     7
     26 |    13
     26 |     1
     26 |    12
     38 |    15
     38 |    17
     38 |     6
     17 |     9
     17 |     8
     15 |    10
     15 |     5
      5 |     2
      5 |     3
(17 rows)

It even works when I add my level column:

# with recursive x(level, parent, child) as
  (select 1::bigint, * from test_connect_by where parent is null
   union all
   select x.level + 1, base.* from test_connect_by as base, x where base.parent = x.child)
  select * from x;
 level | parent | child
-------+--------+-------
     1 |        |    38
     1 |        |    26
     1 |        |    18
     2 |     18 |    11
     2 |     18 |     7
     2 |     26 |    13
     2 |     26 |     1
     2 |     26 |    12
     2 |     38 |    15
     2 |     38 |    17
     2 |     38 |     6
     3 |     17 |     9
     3 |     17 |     8
     3 |     15 |    10
     3 |     15 |     5
     4 |      5 |     2
     4 |      5 |     3
(17 rows)

But I have a little problem with the output. In case it's not obvious, here is the query tweaked a little:

# with recursive x(level, parent, child) as
  (select 1::integer, * from test_connect_by where parent is null
   union all
   select x.level + 1, base.* from test_connect_by as base, x where base.parent = x.child)
  select lpad(' ', 4*level - 1) || child from x;
 ?column?
----------
    38
    26
    18
        11
        7
        13
        1
        12
        15
        17
        6
            9
            8
            10
            5
                2
                3
(17 rows)

Can we get the rows in tree order, please? I.e. something like this:

 ?column?
----------
    38
        15
            10
            5
                2
                3
        17
            9
            8
        6
    26
        13
        1
        12
    18
        11
        7
(17 rows)

After all, I didn't specify any ORDER BY clause in the base, recursive or final queries.

Also, it seems there is no infinite recursion detection:

# with recursive x(level, parent, child) as (
    select 1::integer, * from test_connect_by where parent is null
    union all
    select x.level + 1, base.* from test_connect_by as base, x where base.child = x.child
  ) select * from x;
... it waits and waits and waits ...

Also, there's another rough edge:

# with recursive x as (select * from test_connect_by where parent is null) select * from x;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

create table test_connect_by (
    parent integer,
    child integer,
    constraint uq_tcb unique (child)
);
insert into test_connect_by values ( 5, 2);
insert into test_connect_by values ( 5, 3);
insert into test_connect_by values (18,11);
insert into test_connect_by values (18, 7);
insert into test_connect_by values (17, 9);
insert into test_connect_by values (17, 8);
insert into test_connect_by values (26,13);
insert into test_connect_by values (26, 1);
insert into test_connect_by values (26,12);
insert into test_connect_by values (15,10);
insert into test_connect_by values (15, 5);
insert into test_connect_by values (38,15);
insert into test_connect_by values (38,17);
insert into test_connect_by values (38, 6);
insert into test_connect_by values (null, 38);
insert into test_connect_by values (null, 26);
insert into test_connect_by values
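The "tree order" being asked for in this thread is just a depth-first walk of the (parent, child) rows. Until the server can produce it, it can be computed client-side; this sketch mirrors the test_connect_by layout, with SQL NULL parents represented as None:

```python
def tree_order(rows):
    # Build a parent -> [children] map (children keep their input order),
    # then emit (level, child) pairs in pre-order depth-first "tree order",
    # starting from the roots (rows whose parent is None).
    children = {}
    for parent, child in rows:
        children.setdefault(parent, []).append(child)
    out = []

    def walk(node, level):
        for child in children.get(node, []):
            out.append((level, child))
            walk(child, level + 1)

    walk(None, 1)
    return out
```

The level values correspond to the CTE's level column, so the same `lpad`-style indentation can be reproduced from them.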
Re: [PATCHES] [HACKERS] TRUNCATE TABLE with IDENTITY
Zoltan Boszormenyi wrote:
> Zoltan Boszormenyi wrote:
>> Decibel! wrote:
>>> On Apr 3, 2008, at 12:52 AM, Zoltan Boszormenyi wrote:
>>>> Where is the info in the sequence to provide restarting with the _original_ start value?
>>> There isn't any. If you want the sequence to start at some magic value, adjust the minimum value.
>> There's the START WITH option for IDENTITY columns, and this below is paragraph 8 under General Rules of 14.10 "truncate table statement" in 6WD2_02_Foundation_2007-12.pdf (page 902):
>>
>> 8) If RESTART IDENTITY is specified and the table descriptor of T includes a column descriptor IDCD of an identity column, then:
>> a) Let CN be the column name included in IDCD and let SV be the start value included in IDCD.
>> b) The following alter table statement is effectively executed without further Access Rule checking:
>>    ALTER TABLE TN ALTER COLUMN CN RESTART WITH SV
>>
>> This says that the original start value is used, not the minimum value. IDENTITY has the same options as CREATE SEQUENCE. In fact, the identity column specification links to 11.63 "sequence generator definition" when it comes to IDENTITY sequence options. And surprise, surprise: 11.64 "alter sequence generator statement" now defines ALTER SEQUENCE sn RESTART [WITH newvalue], where omitting the WITH newvalue part also uses the original start value.
>>
>> Best regards, Zoltán Böszörményi
> The attached patch implements the extension found in the current SQL200n draft, implementing a stored start value and supporting ALTER SEQUENCE seq RESTART; Some error checks are also added to prohibit CREATE SEQUENCE ... RESTART ... and ALTER SEQUENCE ... START ...
>
> Best regards, Zoltán Böszörményi

The updated patch implements TRUNCATE ... RESTART IDENTITY, which restarts all owned sequences of the truncated table(s). Regression tests are updated and documentation added. pg_dump was also extended to output the original[1] START value when creating SEQUENCEs.

[1] For 8.3 and below I could only guesstimate it as MINVALUE for ascending and MAXVALUE for descending sequences.
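The behaviour the draft standard mandates can be shown with a toy sequence model; this is only a sketch of the semantics the patch implements (a stored start value that a bare RESTART returns to), not PostgreSQL code:

```python
class Sequence:
    # Toy model of the SQL200n draft behaviour: the original START WITH
    # value is stored, and a bare RESTART (as used by TRUNCATE ...
    # RESTART IDENTITY) goes back to it rather than to the minimum value.
    def __init__(self, start_with=1, increment_by=1):
        self.start_value = start_with      # kept, per the draft standard
        self.increment_by = increment_by
        self.next_value = start_with

    def nextval(self):
        value = self.next_value
        self.next_value += self.increment_by
        return value

    def restart(self, with_value=None):
        # ALTER SEQUENCE ... RESTART [WITH n]: omitting WITH reuses the
        # stored original start value.
        self.next_value = self.start_value if with_value is None else with_value
```

Note how restarting without an argument returns to start_value even if the minimum value is different, which is exactly the distinction the quoted paragraph 8 draws.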
Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

sql2008-compliant-seq-v2.patch.gz
Description: Unix tar archive
Re: [PATCHES] float4/float8/int64 passed by value with tsearch fixup
Gregory Stark wrote:
> Tom Lane [EMAIL PROTECTED] writes:
>> BTW, I trolled the contrib files for other v0 functions taking or returning float4 or float8. I found seg_size (fixed it) and a whole bunch of functions in earthdistance. Those use float8 not float4, so they are not broken by this patch, but that module will have to be v1-ified before we can consider applying the other part of the patch.
> So are you killing V0 for non-integral types? Because if not, we should keep some sacrificial module in the regression tests to use to test for this problem.
> I don't see any way not to kill v0 for non-integral types if we want to make float4 and float8 pass-by-value. We could leave those pass-by-reference and just make bigint pass-by-value.

Just a note: the time[stamp[tz]] types also depend on either float8 or int64, and they have to use the same pass-by-value or pass-by-reference convention as their base storage type. There were crashes or regression failures otherwise.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] float4/float8/int64 passed by value with tsearch fixup
Alvaro Herrera wrote:
> Tom Lane wrote:
>> Specifically, I think what you missed is that on some platforms C functions pass or return float values differently from similar-sized integer or pointer values (typically, the float values get passed in floating-point registers).
> Argh ... I would have certainly missed that.
>> It'd be less ugly to convert to the v1 calling convention.
> Okay -- I'll change contrib/seg to v1 to greenify back the buildfarm.
>> So I think we really do need to offer that compile-time option. Failing to do so will be tantamount to desupporting v0 functions altogether, which I don't think we're prepared to do.
> I have asked the Cybertec guys for a patch. Since it's basically a copy of the float8 change, it should be easy to do.

Here's the patch (against current CVS) with the required change. Please review; it passed make check with both --enable-float4-byval and --disable-float4-byval.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

01-pg84-passedbyval-float4-config.patch.gz
Description: Unix tar archive
Re: [PATCHES] float4/float8/int64 passed by value with tsearch fixup
Alvaro Herrera wrote:
> Zoltan Boszormenyi wrote:
>> Here's the patch (against current CVS) with the required change. Please review; it passed make check with both --enable-float4-byval and --disable-float4-byval.
> Does it pass make installcheck in contrib? I'm worried about btree_gist in particular.

It seems to pass, see below.

> Perhaps the change I introduced in the previous revision needs to be #ifdef'd out?

Both --enable-float4-byval and --disable-float4-byval produced only this regression, but it seems to be a tuple order difference:

============== running regression test queries ==============
test tsearch2 ... FAILED

$ cat tsearch2/regression.diffs
*** ./expected/tsearch2.out Tue Nov 20 05:23:10 2007
--- ./results/tsearch2.out Sat Apr 19 20:48:16 2008
***************
*** 1490,1497 ****
  w0 | 29 | 29
  y9 | 29 | 29
  zm | 29 | 29
- zs | 29 | 29
  zy | 29 | 29
  ax | 28 | 28
  cd | 28 | 28
  dj | 28 | 28
--- 1490,1497 ----
  w0 | 29 | 29
  y9 | 29 | 29
  zm | 29 | 29
  zy | 29 | 29
+ zs | 29 | 29
  ax | 28 | 28
  cd | 28 | 28
  dj | 28 | 28
==============

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] float4/float8/int64 passed by value with tsearch fixup
Tom Lane wrote:
> Zoltan Boszormenyi [EMAIL PROTECTED] writes:
>> Both --enable-float4-byval and --disable-float4-byval produced only this regression, but it seems to be a tuple order difference.
> That looks suspiciously locale-ish; what locale are you running PG in?
>
> regards, tom lane

hu_HU.UTF-8

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] float4/float8/int64 passed by value with tsearch fixup
Tom Lane wrote:
> Zoltan Boszormenyi [EMAIL PROTECTED] writes:
>>> That looks suspiciously locale-ish; what locale are you running PG in?
>> hu_HU.UTF-8
> Ah, and I'll bet zs sorts after zy in hu_HU.

Yes, "zs" is a double letter that sorts after "z" in general in hu_HU.

> The query is already doing an ORDER BY, so it's not at fault. I think the only thing we could do about this is add a variant expected file with the hu_HU sort ordering. I'd be happy to do that if it were affecting the main regression tests, but I'm not sure it's worth it for contrib/tsearch2 ... thoughts?
>
> regards, tom lane

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
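Why the "zs" rows move relative to "zy" can be shown with a toy collation key; this is an illustration only (treating the Hungarian digraph "zs" as one letter ranked just after "z"), not real hu_HU collation, for which one would use locale.strxfrm with an hu_HU locale installed:

```python
def hu_key(word):
    # Toy sort key: the digraph "zs" collapses into a single collation
    # unit that ranks just after a plain "z". This is enough to show
    # why "zs" sorts after "zy" under hu_HU but before it in byte order.
    key = []
    i = 0
    while i < len(word):
        if word[i:i + 2] == "zs":
            key.append(ord("z") + 0.5)   # digraph: just after plain 'z'
            i += 2
        else:
            key.append(float(ord(word[i])))
            i += 1
    return key
```

Comparing "zs" and "zy": the digraph key (z + 0.5) beats the plain "z" that starts "zy", so "zy" collates first, exactly the reordering seen in the regression diff.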
Re: [PATCHES] [HACKERS] TRUNCATE TABLE with IDENTITY
Zoltan Boszormenyi wrote:
> Decibel! wrote:
>> On Apr 3, 2008, at 12:52 AM, Zoltan Boszormenyi wrote:
>>> Where is the info in the sequence to provide restarting with the _original_ start value?
>> There isn't any. If you want the sequence to start at some magic value, adjust the minimum value.
> There's the START WITH option for IDENTITY columns, and this below is paragraph 8 under General Rules of 14.10 "truncate table statement" in 6WD2_02_Foundation_2007-12.pdf (page 902):
>
> 8) If RESTART IDENTITY is specified and the table descriptor of T includes a column descriptor IDCD of an identity column, then:
> a) Let CN be the column name included in IDCD and let SV be the start value included in IDCD.
> b) The following alter table statement is effectively executed without further Access Rule checking:
>    ALTER TABLE TN ALTER COLUMN CN RESTART WITH SV
>
> This says that the original start value is used, not the minimum value. IDENTITY has the same options as CREATE SEQUENCE. In fact, the identity column specification links to 11.63 "sequence generator definition" when it comes to IDENTITY sequence options. And surprise, surprise: 11.64 "alter sequence generator statement" now defines ALTER SEQUENCE sn RESTART [WITH newvalue], where omitting the WITH newvalue part also uses the original start value.
>
> Best regards, Zoltán Böszörményi

The attached patch implements the extension found in the current SQL200n draft, implementing a stored start value and supporting ALTER SEQUENCE seq RESTART; Some error checks are also added to prohibit CREATE SEQUENCE ... RESTART ... and ALTER SEQUENCE ... START ...
Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH http://www.postgresql.at/ diff -dcrpN pgsql.orig/src/backend/commands/sequence.c pgsql/src/backend/commands/sequence.c *** pgsql.orig/src/backend/commands/sequence.c 2008-01-01 20:45:49.0 +0100 --- pgsql/src/backend/commands/sequence.c 2008-04-08 10:51:27.0 +0200 *** static Relation open_share_lock(SeqTable *** 88,94 static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel); static Form_pg_sequence read_info(SeqTable elm, Relation rel, Buffer *buf); static void init_params(List *options, bool isInit, ! Form_pg_sequence new, List **owned_by); static void do_setval(Oid relid, int64 next, bool iscalled); static void process_owned_by(Relation seqrel, List *owned_by); --- 88,94 static void init_sequence(Oid relid, SeqTable *p_elm, Relation *p_rel); static Form_pg_sequence read_info(SeqTable elm, Relation rel, Buffer *buf); static void init_params(List *options, bool isInit, ! Form_pg_sequence new, Form_pg_sequence old, List **owned_by); static void do_setval(Oid relid, int64 next, bool iscalled); static void process_owned_by(Relation seqrel, List *owned_by); *** DefineSequence(CreateSeqStmt *seq) *** 116,122 NameData name; /* Check and set all option values */ ! init_params(seq-options, true, new, owned_by); /* * Create relation (and fill *null *value) --- 116,122 NameData name; /* Check and set all option values */ ! 
init_params(seq-options, true, new, NULL, owned_by); /* * Create relation (and fill *null *value) *** DefineSequence(CreateSeqStmt *seq) *** 143,148 --- 143,153 namestrcpy(name, seq-sequence-relname); value[i - 1] = NameGetDatum(name); break; + case SEQ_COL_STARTVAL: + coldef-typename = makeTypeNameFromOid(INT8OID, -1); + coldef-colname = start_value; + value[i - 1] = Int64GetDatumFast(new.start_value); + break; case SEQ_COL_LASTVAL: coldef-typename = makeTypeNameFromOid(INT8OID, -1); coldef-colname = last_value; *** AlterSequence(AlterSeqStmt *stmt) *** 336,342 memcpy(new, seq, sizeof(FormData_pg_sequence)); /* Check and set new values */ ! init_params(stmt-options, false, new, owned_by); /* Clear local cache so that we don't think we have cached numbers */ /* Note that we do not change the currval() state */ --- 341,347 memcpy(new, seq, sizeof(FormData_pg_sequence)); /* Check and set new values */ ! init_params(stmt-options, false, new, seq, owned_by); /* Clear local cache so that we don't think we have cached numbers */ /* Note that we do not change the currval() state */ *** read_info(SeqTable elm, Relation rel, Bu *** 967,973 */ static void init_params(List *options, bool isInit, ! Form_pg_sequence new, List **owned_by) { DefElem*last_value = NULL; DefElem*increment_by = NULL; --- 972,978 */ static void init_params(List *options, bool isInit, ! Form_pg_sequence new, Form_pg_sequence old, List **owned_by) { DefElem*last_value = NULL; DefElem*increment_by = NULL; *** init_params(List *options, bool isInit, *** 995,1003
[PATCHES] float4/float8/int64 passed by value with tsearch fixup
Hi,

I tried to split the previous patch up to see where the tsearch regression comes from. It turned out that:

- the float4 conversion is risk-free (patch #1)
- the float8 conversion is okay, too, if coupled with the time[stamp[tz]] conversion (patch #2)
- with int64 timestamps enabled, the int64 conversion (patch #3) is also needed; it is mostly okay, but it is the one causing the tsearch regression

I looked at the tsearch code and found only one thing that could be suspicious, namely:

typedef uint64 TSQuerySign;

TSQuerySign is handled almost everywhere as an allocated, passed-by-reference value. I converted it with the attached patch (#4) so it uses the Int64GetDatum() and DatumGetInt64() functions internally, and the regression went away.

Please consider applying all four patches.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

01-pg84-passedbyval-float4.patch.gz
Description: Unix tar archive
02-pg84-passedbyval-float8.patch.gz
Description: Unix tar archive
03-pg84-passedbyval-int64.patch.gz
Description: Unix tar archive
04-pg84-passedbyval-tsearch.patch.gz
Description: Unix tar archive
Re: [PATCHES] Re: int8/float8/time/timestamp[tz]/float4 passed by value, was Re: Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Gregory Stark wrote: 1) Please don't include configure in your patch. I don't know why it's checked into CVS, but it is, so that means manually removing it from any patch. It's usually a huge portion of the diff, so it's worth removing. Noted. Just for the record: the reason configure is in CVS is to avoid requiring users of CVS checkouts to have autoconf installed. Perhaps we should rethink that, but in any case there's no point in submitting manual diffs to configure (or any other generated file). Best practice is to just remind the committer that the generated file needs to be regenerated. 3) You could get rid of a bunch of #ifndef HAVE_LONG_INT_64 snippets by having a #define like INT64PASSBYVALUE which is defined to be either true or false. OK, this would also make the patch smaller. Is pg_config_manual.h good for this setting? I'd go for having a #define like that, but what is the reason to set it in pg_config_manual.h? Seems like the configure script should set it. Obviously. :-) Thanks. And: Magnus Hagander wrote: Changes to genbki.sh also have to be mirrored in the msvc build scripts (Genbki.pm) in most cases... //Magnus Thanks for the info, I modified this file as well. Please review the change; I am not a Perl expert and I don't have a Windows build environment. New patch is attached. -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ pg84-passedbyval-v4.patch.gz Description: Unix tar archive -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Re: int8/float8/time/timestamp[tz]/float4 passed by value, was Re: Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Alvaro Herrera wrote: I don't think my $int64passbyval = (?($real64 = 1)t|f); works. Perhaps my $int64passbyval = $real64 ? 't' : 'f'; Thanks. Modified patch attached. A stupid question follows. Now that float4 is passed by value unconditionally, is it worth modifying the *penalty() functions in GiST/tsearch to just use PG_RETURN_FLOAT4()? Or is the implicit contract (modify the float4 value at the caller's site and return the same pointer received as the third parameter) an internal API set in stone? Modifying them to have only two parameters (the third one was an implicit OUT parameter anyway) and omitting the pointer dereference might give a small speedup. -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ pg84-passedbyval-v5.patch.gz Description: Unix tar archive -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
int8/float8/time/timestamp[tz]/float4 passed by value, was Re: [PATCHES] Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Hi, I got around to (almost) finalizing the patch. What it does:
- fixes the configure define described in my previous mail
- when HAVE_LONG_INT_64 == 1, the following types are passed by value: int8, float8, time, timestamp, timestamptz. The time[stamp[tz]] types caused segfaults in the regression tests if their attbyval setting was different from int8/float8, so this is really needed. I am not sure whether another type needs a similar switch; the type regression tests are running fine.
- in the HAVE_LONG_INT_64 == 1 case, Int64GetDatum() becomes a casting macro instead of a function, and some others become functions. The implementations of the DatumGetFloat4()/Float4GetDatum()/etc. functions may change into inlines or change internals.
- float4 is now unconditionally passed by value
- int8inc(), int2_sum() and int4_sum() used pointers taken directly from the Datums for performance; that code path is now commented out. The other code path is correct both in the AggState and !AggState runs, and because of the pass-by-value nature of int8, the !AggState version is no slower than using the pointer directly.
- removed the deprecated DatumGetFloat32/Float32GetDatum/etc. macros; they aren't used anywhere in the core and contrib sources.
There is only one regression, in the tsearch tests. Some selects in tsearch now return no records, but they don't segfault. I couldn't hunt the bug down yet; I'm not sure I will have time for that in the near future. Comments welcome. Best regards, Zoltán Böszörményi Zoltan Boszormenyi wrote: Hi, I am working on this TODO item: * Consider allowing 64-bit integers and floats to be passed by value on 64-bit platforms. Also change 32-bit floats (float4) to be passed by value at the same time. For genbki.sh to correctly determine whether HAVE_LONG_INT_64 is defined, the attached bugfix is needed in the configure.in machinery. This way pg_config.h actually conforms to the comments for HAVE_LONG_INT_64 and HAVE_LONG_LONG_INT_64.
Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH http://www.postgresql.at/ pg84-passedbyval.patch.gz Description: Unix tar archive - Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] int8/float8/time/timestamp[tz]/float4 passed by value, was Re: Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Hi, Gregory Stark wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: - int8inc(), int2_sum() and int4_sum() used pointers directly from the Datums for performance; that code path is now commented out, the other code path is correct for the AggState and !AggState runs, and because of the pass-by-value nature of int8, the !AggState version is not slower than using the pointer directly. Does this mean count() and sum() are slower on a 32-bit machine? If you mean slower than on a 64-bit machine, then yes. If you mean slower than before, then no. I didn't express myself correctly. The original code path is not commented out, it is just conditionally compiled. BTW, I found the tsearch bug: it was a missing conversion of float4 in gistproc.c. Unfortunately it didn't cause a segfault, which would have made it easier to find. Now there are no failing regression tests. -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ pg84-passedbyval-v2.patch.gz Description: Unix tar archive -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
[PATCHES] Re: int8/float8/time/timestamp[tz]/float4 passed by value, was Re: Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Gregory Stark wrote: Ok, ignore my previous message. I've read the patch now and that's not an issue. The old code path is not commented out, it's #ifdef'd conditionally on HAVE_LONG_INT_64, which is right (well, it seems right; it's a bit hard to tell in patch form). A few comments: 1) Please don't include configure in your patch. I don't know why it's checked into CVS, but it is, so that means manually removing it from any patch. It's usually a huge portion of the diff, so it's worth removing. Noted. 2) The genbki.sh change could be a bit tricky for multi-platform builds (i.e. OSX). I don't really see an alternative, so it's just something to note for the folks setting that up (Hi Dave). Actually there is an alternative, but I prefer the approach you've taken. The alternative would be to have a special value in the catalogs for 8-byte maybe-pass-by-value data types and handle the check at run time. Another alternative would be to have initdb fix up these values in C code instead of fixing them directly in the bki scripts. That seems like more hassle than it's worth, though, and a bigger break with the rest. 3) You could get rid of a bunch of #ifndef HAVE_LONG_INT_64 snippets by having a #define like INT64PASSBYVALUE which is defined to be either true or false. It might start getting confusing having three different defines for the same thing, though. But personally I hate having more #ifdefs than necessary; it makes the code hard to read. OK, this would also make the patch smaller. Is pg_config_manual.h good for this setting? Or which header would you suggest? 4) Your problems with tsearch and timestamp etc. raise an interesting problem. We don't need to mark this in pg_control because it's purely a run-time issue and doesn't affect on-disk storage. However, it does affect ABI compatibility with modules. Perhaps it should be added to PG_MODULE_MAGIC_DATA. I am looking into it. Actually, why isn't sizeof(Datum) in there already?
Do we have any protection against loading 64-bit compiled modules in a 32-bit server or vice versa? You can't mix 32-bit executables with 64-bit shared libraries, well, not without tricks. I don't see any problem here. But generally this is something I've been wanting to do for a while, and basically the same approach I would have taken. It seems sound to me. Thanks for the comments and encouragement. -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Re: int8/float8/time/timestamp[tz]/float4 passed by value, was Re: Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Zoltan Boszormenyi wrote: Gregory Stark wrote: 4) Your problems with tsearch and timestamp etc. raise an interesting problem. We don't need to mark this in pg_control because it's purely a run-time issue and doesn't affect on-disk storage. However, it does affect ABI compatibility with modules. Perhaps it should be added to PG_MODULE_MAGIC_DATA. I am looking into it. Do you think it's actually needed? PG_MODULE_MAGIC_DATA contains the server version the external module was compiled for. This patch won't be backported to older versions, so it's already protected from the unconditional float4 change. And because you can't mix a server and libraries with different bit sizes, it's protected from the conditional int64, float8, etc. changes. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] int8/float8/time/timestamp[tz]/float4 passed by value, was Re: Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Zoltan Boszormenyi wrote: BTW, I found the tsearch bug: it was a missing conversion of float4 in gistproc.c. Unfortunately it didn't cause a segfault, which would have made it easier to find. Now there are no failing regression tests. Disregard this patch; the bugfix for tsearch is not real. -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Re: int8/float8/time/timestamp[tz]/float4 passed by value, was Re: Fix HAVE_LONG[_LONG]_INT_64 to really define to 1
Gregory Stark wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Zoltan Boszormenyi wrote: Gregory Stark wrote: 4) Your problems with tsearch and timestamp etc. raise an interesting problem. We don't need to mark this in pg_control because it's purely a run-time issue and doesn't affect on-disk storage. However, it does affect ABI compatibility with modules. Perhaps it should be added to PG_MODULE_MAGIC_DATA. I am looking into it. Do you think it's actually needed? PG_MODULE_MAGIC_DATA contains the server version the external module was compiled for. This patch won't be backported to older versions, so it's already protected from the unconditional float4 change. And because you can't mix a server and libraries with different bit sizes, it's protected from the conditional int64, float8, etc. changes. Right, it does seem like we're conservative about adding things to PG_MODULE_MAGIC. As long as this is hard-coded based on HAVE_LONG_INT_64, we don't strictly need it. And I don't see much reason to make this something the user can override. If there are modules which use the wrong macros and assume certain other data types are pass-by-reference when they're not, then they're going to fail regardless of which version of postgres they're compiled against anyway. So in response to your other query, I would say to _not_ use pg_config_manual.h, which is intended for variables the user can override. If you put anything there, then we would have to worry about incompatibilities. OK, third version: the #define INT64PASSBYVAL is now used for less #ifdef'd code, and I used postgres.h for the defines for now. As for the tsearch problem, it seems that both tsearch and GiST in general use the internal type to pass pointers to different data types around and modify them in place, float4 among them. So, somewhere tsearch or GiST uses hardcoded pass-by-reference handling where the value is really a float4, not internal. Someone with more knowledge about tsearch could look into this...
-- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH http://www.postgresql.at/ pg84-passedbyval-v3.patch.gz Description: Unix tar archive -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
[PATCHES] 64-bit CommandIds
Hi, attached is our patch against HEAD which enables extending CommandIds to 64 bits. This is for enabling long transactions that really do that much non-read-only work in one transaction. The feature is off by default; you need --enable-huge-commandid. It fails only one regression test (without_oid), which measures the space saved in 8.3. Also, modifying FirstCommandId to be (1ULL << 32) - 4, to make the counter overflow the 32-bit limit early, doesn't show any real problem besides the combocid regression failure that explicitly lists cmin/cmax values, which is expected. It was written by Zoltán Böszörményi [EMAIL PROTECTED] and Hans-Jürgen Schönig [EMAIL PROTECTED]. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/

diff -dcrpN pgsql.orig/configure pgsql-cid64/configure
*** pgsql.orig/configure	2008-03-02 13:44:42.0 +0100
--- pgsql-cid64/configure	2008-03-04 16:53:46.0 +0100
*** if test -n "$ac_init_help"; then
*** 1349,1354 ****
--- 1349,1355 ----
  Optional Features:
    --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
    --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
+   --enable-huge-commandid enable 64-bit CommandId support
    --enable-integer-datetimes
                            enable 64-bit integer date/time support
    --enable-nls[=LANGUAGES]
                            enable Native Language Support
    --disable-shared        do not build shared libraries
*** fi
*** 2175,2180 ****
--- 2176,2219 ----
  
  #
+ # 64-bit CommandId
+ #
+ echo "$as_me:$LINENO: checking whether to build with 64-bit CommandId support" >&5
+ echo $ECHO_N "checking whether to build with 64-bit CommandId support... $ECHO_C" >&6
+ 
+ pgac_args="$pgac_args enable_huge_commandid"
+ 
+ # Check whether --enable-huge-commandid or --disable-huge-commandid was given.
+ if test "${enable_huge_commandid+set}" = set; then
+   enableval="$enable_huge_commandid"
+ 
+   case $enableval in
+     yes)
+ 
+ cat >>confdefs.h <<\_ACEOF
+ #define USE_64BIT_COMMANDID 1
+ _ACEOF
+ 
+       ;;
+     no)
+       :
+       ;;
+     *)
+       { { echo "$as_me:$LINENO: error: no argument expected for --enable-huge-commandid option" >&5
+ echo "$as_me: error: no argument expected for --enable-huge-commandid option" >&2;}
+    { (exit 1); exit 1; }; }
+       ;;
+   esac
+ 
+ else
+   enable_huge_commandid=no
+ 
+ fi;
+ 
+ echo "$as_me:$LINENO: result: $enable_huge_commandid" >&5
+ echo "${ECHO_T}$enable_huge_commandid" >&6
+ 
+ 
+ #
  # 64-bit integer date/time storage (--enable-integer-datetimes)
  #
  { echo "$as_me:$LINENO: checking whether to build with 64-bit integer date/time support" >&5
diff -dcrpN pgsql.orig/configure.in pgsql-cid64/configure.in
*** pgsql.orig/configure.in	2008-03-02 13:44:43.0 +0100
--- pgsql-cid64/configure.in	2008-03-04 16:53:46.0 +0100
*** PGAC_ARG_REQ(with, libs,      [  --with-
*** 128,133 ****
--- 128,142 ----
  
  #
+ # 64-bit CommandId
+ #
+ AC_MSG_CHECKING([whether to build with 64-bit CommandId support])
+ PGAC_ARG_BOOL(enable, huge-commandid, no, [  --enable-huge-commandid enable 64-bit CommandId support],
+               [AC_DEFINE([USE_64BIT_COMMANDID], 1,
+                          [Define to 1 if you want 64-bit CommandId support. (--enable-huge-commandid)])])
+ AC_MSG_RESULT([$enable_huge_commandid])
+ 
+ 
+ #
  # 64-bit integer date/time storage (--enable-integer-datetimes)
  #
  AC_MSG_CHECKING([whether to build with 64-bit integer date/time support])
diff -dcrpN pgsql.orig/doc/src/sgml/installation.sgml pgsql-cid64/doc/src/sgml/installation.sgml
*** pgsql.orig/doc/src/sgml/installation.sgml	2008-02-18 13:49:58.0 +0100
--- pgsql-cid64/doc/src/sgml/installation.sgml	2008-03-04 17:16:14.0 +0100
*** su - postgres
*** 1011,1016 ****
--- 1011,1027 ----
       </varlistentry>
  
       <varlistentry>
+       <term><option>--enable-huge-commandid</option></term>
+       <listitem>
+        <para>
+         Use 64-bit CommandIds if you are planning to run transactions
+         consisting of more than 4 billion commands.  This is off by default
+         to save disk space.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
+      <varlistentry>
        <term><option>--enable-integer-datetimes</option></term>
        <listitem>
         <para>
diff -dcrpN pgsql.orig/src/backend/access/transam/xact.c pgsql-cid64/src/backend/access/transam/xact.c
*** pgsql.orig/src/backend/access/transam/xact.c	2008-01-15 19:56:59.0 +0100
--- pgsql-cid64/src/backend/access/transam/xact.c	2008-03-04 16:57:54.0 +0100
*** CommandCounterIncrement(void)
*** 592,598 ****
--- 592,602 ----
  		currentCommandId -= 1;
  		ereport(ERROR,
  				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ #ifdef USE_64BIT_COMMANDID
+ 		 errmsg("cannot have more than 2^64-1 commands in a transaction")));
+ #else
  		 errmsg("cannot have more than 2^32-1 commands in a transaction")));
+ #endif
  	}
  
  	currentCommandIdUsed = false;
diff -dcrpN
Re: [PATCHES] 64-bit CommandIds
Alvaro Herrera wrote: Zoltan Boszormenyi wrote: attached is our patch against HEAD which enables extending CommandIds to 64 bits. This is for enabling long transactions that really do that much non-read-only work in one transaction. I think you should add a pg_control field and a corresponding check, to avoid a 64-bit-CommandId postmaster starting on a 32-bit-CommandId data area and vice versa. You're right, thanks. -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-patches
Re: [PATCHES] 64-bit CommandIds
Alvaro Herrera wrote: Zoltan Boszormenyi wrote: attached is our patch against HEAD which enables extending CommandIds to 64 bits. This is for enabling long transactions that really do that much non-read-only work in one transaction. I think you should add a pg_control field and a corresponding check, to avoid a 64-bit-CommandId postmaster starting on a 32-bit-CommandId data area and vice versa. I added the check, but I needed to add it BEFORE the check for toast_max_chunk_size; otherwise it complained about that more cryptic problem. I think it's cleaner to report this failure than to leave the user wondering why toast_max_chunk_size != TOAST_MAX_CHUNK_SIZE. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/
Re: [PATCHES] WIP: plpgsql source code obfuscation
Hi, Pavel Stehule wrote: On 29/01/2008, Peter Eisentraut [EMAIL PROTECTED] wrote: On Monday, 28 January 2008, Pavel Stehule wrote: this patch defines a new function flag - OBFUSCATE. With this flag, the encrypted source code is stored in the probin column. The password is stored in a GUC_SUPERUSER_ONLY item - this is similar security to what SQL Server provides (where privileged users can access system tables with source code or can use a debugger). Have you thought about a solution that applies the regular access privileges to pg_proc in order to hide some content from less privileged users? It's a second way, and maybe better. It can close the way to table definitions too (and that request is adequate too). But you cannot hide a complete column; visibility depends on content, and it can be slow and complex :(. Encryption and decryption aren't fast either. Pavel We made a similar encrypted plpgsql for a customer. It was a fork of plpgsql from 8.2.x and uses pgcrypto internally. Functions are cached by the backend the same way as regular plpgsql functions, hence fast. The hash key of the cached function is the hash of the already encrypted function, so it doesn't need to be decrypted every time it's looked up. Only the first run of a function is slower, where it needs to be decrypted for compilation. The pgcrypto dependency can be lifted, and similar Obfuscate() and Deobfuscate() functions can be used as in the WIP patch posted here. The encrypted body is stored inside prosrc in our solution and can be dumped/restored just fine. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PATCHES] Problem with pg_dump -n schemaname
Hi, we came across a problem when you want to dump only one schema. The ASCII output, when loaded with psql into an empty database, doesn't produce a schema identical to the original. The problem comes from this statement ordering:

SET ...  -- some initial DB parameters
...
SET search_path = schemaname, pg_catalog;
-- the above fails because no schema with this name exists;
-- as a consequence, the original search_path (e.g. "$user", public)
-- is not modified
DROP INDEX schemaname.index1;
...
DROP TABLE schemaname.table1;
DROP SCHEMA schemaname;
CREATE SCHEMA schemaname;
ALTER SCHEMA schemaname OWNER TO schemaowner;
CREATE TABLE table1;  -- note that it was DROPped with the full name schemaname.table1
...

So, because search_path is '"$user", public' for e.g. postgres, the tables are created in the public schema. Hence, I propose the attached patch, which issues the SET search_path = ... statements before the first CREATE TABLE statement in the respective schema instead of before the first DROP command. The problem manifests only when you dump a single schema. The same problem exists in at least 8.0.3, 8.2.5 and the latest 8.3cvs. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/

--- postgresql-8.2.5.orig/src/bin/pg_dump/pg_backup_archiver.c	2007-08-06 03:38:24.0 +0200
+++ postgresql-8.2.5/src/bin/pg_dump/pg_backup_archiver.c	2007-11-16 11:00:46.0 +0100
@@ -241,9 +241,6 @@
 			{
 				/* We want the schema */
 				ahlog(AH, 1, "dropping %s %s\n", te->desc, te->tag);
-				/* Select owner and schema as necessary */
-				_becomeOwner(AH, te);
-				_selectOutputSchema(AH, te->namespace);
 				/* Drop it */
 				ahprintf(AH, "%s", te->dropStmt);
 			}
@@ -275,6 +272,10 @@
 		{
 			ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);
 
+			/* Select owner and schema as necessary */
+			_becomeOwner(AH, te);
+			_selectOutputSchema(AH, te->namespace);
+
 			_printTocEntry(AH, te, ropt, false, false);
 			defnDumped = true;

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] HOT version 18
Hi, Pavan Deolasee wrote: On 9/18/07, Jaime Casanova [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: this SQL script makes current CVS + patch crash with this message in the logs: Can you please check if the attached patch fixes the issue for you? It sets t_tableOid before returning a HOT tuple to the caller. Thanks, Pavan I can confirm that the script crashed HOT v18 and that your patch fixes it. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Geschwinde & Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Standard compliant DEFAULT clause
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Or not, it's just a bitter and late (because of my bitterness) response to the rejection of my IDENTITY/GENERATED patches. Where's the much praised standard behaviour on standard syntax? So much for hypocrisy. Hm? There's a difference between extensions and failing to comply with what the spec says is the behavior of the syntax it provides. OK, that's where POVs and interpretations may differ. :-) The standard says one thing (allow these and only these kinds of expressions), which is a description of a behaviour, or can be interpreted as one. Now, if you allow others as well, is it an extension or a failure to comply? :-) I have another question. How many features does PostgreSQL have that copy another DBMS's behaviour (say, for easier porting) and as such differ slightly from the standard? Someone quoted DB2 during the early life of my patch, and it seems to me after reading DB2's online docs that GENERATED BY DEFAULT AS IDENTITY there behaves the way SERIAL behaves in PostgreSQL, and the standard draft's text can be interpreted that way as well. I feel bad that you put so much work into what now seems to be a dead end (at least until we can get some clarification about what the committee really intends). But look at the bright side: you learned quite a bit about the innards of Postgres. Hopefully your next project will be more successful. regards, tom lane Thanks for the encouragement. I just needed to blow off some steam somehow. Maybe the word hypocrisy was too harsh, sorry for that. I will shut up now. -- -- Zoltán Böszörményi Cybertec Geschwinde & Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PATCHES] Standard compliant DEFAULT clause
Hi, here's a fix for a _very_ longstanding bug in PostgreSQL. According to SQL:2003, DEFAULT may only contain certain functional expressions and constant literals. Please note the year of the standard. Or, I know a better one: PostgreSQL is not even SQL92 compliant in this regard, after 14 years! http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

Please review and apply immediately. Or not; it's just a bitter and late (because of my bitterness) response to the rejection of my IDENTITY/GENERATED patches. Where's the much-praised standard behaviour on standard syntax? So much for hypocrisy.

--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/

--- pgsql.orig/src/backend/catalog/heap.c	2007-05-15 09:34:25.0 +0200
+++ pgsql/src/backend/catalog/heap.c	2007-05-18 21:33:04.0 +0200
@@ -1935,6 +1935,43 @@
 				 errmsg("cannot use column references in default expression")));

 	/*
+	 * Make sure the default expr may contain only
+	 * standard compliant functions as of SQL:2003:
+	 * - CURRENT_DATE
+	 * - CURRENT_TIME [ ( precision ) ]
+	 * - CURRENT_TIMESTAMP [ ( precision ) ]
+	 * - LOCALTIME [ ( precision ) ]
+	 * - LOCALTIMESTAMP [ ( precision ) ]
+	 *   (as a PostgreSQL extension, all others that call now()
+	 *   implicitly or explicitly)
+	 * - USER
+	 * - CURRENT_USER
+	 * - CURRENT_ROLE
+	 * - SESSION_USER
+	 * with two other PostgreSQL extensions:
+	 * - nextval() so SERIALs work
+	 * - any immutable functions to pave the way for GENERATED columns
+	 * Please note that PostgreSQL lacks SYSTEM_USER and CURRENT_PATH.
+	 */
+	if (is_opclause(expr))
+	{
+		OpExpr	   *clause = (OpExpr *) expr;
+
+		switch (clause->opfuncid)
+		{
+			case 745:	/* current_user */
+			case 746:	/* session_user */
+			case 1299:	/* now() */
+			case 1574:	/* nextval() */
+				break;
+			default:
+				if (contain_mutable_functions(expr))
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+							 errmsg("cannot use non-IMMUTABLE functions in default expression")));
+		}
+	}
+
+	/*
 	 * It can't return a set either.
 	 */
 	if (expression_returns_set(expr))

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
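To make the intent of the check concrete, here is an illustrative SQL sketch (table and sequence names are hypothetical, not from the patch): defaults built from constant literals, the SQL:2003 datetime/user functions, nextval(), or immutable functions pass, while a volatile function such as random() is what the new error message targets.

```sql
-- Accepted: literals, SQL:2003 functions, and the two extensions
-- (nextval() and immutable functions) the patch allows.
CREATE SEQUENCE orders_id_seq;
CREATE TABLE orders (
    id     integer   DEFAULT nextval('orders_id_seq'),
    placed timestamp DEFAULT CURRENT_TIMESTAMP,
    who    text      DEFAULT SESSION_USER,
    qty    integer   DEFAULT 1
);

-- Targeted by the new check: random() is volatile (non-IMMUTABLE),
-- so under the proposed rule it would be rejected with
-- "cannot use non-IMMUTABLE functions in default expression".
CREATE TABLE t (x float8 DEFAULT random());
```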
[PATCHES] New version of GENERATED/IDENTITY, was Re: parser dilemma
Hi, here's the patch with the modifications suggested by Tom Lane. The postfix rule was deleted from b_expr and the reverse parsing in ruleutils.c::get_oper_expr() always puts parentheses around postfix operators. Other changes:
- OVERRIDING SYSTEM VALUE in COPY can appear at any place in the option list.
- pg_dump was modified accordingly.
- The \copy built-in in psql now also accepts OVERRIDING SYSTEM VALUE.
- Documentation and testcase updates.
Please review. Best regards, Zoltán Böszörményi
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] New version of GENERATED/IDENTITY, was Re: parser dilemma
And here it is attached. Sorry.

Zoltan Boszormenyi wrote: Hi, here's the patch with the modifications suggested by Tom Lane. The postfix rule was deleted from b_expr and the reverse parsing in ruleutils.c::get_oper_expr() always puts parentheses around postfix operators. Other changes:
- OVERRIDING SYSTEM VALUE in COPY can appear at any place in the option list.
- pg_dump was modified accordingly.
- The \copy built-in in psql now also accepts OVERRIDING SYSTEM VALUE.
- Documentation and testcase updates.
Please review. Best regards, Zoltán Böszörményi
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/

psql-serial-43.diff.gz
Description: Unix tar archive
Re: [PATCHES] New version of GENERATED/IDENTITY, was Re: parser dilemma
Hi, some last changes. Really. :-)

I made ALTER TABLE symmetric with CREATE TABLE, so the grammar now has:

ALTER TABLE tabname ALTER colname SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence options ) ]

This works intuitively the same as in CREATE TABLE, i.e. it:
- creates an OWNED sequence (if the column doesn't already have one),
- creates or alters the sequence with the given options,
- adds the DEFAULT expression with the proper generation behaviour,
all in one go. I extended the documentation and modified the test case accordingly. I also tested that an IDENTITY column can't be created with a type that cannot be cast from bigint, e.g. box. I added that to the test case. Please review. Best regards, Zoltán Böszörményi

Zoltan Boszormenyi wrote: And here it is attached. Sorry.

Zoltan Boszormenyi wrote: Hi, here's the patch with the modifications suggested by Tom Lane. The postfix rule was deleted from b_expr and the reverse parsing in ruleutils.c::get_oper_expr() always puts parentheses around postfix operators. Other changes:
- OVERRIDING SYSTEM VALUE in COPY can appear at any place in the option list.
- pg_dump was modified accordingly.
- The \copy built-in in psql now also accepts OVERRIDING SYSTEM VALUE.
- Documentation and testcase updates.
Please review. Best regards, Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/

psql-serial-44.diff.gz
Description: Unix tar archive
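As a hedged illustration of the symmetry described above (table name, sequence options, and values are made up; syntax as proposed in the patch, following SQL:2003):

```sql
-- CREATE TABLE form: makes an owned sequence and installs the
-- generated DEFAULT in one step.
CREATE TABLE invoices (
    id  bigint GENERATED ALWAYS AS IDENTITY (START WITH 1000),
    amt numeric
);

-- The new ALTER TABLE form is meant to work the same way on an
-- existing column: create or alter the owned sequence with the given
-- options and set the proper generation behaviour in one go.
ALTER TABLE invoices
    ALTER id SET GENERATED BY DEFAULT AS IDENTITY (INCREMENT BY 10);
```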
Re: [PATCHES] [HACKERS] parser dilemma
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: So I think attaching a precedence to the GENERATED keyword is dangerous. Especially when we have a good workaround which would just require use of () around certain postfix-operator expressions. Yeah, I'm leaning to the idea that removing postfix operators from b_expr is the least bad solution. One thing that would have to be looked at is the rules in ruleutils.c for suppressing unnecessary parentheses when reverse-listing parsetrees. It might be safest to just never suppress them around a postfix operator. regards, tom lane

You mean something like this?

***************
*** 4138,4157 ****
  	Oid		opno = expr->opno;
  	List   *args = expr->args;

- 	if (!PRETTY_PAREN(context))
- 		appendStringInfoChar(buf, '(');
  	if (list_length(args) == 2)
  	{
  		/* binary operator */
  		Node   *arg1 = (Node *) linitial(args);
  		Node   *arg2 = (Node *) lsecond(args);

  		get_rule_expr_paren(arg1, context, true, (Node *) expr);
  		appendStringInfo(buf, " %s ",
  						 generate_operator_name(opno,
  												exprType(arg1),
  												exprType(arg2)));
  		get_rule_expr_paren(arg2, context, true, (Node *) expr);
  	}
  	else
  	{
--- 4095,4118 ----
  	Oid		opno = expr->opno;
  	List   *args = expr->args;

  	if (list_length(args) == 2)
  	{
  		/* binary operator */
  		Node   *arg1 = (Node *) linitial(args);
  		Node   *arg2 = (Node *) lsecond(args);

+ 		if (!PRETTY_PAREN(context))
+ 			appendStringInfoChar(buf, '(');
+
  		get_rule_expr_paren(arg1, context, true, (Node *) expr);
  		appendStringInfo(buf, " %s ",
  						 generate_operator_name(opno,
  												exprType(arg1),
  												exprType(arg2)));
  		get_rule_expr_paren(arg2, context, true, (Node *) expr);
+
+ 		if (!PRETTY_PAREN(context))
+ 			appendStringInfoChar(buf, ')');
  	}
  	else
  	{
***************
*** 4169,4194 ****
  		switch (optup->oprkind)
  		{
  			case 'l':
  				appendStringInfo(buf, "%s ",
  								 generate_operator_name(opno,
  														InvalidOid,
  														exprType(arg)));
  				get_rule_expr_paren(arg, context, true, (Node *) expr);
  				break;
  			case 'r':
  				get_rule_expr_paren(arg, context, true, (Node *) expr);
  				appendStringInfo(buf, " %s",
  								 generate_operator_name(opno,
  														exprType(arg),
  														InvalidOid));
  				break;
  			default:
  				elog(ERROR, "bogus oprkind: %d", optup->oprkind);
  		}
  		ReleaseSysCache(tp);
  	}
- 	if (!PRETTY_PAREN(context))
- 		appendStringInfoChar(buf, ')');
  }
  /*
--- 4130,4159 ----
  		switch (optup->oprkind)
  		{
  			case 'l':
+ 				if (!PRETTY_PAREN(context))
+ 					appendStringInfoChar(buf, '(');
  				appendStringInfo(buf, "%s ",
  								 generate_operator_name(opno,
  														InvalidOid,
  														exprType(arg)));
  				get_rule_expr_paren(arg, context, true, (Node *) expr);
+ 				if (!PRETTY_PAREN(context))
+
Re: [PATCHES] [HACKERS] parser dilemma
Andrew Dunstan wrote: Zoltan Boszormenyi wrote: On the other hand, marking GENERATED as %right solves this issue. I hope it's an acceptable solution. If anything, I should have thought it would be marked %nonassoc. cheers andrew

That works, too. Best regards, Zoltán Böszörményi
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: [PATCHES] [HACKERS] parser dilemma
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Andrew Dunstan wrote: Zoltan Boszormenyi wrote: On the other hand, marking GENERATED as %right solves this issue. I hope it's an acceptable solution. If anything, I should have thought it would be marked %nonassoc. That works, too.

[ a bit alarmed... ] This is only going to be an acceptable solution if you can explain *exactly why* it works. The general story with associativity/precedence declarations is that you are making bison resolve ambiguous situations in particular ways. If you don't have a 100% clear understanding of what the ambiguity is and why this is the right way to resolve it, you are probably creating a bigger problem. regards, tom lane

As far as I remember from my math classes, associativity is the set of rules about the way brackets are allowed to be used. Say, multiplication is associative both ways, i.e.:

a * b * c == (a * b) * c == a * (b * c)

If it were only left-associative, the line below would be true:

a * b * c == (a * b) * c != a * (b * c)

Similarly, if it were only right-associative, this would be true:

a * b * c == a * (b * c) != (a * b) * c

Precedence is about the implicit bracketing between two operators, i.e.:

a * b + c * d == (a * b) + (c * d)

(Sorry for the poor explanation, my math classes weren't in English.) So, before the marking, bison was able to do this association:

colname coltype ( DEFAULT 5! GENERATED ) ALWAYS ...

After marking GENERATED as %right, it can only do this:

colname coltype DEFAULT 5! ( GENERATED ALWAYS ... )

With GENERATED marked %nonassoc, it can do neither, leaving the only option for associating DEFAULT as:

colname coltype (DEFAULT 5!) (GENERATED) ALWAYS ...

So, do any of these cause any problems?
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Zoltan Boszormenyi wrote: Thanks. This idea solved one of the two shift/reduce conflicts. But the other one can only be solved if I put GENERATED into the reserved_keyword set. But the standard spec says it's unreserved. Now what should I do with it? Yeah, I had a brief look. It's a bit ugly - the remaining conflict is in the b_expr rules. We do have the filtered_base_yylex() gadget - not sure if that can disambiguate for us.

The problem comes from cases like

colname coltype DEFAULT 5! GENERATED ...

Since b_expr allows postfix operators, it takes one more token of lookahead than we have to tell if the default expression is "5!" or "5!GENERATED". There are basically two ways to fix this:

1. Collapse GENERATED ALWAYS and GENERATED BY into single tokens using filtered_base_yylex.
2. Stop allowing postfix operators in b_expr.

I find #1 a bit icky --- not only does every case added to filtered_base_yylex slow down parsing a little more, but combined tokens create rough spots in the parser's behavior. As an example, both NULLS and FIRST are allegedly unreserved words, so this should work:

regression=# create table nulls (x int);
CREATE TABLE
regression=# select first.* from nulls first;
ERROR:  syntax error at or near "first"
LINE 1: select first.* from nulls first;
                                  ^
regression=#

#2 actually seems like a viable alternative: postfix operators aren't really in common use, and doing this would not only fix GENERATED but let us de-reserve a few keywords that are currently reserved. In a non-exhaustive check I found that COLLATE, DEFERRABLE, and INITIALLY could become unreserved_keyword if we take out this production:

*** 7429,7436 ****
  				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, $3, @2); }
  			| qual_Op b_expr					%prec Op
  				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $1, NULL, $2, @1); }
- 			| b_expr qual_Op					%prec POSTFIXOP
- 				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }
  			| b_expr IS DISTINCT FROM b_expr	%prec IS
  				{ $$ = (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $5, @2); }
--- 7550,7555 ----

(Hmm, actually I'm wondering why COLLATE is a keyword at all right now... but the other two trace directly to the what-comes-after-DEFAULT issue.) regards, tom lane

I looked at it a bit and tried to handle DEFAULT differently from other column constraints. Basically I did what the standard says:

column definition ::=
    column name [ data type or domain name ]
    [ default clause | identity column specification | generation clause ]
    [ column constraint definition... ]
    [ collate clause ]

So DEFAULT and GENERATED { BY DEFAULT | ALWAYS } AS { IDENTITY | GENERATED } are made mutually exclusive in the grammar, and these must come before any other column constraints. This has one possible problem and one fix. The problem is that the DEFAULT clause cannot be mixed with other constraints any more, hence breaking some regression tests and lazy deployment. BTW, the PostgreSQL documentation already says that DEFAULT must come after the type specification. The other is that specifying DEFAULT as a named constraint isn't allowed any more. See the confusion below: PostgreSQL happily accepts but forgets about the name I gave to the DEFAULT clause.

db=# create table aaa (id integer not null constraint my_default default 5, t text);
CREATE TABLE
db=# \d aaa
       Table "public.aaa"
 Column |  Type   |      Modifiers
--------+---------+--------------------
 id     | integer | not null default 5
 t      | text    |

db=# alter table aaa drop constraint my_default ;
ERROR:  constraint "my_default" does not exist

--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
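To restate the lookahead problem in concrete terms (a sketch, not taken from the patch; the factorial operator ! stands in for any postfix operator):

```sql
-- With postfix operators allowed in b_expr, one token of lookahead
-- cannot tell whether the default expression ends at "5 !" (postfix)
-- with a GENERATED clause following, or continues past the "!":
--   colname coltype DEFAULT 5 ! GENERATED ...

-- Dropping postfix operators from b_expr resolves this; a postfix
-- operator in a column default is then written with parentheses:
CREATE TABLE t (c integer DEFAULT (5 !));
```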
Re: [HACKERS] Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Zoltan Boszormenyi wrote: Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Zoltan Boszormenyi wrote: Thanks. This idea solved one of the two shift/reduce conflicts. But the other one can only be solved if I put GENERATED into the reserved_keyword set. But the standard spec says it's unreserved. Now what should I do with it? Yeah, I had a brief look. It's a bit ugly - the remaining conflict is in the b_expr rules. We do have the filtered_base_yylex() gadget - not sure if that can disambiguate for us.

The problem comes from cases like

colname coltype DEFAULT 5! GENERATED ...

Since b_expr allows postfix operators, it takes one more token of lookahead than we have to tell if the default expression is "5!" or "5!GENERATED". There are basically two ways to fix this:

1. Collapse GENERATED ALWAYS and GENERATED BY into single tokens using filtered_base_yylex.
2. Stop allowing postfix operators in b_expr.

I find #1 a bit icky --- not only does every case added to filtered_base_yylex slow down parsing a little more, but combined tokens create rough spots in the parser's behavior. As an example, both NULLS and FIRST are allegedly unreserved words, so this should work:

regression=# create table nulls (x int);
CREATE TABLE
regression=# select first.* from nulls first;
ERROR:  syntax error at or near "first"
LINE 1: select first.* from nulls first;
                                  ^
regression=#

#2 actually seems like a viable alternative: postfix operators aren't really in common use, and doing this would not only fix GENERATED but let us de-reserve a few keywords that are currently reserved. In a non-exhaustive check I found that COLLATE, DEFERRABLE, and INITIALLY could become unreserved_keyword if we take out this production:

*** 7429,7436 ****
  				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, $3, @2); }
  			| qual_Op b_expr					%prec Op
  				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $1, NULL, $2, @1); }
- 			| b_expr qual_Op					%prec POSTFIXOP
- 				{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }
  			| b_expr IS DISTINCT FROM b_expr	%prec IS
  				{ $$ = (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $5, @2); }
--- 7550,7555 ----

(Hmm, actually I'm wondering why COLLATE is a keyword at all right now... but the other two trace directly to the what-comes-after-DEFAULT issue.) regards, tom lane

I looked at it a bit and tried to handle DEFAULT differently from other column constraints. Basically I did what the standard says:

column definition ::=
    column name [ data type or domain name ]
    [ default clause | identity column specification | generation clause ]
    [ column constraint definition... ]
    [ collate clause ]

So DEFAULT and GENERATED { BY DEFAULT | ALWAYS } AS { IDENTITY | GENERATED } are made mutually exclusive in the grammar, and these must come before any other column constraints. This has one possible problem and one fix. The problem is that the DEFAULT clause cannot be mixed with other constraints any more, hence breaking some regression tests and lazy deployment. BTW, the PostgreSQL documentation already says that DEFAULT must come after the type specification. The other is that specifying DEFAULT as a named constraint isn't allowed any more. See the confusion below: PostgreSQL happily accepts but forgets about the name I gave to the DEFAULT clause.

db=# create table aaa (id integer not null constraint my_default default 5, t text);
CREATE TABLE
db=# \d aaa
       Table "public.aaa"
 Column |  Type   |      Modifiers
--------+---------+--------------------
 id     | integer | not null default 5
 t      | text    |

db=# alter table aaa drop constraint my_default ;
ERROR:  constraint "my_default" does not exist

Here's what it looks like now. Another side effect is that the check for conflicting DEFAULT clauses can now be deleted from analyze.c.
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/

psql-serial-41.diff.gz
Description: Unix tar archive
Re: [HACKERS] Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Zoltan Boszormenyi wrote: Andrew Dunstan wrote: Florian G. Pflug wrote:

bison -y -d gram.y
conflicts: 2 shift/reduce

It's been quite a while since I last used bison, but as far as I remember, you can tell it to write a rather detailed log about its analysis of the grammar. That log should include more detailed information about those conflicts - maybe that helps to figure out their exact cause, and to find a workaround.

You can almost always get rid of shift/reduce conflicts by unwinding some of the productions - resist the temptation to factor the grammar. The effect of this is to eliminate places where the parser has to decide between shifting and reducing. (This is why, for example, almost all the "drop foo if exists" variants require separate productions rather than using opt_if_exists.) cheers andrew

Thanks. This idea solved one of the two shift/reduce conflicts. But the other one can only be solved if I put GENERATED into the reserved_keyword set. But the standard spec says it's unreserved. Now what should I do with it?

What should I do? Send it. :-) Someone more familiar with bison can hopefully fix it... Please review. Best regards, Zoltán
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/

psql-serial-40.diff.gz
Description: Unix tar archive
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: So, I should allow DROP DEFAULT, implement SET DEFAULT GENERATED ALWAYS AS and modify the catalog so the GENERATED property is part of pg_attrdef. Sounds good.

Finally, here it is.

What about IDENTITY? Should it also be part of pg_attrdef? There are two ways to implement it: have or don't have a notion of it. The latter would treat GENERATED BY DEFAULT AS IDENTITY the same as SERIAL. Is there any good reason to distinguish the two?

Actually, I needed to have a flag for IDENTITY, but not for the reason above. I need it to distinguish between GENERATED ALWAYS AS IDENTITY and GENERATED ALWAYS AS ( expr ). Changes:
- Rewrote the GENERATED/IDENTITY flags to be part of the default in pg_attrdef. This made the patch MUCH smaller.
- SERIALs are now the same as INTEGER GENERATED BY DEFAULT AS IDENTITY.
- Allow DROP DEFAULT on GENERATED/IDENTITY columns.
- Implemented SET GENERATED ALWAYS AS.
- Modified the syntax of SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY so it reads as SET IDENTITY GENERATED { ALWAYS | BY DEFAULT }, so compiling gram.y/gram.c doesn't give me errors. This DDL statement isn't part of SQL:2003, so it might be accepted as a PostgreSQL extension.
- Modified the behaviour of SET IDENTITY to also restore the DEFAULT expression. Someone might have done a DROP DEFAULT before but kept the OWNED sequence.
- Fixed the behaviour of GENERATED columns regarding INSERT ... OVERRIDING SYSTEM VALUE, and only those GENERATED columns get UPDATEd that are either explicitly modified with SET column = DEFAULT or have one of their referenced columns modified.
- Testcase and documentation are modified to reflect the above.
Please review.
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/

psql-serial-39.diff.gz
Description: Unix tar archive
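A sketch of the OVERRIDING SYSTEM VALUE behaviour listed in the changes above (table name and values are illustrative; syntax per SQL:2003, not necessarily the patch's final form):

```sql
CREATE TABLE items (
    id   integer GENERATED ALWAYS AS IDENTITY,
    name text
);

-- An explicit value for a GENERATED ALWAYS identity column is meant
-- to be rejected in a plain INSERT...
INSERT INTO items (id, name) VALUES (42, 'widget');

-- ...and accepted only when the override is explicit:
INSERT INTO items (id, name) OVERRIDING SYSTEM VALUE VALUES (42, 'widget');
```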
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Zoltan Boszormenyi wrote: Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: So, I should allow DROP DEFAULT, implement SET DEFAULT GENERATED ALWAYS AS and modify the catalog so the GENERATED property is part of pg_attrdef. Sounds good.

Finally, here it is.

What about IDENTITY? Should it also be part of pg_attrdef? There are two ways to implement it: have or don't have a notion of it. The latter would treat GENERATED BY DEFAULT AS IDENTITY the same as SERIAL. Is there any good reason to distinguish the two?

Actually, I needed to have a flag for IDENTITY, but not for the reason above. I need it to distinguish between GENERATED ALWAYS AS IDENTITY and GENERATED ALWAYS AS ( expr ). Changes:
- Rewrote the GENERATED/IDENTITY flags to be part of the default in pg_attrdef. This made the patch MUCH smaller.
- SERIALs are now the same as INTEGER GENERATED BY DEFAULT AS IDENTITY.
- Allow DROP DEFAULT on GENERATED/IDENTITY columns.
- Implemented SET GENERATED ALWAYS AS.
- Modified the syntax of SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY so it reads as SET IDENTITY GENERATED { ALWAYS | BY DEFAULT }, so compiling gram.y/gram.c doesn't give me errors. This DDL statement isn't part of SQL:2003, so it might be accepted as a PostgreSQL extension.
- Modified the behaviour of SET IDENTITY to also restore the DEFAULT expression. Someone might have done a DROP DEFAULT before but kept the OWNED sequence.
- Fixed the behaviour of GENERATED columns regarding INSERT ... OVERRIDING SYSTEM VALUE, and only those GENERATED columns get UPDATEd that are either explicitly modified with SET column = DEFAULT or have one of their referenced columns modified.
- Testcase and documentation are modified to reflect the above.
- Also allowed UPDATE on IDENTITY columns.
Please review.
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: [ IDENTITY/GENERATED patch ] I got around to reviewing this today.

Thanks for the review. Sorry for the somewhat late reply, I was ill and then occupied with some other work.

- unique index checks are done in two steps to avoid inflating the sequence if a unique index check fails that doesn't reference the IDENTITY column

This is just not acceptable --- there is nothing in the standard that requires such behavior,

But there is also nothing that would say not to do it. :-) And this way, there would be nothing that separates IDENTITY from regular SERIALs except the slightly later value generation. The behaviour I proposed would be a big usability plus over the standard, with fewer skipped values possible.

and I dislike the wide-ranging kluges you introduced to support it.

Can you see any other way to avoid skipping sequence values as much as possible?

Please get rid of that and put the behavior back into ordinary DEFAULT-substitution where it belongs.

You mean put the IDENTITY generation into rewriteTargetList()? And what about the action-at-a-distance behaviour you praised so much before? (Which made the less-skipping behaviour implementable...) Anyway, I put it back. But it brought the consequence that GENERATED fields may reference IDENTITY columns, too, so I removed this limitation as well.

- to minimize the runtime impact of checking whether an index references the IDENTITY column and skipping it in the first step in ExecInsertIndexTuples(), I introduced a new attribute in the pg_index catalog

This is likewise unreasonably complex and fragile ... but it goes away anyway if you remove the above, no?

Yes.

The patch appears to believe that OVERRIDING SYSTEM VALUE should be restricted to the table owner, but I don't actually see any support for that in the SQL2003 spec ... where did you get that from?

Somehow it felt wrong to allow everybody to use it. Limit removed.

I'm pretty dubious about the kluges in aclchk.c to automatically grant/revoke on dependent sequences --- particularly the revoke part. The problem with that is that it breaks if the same sequence is being used to feed multiple tables.

OK, removed.

User-facing errors need to be ereport() not elog() so that they can be translated and have appropriate SQLSTATEs reported. elog is only for internal errors.

OK, changed.

One other thought is that the field names based on force_default seemed confusing. I'd suggest that maybe "generated" would be a better name choice.

I modified the names: force_default -> is_identity, attforceddef -> attgenerated. I also fixed COPY without OVERRIDING SYSTEM VALUE regarding IDENTITY and GENERATED fields, and modified the docs and the testcase according to your requested modifications.

Please fix and resubmit. regards, tom lane

Thanks again for the review. Here's the new version with the modifications you requested.
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/

psql-serial-38.diff.gz
Description: Unix tar archive
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Andrew Dunstan wrote: Zoltan Boszormenyi wrote: Tom Lane wrote: - unique index checks are done in two steps to avoid inflating the sequence if a unique index check fails that doesn't reference the IDENTITY column

This is just not acceptable --- there is nothing in the standard that requires such behavior,

But there is also nothing that would say not to do it. :-) And this way, there would be nothing that separates IDENTITY from regular SERIALs except the slightly later value generation. The behaviour I proposed would be a big usability plus over the standard, with fewer skipped values possible.

and I dislike the wide-ranging kluges you introduced to support it.

Can you see any other way to avoid skipping sequence values as much as possible?

This doesn't strike me as a sensible design goal. Why not just live with skipped values? cheers andrew

The idea wouldn't have crossed my mind if Tom hadn't mentioned the action-at-a-distance behaviour. If you look back in the archives, my first working IDENTITY was nothing more than syntactic sugar over the regular SERIAL.
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: The idea wouldn't have crossed my mind if Tom hadn't mentioned the action-at-a-distance behaviour.

AFAIR, that remark had to do with NEXT VALUE FOR, which SQL2003 defines in a very weird way (it's not equivalent to nextval() as one would wish). I don't see anything strange in the spec for GENERATED. regards, tom lane

Thanks for clarifying. Please review the version I sent you.
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane wrote: I wrote: I see another problem with this patch: the code added to ATExecDropColumn is a crude hack. It doesn't work anyway, since this is not the only possible way for columns to be dropped (another one that comes to mind immediately is DROP TYPE ... CASCADE). The only correct way to handle things is to let the dependency mechanism do it.

I will try that.

Actually, the whole question of dependencies for generated columns probably needs some thought. What should happen if a function or operator used in a GENERATED expression gets dropped? The result for a normal column's default expression is that the default expression goes away, but the column is still there. I suspect we don't want that for a GENERATED column --- it would then be effectively a plain column.

No, I would want the DROP FUNCTION to be cancelled if the function is used in a GENERATED expression, but a DROP ... CASCADE would drop the column, too. So, DEPENDENCY_NORMAL will keep the referencing object but DEPENDENCY_AUTO would drop it too if the referenced object is dropped?

Along the same lines, is ALTER COLUMN DROP DEFAULT a legal operation on a generated column? What about just replacing the expression with ALTER COLUMN SET DEFAULT?

Neither of these options is legal for GENERATED columns; AFAIK I prohibited them already.

Before you get too excited about making generated columns disappear automatically in all these cases, consider that dropping a column is not something to be done lightly --- it might contain irreplaceable data.

The standard says that the GENERATED column should be dropped silently if either of the referenced columns is dropped. I haven't seen anything about the expression, though.

On second thought, maybe the right approach is just to allow the default expression to be dropped the same as it would be for an ordinary column, and to make sure that if a GENERATED column doesn't (currently) have a default, it is treated the same as an ordinary column. This leads to the further thought that maybe GENERATED is not a property of a column at all, but of its default (ie, it should be stored in pg_attrdef not pg_attribute, which would certainly make the patch less messy). AFAICS plain GENERATED merely indicates that the default expression can depend on other columns, which is certainly a property of the default --- you could imagine ALTER COLUMN SET DEFAULT GENERATED AS ... to make a formerly plain column into a GENERATED one. I'm not entirely sure about ALWAYS though.

The standard says somewhere that GENERATED columns can only be added to and dropped from a table.

My observation is: I deleted my hack from ATExecDropColumn() and now I cannot drop the referenced column without CASCADE. The comment in StoreAttrDefault() says the objects in the expression will have dependencies registered. I guess "objects" also means functions? This way, if I do an explicit recordDependencyOn(, DEPENDENCY_AUTO) on referenced columns, then the standard requirement will be satisfied, i.e. dropping a column will silently drop the GENERATED columns that reference it. Am I right? Or how about using recordDependencyOnSingleRelExpr(... , DEP_NORMAL, DEP_AUTO )?

regards, tom lane
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Before you get too excited about making generated columns disappear automatically in all these cases, consider that dropping a column is not something to be done lightly --- it might contain irreplaceable data.

The standard says that the GENERATED column should be dropped silently if either of the referenced columns is dropped.

[ itch... ] I think a pretty good case could be made for ignoring that provision, on the grounds that it's a foot-gun, and that it's not very important to follow the standard slavishly on this point because it's hard to conceive of any application actually relying on that behavior. You could probably implement the auto-drop behavior with some combination of (a) AUTO rather than NORMAL dependencies from the default expression to the stuff it depends on and (b) INTERNAL rather than AUTO dependency from the default expression to its column. But I really question whether this is a good idea.

So, all dependencies should be NORMAL to require a manual CASCADE, avoiding accidental data loss. I have two questions about the dependency system:
1. Is there a built-in defense to avoid circular dependencies?
2. If I register dependencies between columns, is there a way to retrieve all table/column type dependencies for a depender column?
What I would like to achieve is to lift the limit that a GENERATED column cannot reference another one. Only self-referencing should be disallowed.

The standard says somewhere that GENERATED columns can only be added to and dropped from a table.

This argument carries no weight at all --- there is plenty of stuff in PG that is an extension of the capabilities listed in the spec.

Point taken. So, just like with SET / DROP IDENTITY, I should implement SET GENERATED ALWAYS and DROP GENERATED.

regards, tom lane
--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane írta: Zoltan Boszormenyi [EMAIL PROTECTED] writes: I have two questions about the dependency system. 1. Is there a built-in defense to avoid circular dependencies? It doesn't have a problem with them, if that's what you mean. 2. If I register dependencies between columns, is there a way to retrieve all table/column type dependencies for a dependent column? You can scan pg_depend. What I would like to achieve is to lift the limit that a GENERATED column cannot reference another one. I would counsel not doing that, mainly because then you will have to solve an evaluation-order problem at runtime. OK. Point taken. So, just like with SET / DROP IDENTITY, I should implement SET GENERATED ALWAYS and DROP GENERATED. If you think of it as a property of the default expression, then DROP DEFAULT covers both cases, you don't need DROP GENERATED... So, I should allow DROP DEFAULT, implement SET DEFAULT GENERATED ALWAYS AS and modify the catalog so the GENERATED property is part of pg_attrdef. What about IDENTITY? Should it also be part of pg_attrdef? There are two ways to implement it: have or don't have a notion of it. The latter would treat GENERATED BY DEFAULT AS IDENTITY the same as SERIAL. regards, tom lane -- -- Zoltán Böszörményi Cybertec Geschwinde Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
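Scanning pg_depend for a column's dependencies, as suggested above, can be done with a query along these lines (the table name is a placeholder):

```sql
-- objsubid / refobjsubid hold the column numbers when a dependency
-- is on a specific column rather than on the whole relation:
SELECT classid::regclass    AS dependent_catalog,
       objid, objsubid,
       refclassid::regclass AS referenced_catalog,
       refobjid, refobjsubid,
       deptype
FROM pg_depend
WHERE refclassid = 'pg_class'::regclass
  AND refobjid = 'mytable'::regclass;
```

The deptype column carries the NORMAL ('n'), AUTO ('a') and INTERNAL ('i') flavours discussed in this thread.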
Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane írta: Zoltan Boszormenyi [EMAIL PROTECTED] writes: So, I should allow DROP DEFAULT, implement SET DEFAULT GENERATED ALWAYS AS and modify the catalog so the GENERATED property is part of pg_attrdef. Sounds good. What about IDENTITY? Should it also be part of pg_attrdef? There are two ways to implement it: have or don't have a notion of it. The latter would treat GENERATED BY DEFAULT AS IDENTITY the same as SERIAL. Is there any good reason to distinguish the two? Yes. Plain SERIALs can be updated with given values whereas IDENTITY columns cannot. And there is the difference between GENERATED and GENERATED IDENTITY: GENERATED columns can be updated with DEFAULT values, IDENTITY columns cannot. I strictly have to distinguish IDENTITY from both GENERATED and plain SERIALs. regards, tom lane -- -- Zoltán Böszörményi Cybertec Geschwinde Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
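The three behaviours being distinguished above could be summarized like this (a sketch of the patch's intended semantics, not committed behaviour; table and column names are hypothetical):

```sql
-- plain SERIAL: explicit values are always accepted
UPDATE t SET serial_col = 42;          -- OK

-- GENERATED ALWAYS AS IDENTITY: no update at all, not even DEFAULT
UPDATE t SET identity_col = DEFAULT;   -- error

-- GENERATED ALWAYS AS (expr): only DEFAULT is accepted
UPDATE t SET generated_col = DEFAULT;  -- OK, value is recomputed
UPDATE t SET generated_col = 42;       -- error
```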
Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Hi! Thanks. However, in the meantime I made some changes so the IDENTITY column only advances its sequence if the row passes its CHECK constraints and UNIQUE indexes. I still have some work with expression indexes. Should I post an incremental patch against this version or a full patch when it's ready? An incremental patch can still be posted when the feature is agreed to be in 8.3 and actually applied. It only changes some details in the new feature and doesn't change the behaviour of existing features. Best regards, Zoltán Böszörményi Bruce Momjian írta: Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Zoltan Boszormenyi wrote: Hi, I think now this is really the final version. Changes in this version are: - when dropping a column that's referenced by a GENERATED column, the GENERATED column also has to be dropped. It's required by SQL:2003. 
- COPY table FROM works correctly with IDENTITY and GENERATED columns - extended testcase to show the above two To reiterate all the features that accumulated over time, here's the list: - extended catalog (pg_attribute) to keep track of whether the column is IDENTITY or GENERATED - working GENERATED column that may reference other regular columns; it extends the DEFAULT infrastructure to allow storing complex expressions; syntax for such columns: colname type GENERATED ALWAYS AS ( expression ) - working IDENTITY column whose value is generated after all other columns (regular or GENERATED) are assigned with values and validated via their NOT NULL and CHECK constraints; this allows tighter numbering - the only case when there may be missing serials is when UNIQUE index checks fail (which is checked on heap_insert() and heap_update() and is a tougher nut to crack) syntax is: colname type GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence options ) ] the original SERIAL pseudo-type is left unmodified, the IDENTITY concept is new and extends on it - PostgreSQL may have multiple SERIAL columns in a table, but SQL:2003 requires that at most one IDENTITY column may exist in a table at any time - Implemented the following TODOs: - %Have ALTER TABLE RENAME rename SERIAL sequence names - Allow SERIAL sequences to inherit permissions from the base table? Actually the roles that have INSERT or UPDATE permissions on the table gain permission on the sequence, too. This makes the following TODO unneeded: - Add DEFAULT .. AS OWNER so permission checks are done as the table owner This would be useful for SERIAL nextval() calls and CHECK constraints. 
- DROP DEFAULT is prohibited on GENERATED and IDENTITY columns - One SERIAL column can be upgraded to IDENTITY via ALTER COLUMN column SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY Same for downgrading, via: ALTER COLUMN column DROP IDENTITY - COPY and INSERT may use the OVERRIDING SYSTEM VALUE clause to override automatic generation and allow importing dumped data unmodified - Update is forbidden for GENERATED ALWAYS AS IDENTITY columns entirely and for GENERATED ALWAYS AS (expr) columns for values other than DEFAULT. - ALTER COLUMN SET sequence options for altering the supporting sequence; works on any SERIAL-like or IDENTITY columns - ALTER COLUMN RESTART [WITH] N for changing only the next generated number in the sequence. - The essence of pg_get_serial_sequence() is exported as get_relid_att_serial_sequence() to be used internally by checks. - CHECK constraints cannot reference IDENTITY or GENERATED columns - GENERATED columns cannot reference IDENTITY or GENERATED columns - dropping a column that's referenced by a GENERATED column also drops the GENERATED column - pg_dump dumps correct schema for IDENTITY and GENERATED columns: - ALTER COLUMN SET GENERATED ... AS IDENTITY for IDENTITY columns after ALTER SEQUENCE OWNED BY - correct GENERATED AS ( expression ) clause in the table schema - pg_dump dumps COPY OVERRIDING SYSTEM VALUE for tables' data that have any GENERATED or GENERATED ALWAYS AS IDENTITY columns. - documentation and testcases Please, review. Best regards, Zoltán Böszörményi [ application/x-tar is not supported, skipping... ] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
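Restoring dumped data relies on the OVERRIDING SYSTEM VALUE clause and the upgrade/downgrade syntax from the list above; a sketch per the patch's proposed syntax (table and column names are hypothetical):

```sql
-- bypass automatic generation so dumped values are kept as-is:
INSERT INTO t OVERRIDING SYSTEM VALUE VALUES (10, 'imported row');

-- upgrade an existing SERIAL column to IDENTITY, and back:
ALTER TABLE t ALTER COLUMN id SET GENERATED ALWAYS AS IDENTITY;
ALTER TABLE t ALTER COLUMN id DROP IDENTITY;
```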
Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Hi, I think now this is really the final version. Changes in this version are: - when dropping a column that's referenced by a GENERATED column, the GENERATED column also has to be dropped. It's required by SQL:2003. - COPY table FROM works correctly with IDENTITY and GENERATED columns - extended testcase to show the above two To reiterate all the features that accumulated over time, here's the list: - extended catalog (pg_attribute) to keep track of whether the column is IDENTITY or GENERATED - working GENERATED column that may reference other regular columns; it extends the DEFAULT infrastructure to allow storing complex expressions; syntax for such columns: colname type GENERATED ALWAYS AS ( expression ) - working IDENTITY column whose value is generated after all other columns (regular or GENERATED) are assigned with values and validated via their NOT NULL and CHECK constraints; this allows tighter numbering - the only case when there may be missing serials is when UNIQUE index checks fail (which is checked on heap_insert() and heap_update() and is a tougher nut to crack) syntax is: colname type GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence options ) ] the original SERIAL pseudo-type is left unmodified, the IDENTITY concept is new and extends on it - PostgreSQL may have multiple SERIAL columns in a table, but SQL:2003 requires that at most one IDENTITY column may exist in a table at any time - Implemented the following TODOs: - %Have ALTER TABLE RENAME rename SERIAL sequence names - Allow SERIAL sequences to inherit permissions from the base table? Actually the roles that have INSERT or UPDATE permissions on the table gain permission on the sequence, too. This makes the following TODO unneeded: - Add DEFAULT .. AS OWNER so permission checks are done as the table owner This would be useful for SERIAL nextval() calls and CHECK constraints. 
- DROP DEFAULT is prohibited on GENERATED and IDENTITY columns - One SERIAL column can be upgraded to IDENTITY via ALTER COLUMN column SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY Same for downgrading, via: ALTER COLUMN column DROP IDENTITY - COPY and INSERT may use the OVERRIDING SYSTEM VALUE clause to override automatic generation and allow importing dumped data unmodified - Update is forbidden for GENERATED ALWAYS AS IDENTITY columns entirely and for GENERATED ALWAYS AS (expr) columns for values other than DEFAULT. - ALTER COLUMN SET sequence options for altering the supporting sequence; works on any SERIAL-like or IDENTITY columns - ALTER COLUMN RESTART [WITH] N for changing only the next generated number in the sequence. - The essence of pg_get_serial_sequence() is exported as get_relid_att_serial_sequence() to be used internally by checks. - CHECK constraints cannot reference IDENTITY or GENERATED columns - GENERATED columns cannot reference IDENTITY or GENERATED columns - dropping a column that's referenced by a GENERATED column also drops the GENERATED column - pg_dump dumps correct schema for IDENTITY and GENERATED columns: - ALTER COLUMN SET GENERATED ... AS IDENTITY for IDENTITY columns after ALTER SEQUENCE OWNED BY - correct GENERATED AS ( expression ) clause in the table schema - pg_dump dumps COPY OVERRIDING SYSTEM VALUE for tables' data that have any GENERATED or GENERATED ALWAYS AS IDENTITY columns. - documentation and testcases Please, review. Best regards, Zoltán Böszörményi psql-serial-34.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Resending compressed, it seems pgsql-patches doesn't let me post such large patches. Zoltan Boszormenyi írta: Hi, I have finished my GENERATED/IDENTITY patch and now it does everything I wanted it to do. Changes from the previous version: - GENERATED columns work now - extended testcase to test some GENERATED expressions - extended documentation Now it comes in uncompressed context diff form; out of habit I sent unified diffs before, sorry for that. It applies to current CVS. Please, review. Thanks in advance, Zoltán Böszörményi psql-serial-33.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PATCHES] IDENTITY/GENERATED v31
Hi, here comes the next version. Changes: * Implemented ALTER TABLE ... DROP IDENTITY and SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY * Extended documentation and testcase for the above. * pg_dump now dumps the proper ALTER for such columns. Reintroduced the pg_dump feature to dump OVERRIDING SYSTEM VALUE for INSERT and COPY statements if the table schema contains any GENERATED ALWAYS column. * There can be only one IDENTITY column in a table at any time. This is checked on CREATE, ALTER TABLE ADD and ALTER TABLE SET ... IDENTITY and required for SQL:2003 compliance. * On update, the GENERATED columns' late generation of DEFAULT value is also implemented. The GENERATED ALWAYS AS ( expr ) still only deals within the same limits as regular DEFAULT. * Implemented a check so CHECK constraints cannot reference IDENTITY/GENERATED columns. It is checked on CREATE and ALTER TABLE ADD CONSTRAINT and ALTER TABLE ADD COLUMN. It makes sense because IDENTITY and GENERATED columns have to be assigned after all other columns were assigned with values and validated via constraints. I hereby declare the IDENTITY part feature complete. Please review. Best regards, Zoltán Böszörményi psql-serial-31.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
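The single-IDENTITY-column rule mentioned above would reject a table definition like the following (a sketch of the check, with hypothetical names):

```sql
-- SQL:2003 allows at most one IDENTITY column per table,
-- so the second identity definition must be rejected:
CREATE TABLE t (
    a integer GENERATED ALWAYS AS IDENTITY,
    b integer GENERATED BY DEFAULT AS IDENTITY  -- error expected here
);
```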
[PATCHES] New version of IDENTITY/GENERATED
Hi, I started working on my previous patch, encouraged by the fact that it became a wishlist item for 8.3. :-) The changes in this version are: - Refreshed to almost current (5 days old) CVS version of 8.3 devel - The original SERIAL pseudo type is left alone, you _have to_ spell out GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY to get an identity column. - The action-at-a-distance behaviour is actually working for the IDENTITY/GENERATED columns on INSERT so the DEFAULT value is generated for them after all the regular columns were validated via ExecConstraints(). This way, if the validation fails, the sequence isn't inflated. - Test case is updated to reflect the above. - Documentation is updated, Identity columns have a new subsection now. - Dropped my pg_dump changes, as the altered sequence is also dumped in 8.2, thanks to Tom Lane. I am considering the following: - Since IDENTITY is a new feature (plain old SERIAL behaves the same as always) I will restore the SQL:2003 conformant check that there can be only one identity column in a table at any time. - I read somewhere (but couldn't find it now in SQL:2003) that CHECK constraints cannot be defined for GENERATED (and IDENTITY?) columns. Maybe it was in the latest draft, I have to look at it... Anyway, I have to implement checks to disallow CHECKs for such columns. - Introduce an ALTER TABLE SET|DROP IDENTITY so a serial can be upgraded to an identity. This way, an identity column can be built by hand and pg_dump will need it, too. SET IDENTITY will either have to issue an error if CHECKs are defined for such columns or automatically drop all such constraints. And I have a question, too. Is there a way to use ExecEvalExpr*() so values from a given tuple are used for the current row? E.g. at present, UPDATE table SET f1 = f1 + 1, f2 = f1 + 1; sets both fields' new value to (f1 value before UPDATE) + 1. 
For a GENERATED column, value _after_ UPDATE is needed, so CREATE TABLE table ( f1 INTEGER, f2 INTEGER GENERATED ALWAYS AS (f1 + 1)); and no matter which one of the following is used: UPDATE table SET f1 = f1 + 1; or UPDATE table SET f1 = f1 + 1, f2 = default; the f2 current value = f1 current value + 1 is always maintained. Best regards, Zoltán Böszörményi psql-serial-30.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 6: explain analyze is your friend
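The invariant described above (f2 always equal to f1 + 1 after any UPDATE) amounts to the following, assuming the late evaluation works as intended; this is the patch's goal, not stock UPDATE behaviour:

```sql
CREATE TABLE t (
    f1 integer,
    f2 integer GENERATED ALWAYS AS (f1 + 1)
);

INSERT INTO t (f1) VALUES (1);   -- stores (1, 2)
UPDATE t SET f1 = f1 + 1;        -- should store (2, 3): f2 recomputed
                                 -- from the NEW f1, not the old one
```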
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Thanks!!! Tom Lane írta: Böszörményi Zoltán [EMAIL PROTECTED] writes: as per your suggestion, the COPY view TO support was cut and a hint was added. Please, review. Committed after some refactoring to avoid code duplication. Unfortunately, in a moment of pure brain fade, I looked at the wrong item in my inbox and wrote Bernd Helmle's name instead of yours in the commit message :-(. My sincere apologies. Bruce, would you make a note to be sure the right person gets credit in the release notes? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT)
Bruce Momjian írta: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Zoltan Boszormenyi wrote: My v8 had the syntax support for COPY (SELECT ...) (col1, col2, ...) TO and it was actually working. In your v9 you rewrote the syntax parsing so that feature was lost in translation. Interesting. I didn't realize this was possible -- obviously I didn't test it (did you have a test for it in the regression tests? I may have missed it). In fact, I deliberately removed the column list from the grammar, because it can certainly be controlled inside the SELECT, so I thought there was no reason to support controlling it in the COPY column list. I would vote against allowing a column list here, because it's useless and it strikes me as likely to result in strange syntax error messages if the user makes any little mistake. What worries me is that the above looks way too much like a function call, which means that for instance if you omit a right paren somewhere in the SELECT part, you're likely to get a syntax error that points far to the right of the actual mistake. The parser could also mistake the column list for a table-alias column list. Specifying a column list with a view name is useful, of course, but what is the point when you are writing out a SELECT anyway? If you don't support COPY view TO, at least return an error message that suggests using COPY (SELECT * FROM view). And if you support COPY VIEW, you are going to have to support a column list for that. Is that additional complexity in COPY? If so, it might be a reason to just throw an error on views and use COPY SELECT. No, it does not have any additional complexity, it uses the same code COPY tablename TO uses. Seeing that COPY VIEW only supports TO, not FROM, and COPY SELECT supports only TO, not FROM, it seems logical for COPY to just support relations, and COPY SELECT to be used for views, if we can throw an error on COPY VIEW to tell people to use COPY SELECT. 
The additional hint would be enough if the VIEW case is not supported. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
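With the hint in place, the user-visible behaviour would look roughly like this (the error and hint wording here is illustrative, not the committed text; the view name is hypothetical):

```sql
COPY myview TO STDOUT;
-- ERROR: cannot copy from view "myview"
-- HINT:  Try the COPY (SELECT ...) TO variant.

-- the supported spelling:
COPY (SELECT * FROM myview) TO STDOUT;
```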
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Andrew Dunstan írta: Alvaro Herrera wrote: Böszörményi Zoltán wrote: Hi, what's the problem with COPY view TO, other than you don't like it? :-) The problem is that it required an ugly piece of code. Not supporting it means we can keep the code nice. The previous discussion led to this conclusion anyway so I don't know why we are debating it again. What is so ugly about it? I haven't looked at the code, but I am curious to know. I also don't recall the consensus being quite so clear cut. I guess there is a case for saying that if it's not allowed then you know that COPY relname TO is going to be fast. But, code aesthetics aside, the reasons for disallowing it seem a bit thin, to me. cheers andrew I would say the timing difference between COPY table TO and COPY (SELECT * FROM table) TO was noise, so it's not even faster. And an updatable VIEW *may* allow COPY view FROM... Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Alvaro Herrera írta: Zoltan Boszormenyi wrote: Andrew Dunstan írta: Alvaro Herrera wrote: Böszörményi Zoltán wrote: what's the problem with COPY view TO, other than you don't like it? :-) The problem is that it required an ugly piece of code. Not supporting it means we can keep the code nice. The previous discussion led to this conclusion anyway so I don't know why we are debating it again. What is so ugly about it? I haven't looked at the code, but I am curious to know. It used a SELECT * FROM %s string that was passed back to the parser. I also don't recall the consensus being quite so clear cut. I guess there is a case for saying that if it's not allowed then you know that COPY relname TO is going to be fast. But, code aesthetics aside, the reasons for disallowing it seem a bit thin, to me. I would say the timing difference between COPY table TO and COPY (SELECT * FROM table) TO was noise, so it's not even faster. Remember that we were talking about supporting views, not tables. And if a view uses a slow query then you are in immediate danger of having a slow COPY. This may not be a problem but it needs to be discussed and an agreement must be reached before introducing such a change (and not during feature freeze). COPY relname TO meant tables _and_ views to me. My previous testing showed no difference between COPY table TO and COPY (SELECT * FROM table) TO. Similarly, a slow query defined in the view should show no difference between COPY view TO and COPY (SELECT * FROM view) TO. And remember, Bruce put the original COPY view TO patch into the unapplied queue, without the SELECT feature. Rewriting COPY view TO internally to COPY (SELECT * FROM view) TO is very straightforward, even if you think it's ugly. BTW, why is it ugly if I call raw_parser() from under src/backend/parser/*.c ? It is on a query distinct from the query the parser is currently running. Or is it the recursion that bothers you? It's not a possible infinite recursion. 
And an updatable VIEW *may* allow COPY view FROM... May I remind you that we've been in feature freeze for four weeks already? Now it's *not* the time to be drooling over cool features that would be nice to have. Noted. However, as the COPY view TO is a straight internal rewrite, a COPY view FROM could also be. Even if it's a long term development. I wasn't proposing delaying beta. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Alvaro Herrera írta: Zoltan Boszormenyi wrote: Alvaro Herrera írta: Remember that we were talking about supporting views, not tables. And if a view uses a slow query then you are in immediate danger of having a slow COPY. This may not be a problem but it needs to be discussed and an agreement must be reached before introducing such a change (and not during feature freeze). COPY relname TO meant tables _and_ views to me. My previous testing showed no difference between COPY table TO and COPY (SELECT * FROM table) TO. Similarly, a slow query defined in the view should show no difference between COPY view TO and COPY (SELECT * FROM view) TO. The difference is that we are giving a very clear distinction between a table and a view. If we don't support the view in the direct COPY, but instead insist that it be passed via a SELECT query, then the user will be aware that it may be slow. It still can be documented while supporting the COPY view TO syntax. But COPY view (col1, col2, ...) TO may still be useful even if the COPY (SELECT ...) (col1, col2, ...) TO is pointless. [1] relname at this point may mean anything -- are you supporting sequences and toast tables as well? Well, not really. :-) And remember, Bruce put the original COPY view TO patch into the unapplied queue, without the SELECT feature. All sorts of junk enter that queue so that's not an argument. (Not meant to insult Bruce -- I'm just saying that he doesn't filter stuff. We've had patches rejected from the queue before plenty of times.) OK. :-) Rewriting COPY view TO internally to COPY (SELECT * FROM view) TO is very straightforward, even if you think it's ugly. BTW, why is it ugly if I call raw_parser() from under src/backend/parser/*.c ? It is on a query distinct from the query the parser is currently running. Or is it the recursion that bothers you? It's not a possible infinite recursion. It's ugly because you are forcing the system to parse something that was already parsed. 
Well, to tell the truth, during parsing COPY view TO the parser never saw SELECT * FROM view. On the other hand I don't see why you are arguing in favor of a useless feature whose coding is dubious; you can have _the same thing_ with nice code and no discussion. Because of [1] and because Mr. Schoenig's arguments about changing schemas. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Alvaro Herrera írta: Zoltan Boszormenyi wrote: Alvaro Herrera írta: But COPY view (col1, col2, ...) TO may still be useful even if the COPY (SELECT ...) (col1, col2, ...) TO is pointless. [1] Hum, I don't understand what you're saying here -- are you saying that you can't do something with the first form, that you cannot do with the second? Say you have a large, often used query. Would you like to retype it every time or just create a view? Later you may want to export only a subset of the fields... My v8 had the syntax support for COPY (SELECT ...) (col1, col2, ...) TO and it was actually working. In your v9 you rewrote the syntax parsing so that feature was lost in translation. It's ugly because you are forcing the system to parse something that was already parsed. Well, to tell the truth, during parsing COPY view TO the parser never saw SELECT * FROM view. Hmm! The COPY view stuff stopped working when I changed back the relation case. Your patch changed it so that instead of flowing as a RangeVar all the way to the copy.c code, the parser changed it into a select * from %s query, and then stashed the resulting Query node into the query case. (So what was happening was that the Relation case was never used.) I reverted this. Well, the VIEW case wasn't supported before so I took the opportunity to transform it in analyze.c, which you deleted as being ugly. On the other hand I don't see why you are arguing in favor of a useless feature whose coding is dubious; you can have _the same thing_ with nice code and no discussion. Because of [1] and because Mr. Schoenig's arguments about changing schemas. Yeah, that argument makes sense to me as well. So, may I put it back? :-) Also, can you suggest anything cleaner than calling raw_parser(SELECT * FROM view)? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Alvaro Herrera írta: Zoltan Boszormenyi wrote: Alvaro Herrera írta: Zoltan Boszormenyi wrote: Alvaro Herrera írta: But COPY view (col1, col2, ...) TO may still be useful even if the COPY (SELECT ...) (col1, col2, ...) TO is pointless. [1] Hum, I don't understand what you're saying here -- are you saying that you can't do something with the first form, that you cannot do with the second? Say you have a large, often used query. Would you like to retype it every time or just create a view? Later you may want to export only a subset of the fields... My v8 had the syntax support for COPY (SELECT ...) (col1, col2, ...) TO and it was actually working. In your v9 you rewrote the syntax parsing so that feature was lost in translation. Interesting. I didn't realize this was possible -- obviously I didn't test it (did you have a test for it in the regression tests? I may have missed it). In fact, I deliberately removed the column list from the grammar, because it can certainly be controlled inside the SELECT, so I thought there was no reason to support controlling it in the COPY column list. Yes, it was even documented. I thought about having queries stored statically somewhere (not in views) and being able to use only part of the result. I don't think it's difficult to put it back. But this has nothing to do with COPY view, does it? No, but it may be confusing seeing COPY (SELECT ...) (col1, col2, ...) TO instead of COPY (SELECT col1, col2, ...) TO. With the COPY VIEW (col1, col2, ...) TO syntax it may be cleaner from the user's point of view. Together with the changing schemas argument it gets more and more tempting. On the other hand I don't see why you are arguing in favor of a useless feature whose coding is dubious; you can have _the same thing_ with nice code and no discussion. Because of [1] and because Mr. Schoenig's arguments about changing schemas. Yeah, that argument makes sense to me as well. So, may I put it back? 
:-) Also, can you suggest anything cleaner than calling raw_parser(SELECT * FROM view)? I think at this point is someone else's judgement whether you can put it back or not. Tom already said that he doesn't object to the feature per se; no one else seems opposed to the feature per se, in fact. Now, I don't really see _how_ to do it in nice code, so no, I don't have any suggestion for you. You may want to give the pumpkin to Tom so that he gives the patch the finishing touches (hopefully making it support the COPY view feature as well). If it were up to me, I'd just commit it as is (feature-wise -- more thorough review is still needed) and revisit the COPY view stuff in 8.3 if there is demand. OK, I will put it back as it was in v8 keeping all your other cleanup and let Bruce and Tom decide. (BTW, is there anyone as high-ranking as them, or the committee is a duumvirate? :-) ) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
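The convenience argument for a view-level column list boils down to the two spellings below; whether the first is worth supporting was the open question (syntax per the v8 patch under discussion, with a hypothetical view name):

```sql
-- proposed: column list applied to the view itself
COPY myview (col1, col2) TO STDOUT;

-- always available: select the columns inside the subquery
COPY (SELECT col1, col2 FROM myview) TO STDOUT;
```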
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Alvaro Herrera írta: Zoltan Boszormenyi wrote: I think at this point is someone else's judgement whether you can put it back or not. Tom already said that he doesn't object to the feature per se; no one else seems opposed to the feature per se, in fact. Now, I don't really see _how_ to do it in nice code, so no, I don't have any suggestion for you. You may want to give the pumpkin to Tom so that he gives the patch the finishing touches (hopefully making it support the COPY view feature as well). If it were up to me, I'd just commit it as is (feature-wise -- more thorough review is still needed) and revisit the COPY view stuff in 8.3 if there is demand. OK, I will put it back as it was in v8 keeping all your other cleanup and let Bruce and Tom decide. Hum, are you going to put back the original cruft to support copy view? I suggest you don't do that. Well, the other way around is to teach heap_open() to use views. Brrr. Would it be any cleaner? ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Tom Lane írta: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Alvaro Herrera írta: Hum, are you going to put back the original cruft to support copy view? I suggest you don't do that. Well, the other way around is to teach heap_open() to use views. Brrr. Would it be any cleaner? Don't even think of going there ;-) regards, tom lane I didn't. :-) Here's my last, the cruft (i.e. COPY view TO support by rewriting to a SELECT) put back. Tested and docs modified accordingly. You can find the previous one (v10) on the list without it if you need it. Best regards, Zoltán Böszörményi pgsql-copyselect-11.patch.gz Description: Unix tar archive ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Bruce Momjian írta: Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. Thanks. Would you please add this instead? psql's built-in \copy (select ...) now also works. Best regards, Zoltán Böszörményi pgsql-copyselect-8.patch.gz Description: Unix tar archive ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] IDENTITY/GENERATED columns
Yes, I am not ready with it yet. Bruce Momjian wrote: This is being done for 8.3, right?

--- Zoltan Boszormenyi wrote: Hi, here's the next version. Changes:
- Extended documentation
- Extending permissions to new sequences: ALTER TABLE tab ADD col type GENERATED AS IDENTITY didn't work as advertised, now it seems to.
- Test case was also extended.
- Previously introduced memory leaks were plugged. Really.

Now the only feature missing is the previously discussed GENERATED ALWAYS AS ( expr ), so it can be used like this:

CREATE TABLE tab (
    c1 double precision,
    c2 integer,
    c3 double precision GENERATED ALWAYS AS (c1 + c2),
    c4 SMALLINT GENERATED ALWAYS AS (CASE WHEN c1 > c2 THEN 1 ELSE NULL END)
);

What should the following code produce as a result?

INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0);

This should insert (1.1, 2, 3.1, NULL).

UPDATE tab SET c2 = 1;

Only c2 changes, so: (1.1, 1, 3.1, NULL). Or should it change to (1.1, 1, 2.1, 1), i.e. recompute all columns that depend on changed columns?

UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5;

Now what? It should be (3.5, 2, 5.5, 1). But based on current UPDATE behaviour, i.e. values get computed based on previous values, it becomes (3.5, 2, 2.1, 1). That would really need changing the behaviour of UPDATE. Currently, if I do an UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2; then c3 gets its value based on the previous content of the record.
For the above GENERATED ALWAYS AS (expr) construct to work, UPDATE has to compute the column values in multiple passes, something like this:

constant values are computed;
while (there are any non-computed columns)
{
    newly_computed = 0;
    foreach (column, non-computed-columns)
    {
        if (column value depends only on computed columns)
        {
            compute it;
            newly_computed++;
        }
    }
    if (newly_computed == 0)
        elog(ERROR, "circular dependency");
}

This behaviour change would enable something like this:

CREATE TABLE tab2 (c1 integer, c2 integer, c3 integer);
INSERT INTO tab2 (c1, c2, c3) VALUES (1, 2, c1 + c2);

Does this described behaviour have any precedent or standard compliance? Best regards, Zoltán Böszörményi

[ application/x-tar is not supported, skipping... ]
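The multipass scheme sketched in the pseudocode above can be made concrete. The following is a minimal Python model (function and parameter names are invented, and the dependency lists are supplied explicitly rather than extracted from expressions as a real implementation would do): each pass computes every generated column whose dependencies are already known, and a pass that computes nothing signals a circular dependency.

```python
def evaluate_row(plain_values, generated):
    """Multipass evaluation of generated columns, following the pseudocode
    in the mail. plain_values maps column name -> supplied constant value;
    generated maps column name -> (dependency names, function of the row)."""
    values = dict(plain_values)          # pass 0: constants are known
    pending = set(generated)
    while pending:
        newly_computed = set()
        for col in pending:
            deps, fn = generated[col]
            if all(d in values for d in deps):   # depends only on computed cols
                values[col] = fn(values)
                newly_computed.add(col)
        if not newly_computed:
            raise ValueError("circular dependency among generated columns")
        pending -= newly_computed
    return values
```

With the tab example from the mail -- c3 generated as c1 + c2 and c4 as CASE WHEN c1 > c2 THEN 1 ELSE NULL END -- inserting (c1=1.1, c2=2) yields c3 = 3.1 and c4 = NULL (None), matching the expected (1.1, 2, 3.1, NULL) row.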
Re: [HACKERS] [PATCHES] COPY view
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: How about the callback solution for the SELECT case that was copied from the original? Should I consider open-coding in copy.c what ExecutorRun() does to avoid the callback?

Adding a DestReceiver type is a good solution ... although that static variable is not. Instead define a DestReceiver extension struct that can carry the CopyState pointer for you.

Done.

You could also consider putting the copy-from-view-specific state fields into DestReceiver instead of CopyState, though this is a bit asymmetric with the relation case so maybe it's not really cleaner.

Left it alone for now.

BTW, lose the tuple_to_values function --- it's an extremely bad reimplementation of heap_deform_tuple.

Done.

copy_dest_printtup also seems coded without regard for the TupleTableSlot access API (read printtup() to see what to do instead).

I am still interpreting it. Can you give me some hints besides using slot_getallattrs(slot)?

And what's the point of factoring out the heap_getnext loop as CopyRelationTo? It's not like that lets you share any more code. The inside of the loop, ie what you've called CopyValuesTo, is the sharable part.

Done. The option parsing and error checking is now common. I also changed it to use transformStmt() in analyze.c. However, both the UNION and subselect cases give me something like this: ERROR: could not open relation 1663/16384/16723: No such file or directory What else can I do with it? Best regards, Zoltán Böszörményi

pgsql-copyselect-4.patch.gz Description: Unix tar archive
Re: [HACKERS] [PATCHES] COPY view
Zoltan Boszormenyi wrote: The option parsing and error checking is now common. I also changed it to use transformStmt() in analyze.c. However, both the UNION and subselect cases give me something like this: ERROR: could not open relation 1663/16384/16723: No such file or directory What else can I do with it?

But a single SELECT with two tables joined also works, so it must be something trivial. Best regards, Zoltán Böszörményi
Re: [HACKERS] [PATCHES] COPY view
Zoltan Boszormenyi wrote: Zoltan Boszormenyi wrote: The option parsing and error checking is now common. I also changed it to use transformStmt() in analyze.c. However, both the UNION and subselect cases give me something like this: ERROR: could not open relation 1663/16384/16723: No such file or directory What else can I do with it? But a single SELECT with two tables joined also works, so it must be something trivial.

Now UNIONs and subselects also work. Your concern about copy_dest_printtup() hasn't been addressed yet. Best regards, Zoltán Böszörményi

pgsql-copyselect-5.patch.gz Description: Unix tar archive
Re: [HACKERS] [PATCHES] COPY view
Hi, Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: Andrew Dunstan wrote: Bruce Momjian wrote: I think Alvaro is saying we need it in a few days, not longer. I thought he was saying today ;-) He actually said now, but I don't think we need it immediately, especially if he is still working on it. We are at least 1-2 weeks away from having all open patches applied.

Yes, I'm saying today so that we can all look at it and point out obvious mistakes now, not 2 weeks from now. Release early, release often. If the patch contains a mistake and we find out in 2 weeks, are we going to fix it? No, we are going to reject it.

OK, I understand. Böszörményi, post now so we can see where you are, but keep working and send it to us again when you are done. No sense in not posting your working version.

OK, here's my current version. The reference leak is fixed. But as my testcase shows, it only works for single selects currently. The parser accepts it but COPY doesn't produce the expected output. Please, suggest a solution. BTW, my first name is Zoltán. Best regards, Zoltán Böszörményi

pgsql-copyselect.patch.gz Description: Unix tar archive
Re: [HACKERS] [PATCHES] COPY view
Zoltan Boszormenyi wrote: Hi, Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: Andrew Dunstan wrote: Bruce Momjian wrote: I think Alvaro is saying we need it in a few days, not longer. I thought he was saying today ;-) He actually said now, but I don't think we need it immediately, especially if he is still working on it. We are at least 1-2 weeks away from having all open patches applied. Yes, I'm saying today so that we can all look at it and point out obvious mistakes now, not 2 weeks from now. Release early, release often. If the patch contains a mistake and we find out in 2 weeks, are we going to fix it? No, we are going to reject it. OK, I understand. Böszörményi, post now so we can see where you are, but keep working and send it to us again when you are done. No sense in not posting your working version. OK, here's my current version. The reference leak is fixed. But as my testcase shows, it only works for single selects currently. The parser accepts it but COPY doesn't produce the expected output. Please, suggest a solution.

I meant that UNION selects and subselects don't work yet. BTW, my first name is Zoltán. Best regards, Zoltán Böszörményi
Re: [HACKERS] [PATCHES] COPY view
Alvaro Herrera wrote: Zoltan Boszormenyi wrote: OK, here's my current version. The reference leak is fixed. But as my testcase shows, it only works for single selects currently. The parser accepts it but COPY doesn't produce the expected output. Please, suggest a solution.

I'm not sure I agree with the approach of creating a fake SELECT * FROM foo in analyze.c in the relation case and passing it back to the parser to create a Query node. That's not there in the original code and you shouldn't need it. Just let the case where COPY gets a relation continue to handle it as it does today, and add a separate case for the SELECT.

The exact same code was there, i.e. parse and rewrite SELECT * FROM view, just not in analyze.c. I will try without it, though.

That doesn't help you with the UNION stuff though. :-(
Re: [PATCHES] IDENTITY/GENERATED columns
Hi, here's the next version. Changes:
- Extended documentation
- Extending permissions to new sequences: ALTER TABLE tab ADD col type GENERATED AS IDENTITY didn't work as advertised, now it seems to.
- Test case was also extended.
- Previously introduced memory leaks were plugged. Really.

Now the only feature missing is the previously discussed GENERATED ALWAYS AS ( expr ), so it can be used like this:

CREATE TABLE tab (
    c1 double precision,
    c2 integer,
    c3 double precision GENERATED ALWAYS AS (c1 + c2),
    c4 SMALLINT GENERATED ALWAYS AS (CASE WHEN c1 > c2 THEN 1 ELSE NULL END)
);

What should the following code produce as a result?

INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0);

This should insert (1.1, 2, 3.1, NULL).

UPDATE tab SET c2 = 1;

Only c2 changes, so: (1.1, 1, 3.1, NULL). Or should it change to (1.1, 1, 2.1, 1), i.e. recompute all columns that depend on changed columns?

UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5;

Now what? It should be (3.5, 2, 5.5, 1). But based on current UPDATE behaviour, i.e. values get computed based on previous values, it becomes (3.5, 2, 2.1, 1). That would really need changing the behaviour of UPDATE. Currently, if I do an UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2; then c3 gets its value based on the previous content of the record.

For the above GENERATED ALWAYS AS (expr) construct to work, UPDATE has to compute the column values in multiple passes, something like this:

constant values are computed;
while (there are any non-computed columns)
{
    newly_computed = 0;
    foreach (column, non-computed-columns)
    {
        if (column value depends only on computed columns)
        {
            compute it;
            newly_computed++;
        }
    }
    if (newly_computed == 0)
        elog(ERROR, "circular dependency");
}

This behaviour change would enable something like this:

CREATE TABLE tab2 (c1 integer, c2 integer, c3 integer);
INSERT INTO tab2 (c1, c2, c3) VALUES (1, 2, c1 + c2);

Does this described behaviour have any precedent or standard compliance?

Best regards, Zoltán Böszörményi

psql-serial-26.diff.gz Description: Unix tar archive
Re: [PATCHES] IDENTITY/GENERATED columns
Hi, here's the next version; changes are at the end. Zoltan Boszormenyi wrote: Hi, my last patch didn't make it to the -hackers list, here's a newer one. Let me list what this patch does now:
- CREATE TABLE/ALTER TABLE ADD syntax support for GENERATED { ALWAYS | BY DEFAULT } AS { ( expr ) | IDENTITY ( seq_opts ) }
- catalog indicators of the above properties
- INSERT|COPY syntax support for OVERRIDING { SYSTEM | USER } VALUE
- INSERT|COPY fails for non-owner when using OVERRIDING SYSTEM VALUE on tables with GENERATED ALWAYS columns
- UPDATE fails when using other than the DEFAULT literal for GENERATED ALWAYS columns
- GRANT {INSERT|UPDATE|ALL} ON table automatically gives UPDATE permission for the supporting sequence (missing: ALTER TABLE ADD should give permission for the new sequence)
- ALTER TABLE tab ALTER col { SET seq_opts | RESTART [WITH] N } syntax support to alter the supporting sequence
- ALTER TABLE tab RENAME also renames the supporting sequence on both table and column renaming
- ALTER TABLE SET/DROP DEFAULT is disallowed
- pg_dump support for exporting the above properties including sequence parameters. Data dump uses OVERRIDING SYSTEM VALUE when the table has GENERATED ALWAYS columns
- test cases for some operations
- documented

With this version, I mostly covered these TODO entries:
* Add support for SQL-standard GENERATED/IDENTITY columns (almost done :-) )
* %Disallow changing DEFAULT expression of a SERIAL column? (done)
* %Disallow ALTER SEQUENCE changes for SERIAL sequences because pg_dump does not dump the changes (pg_dump dumps the sequence options)
* Add DEFAULT .. AS OWNER so permission checks are done as the table owner. This would be useful for SERIAL nextval() calls and CHECK constraints. (GRANT TABLE grants UPDATE on the supporting sequence, ALTER TABLE ADD COLUMN will be implemented)
* %Have ALTER TABLE RENAME rename SERIAL sequence names (done)
* Allow SERIAL sequences to inherit permissions from the base table? (not needed because INSERT is a fastpath, permissions are added with GRANT TABLE, and [will be] extended on ALTER TABLE ADD COLUMN)

I am considering using ALTER TABLE tab ALTER col TYPE to add (and drop) GENERATED ALWAYS and GENERATED AS IDENTITY properties on existing columns. As it stands now, ALTER TYPE doesn't change these column attributes. Please, review. Best regards, Zoltán Böszörményi

Changes from the previous version:
- ALTER TABLE ADD col type GENERATED AS IDENTITY adds permission over the newly created sequence to those who already have INSERT or UPDATE on the table.
- ALTER TABLE SET/DROP DEFAULT is also disallowed on GENERATED ALWAYS columns
- some codepaths were consolidated to avoid source bloat

A remaining larger problem is that GENERATED ALWAYS AS (expr) should allow column references from the same table, etc., just like CHECK. For the sake of generality, DEFAULT must have the same features, too. E.g. evaluating a DEFAULT value should be done as a second phase, after evaluating other columns with given constants or from a DEFAULT that doesn't depend on other columns. Circular dependencies must be avoided, etc.

I talked about implementing setting and dropping the GENERATED / IDENTITY properties. I cooked up this syntax:

ALTER TABLE tab ALTER col SET GENERATED ALWAYS AS ( expr )

Behaves the same as SET DEFAULT expr but sets attforceddef.

ALTER TABLE tab ALTER col SET GENERATED ALWAYS AS IDENTITY [( seq_opts )]

It should create the supporting sequence iff the column isn't already an IDENTITY column.

ALTER TABLE tab ALTER col DROP IDENTITY

It should be spelled out loud instead of DROP DEFAULT. That leaves ALTER TABLE ALTER TYPE alone. Then a short check showed that changing such attributes on a column isn't defined by SQL2003. The nice thing about not implementing the above is that it avoids the debate whether SET GENERATED ALWAYS AS [IDENTITY] should also rewrite the records. My answer in the debate would be that GENERATED ALWAYS AS (expr) should naturally do that, but it isn't clear at all for GENERATED ALWAYS AS IDENTITY.

Closing words: I think it is now ready for acceptance; it does all that I wanted it to do and it conforms to SQL2003. Maybe I introduced some memory leaks, but I tried to avoid it by carefully inspecting what other functions do. And some documentation details can also be improved. But I think the code is ready. Please, review. Thanks and best regards, Zoltán Böszörményi

psql-serial-24.diff.gz Description: Unix tar archive
[PATCHES] IDENTITY/GENERATED columns
Hi, my last patch didn't make it to the -hackers list, here's a newer one. Let me list what this patch does now:
- CREATE TABLE/ALTER TABLE ADD syntax support for GENERATED { ALWAYS | BY DEFAULT } AS { ( expr ) | IDENTITY ( seq_opts ) }
- catalog indicators of the above properties
- INSERT|COPY syntax support for OVERRIDING { SYSTEM | USER } VALUE
- INSERT|COPY fails for non-owner when using OVERRIDING SYSTEM VALUE on tables with GENERATED ALWAYS columns
- UPDATE fails when using other than the DEFAULT literal for GENERATED ALWAYS columns
- GRANT {INSERT|UPDATE|ALL} ON table automatically gives UPDATE permission for the supporting sequence (missing: ALTER TABLE ADD should give permission for the new sequence)
- ALTER TABLE tab ALTER col { SET seq_opts | RESTART [WITH] N } syntax support to alter the supporting sequence
- ALTER TABLE tab RENAME also renames the supporting sequence on both table and column renaming
- ALTER TABLE SET/DROP DEFAULT is disallowed
- pg_dump support for exporting the above properties including sequence parameters. Data dump uses OVERRIDING SYSTEM VALUE when the table has GENERATED ALWAYS columns
- test cases for some operations
- documented

With this version, I mostly covered these TODO entries:
* Add support for SQL-standard GENERATED/IDENTITY columns (almost done :-) )
* %Disallow changing DEFAULT expression of a SERIAL column? (done)
* %Disallow ALTER SEQUENCE changes for SERIAL sequences because pg_dump does not dump the changes (pg_dump dumps the sequence options)
* Add DEFAULT .. AS OWNER so permission checks are done as the table owner. This would be useful for SERIAL nextval() calls and CHECK constraints. (GRANT TABLE grants UPDATE on the supporting sequence, ALTER TABLE ADD COLUMN will be implemented)
* %Have ALTER TABLE RENAME rename SERIAL sequence names (done)
* Allow SERIAL sequences to inherit permissions from the base table? (not needed because INSERT is a fastpath, permissions are added with GRANT TABLE, and [will be] extended on ALTER TABLE ADD COLUMN)

I am considering using ALTER TABLE tab ALTER col TYPE to add (and drop) GENERATED ALWAYS and GENERATED AS IDENTITY properties on existing columns. As it stands now, ALTER TYPE doesn't change these column attributes. Please, review. Best regards, Zoltán Böszörményi

psql-serial-21.diff.gz Description: Unix tar archive
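The OVERRIDING rules in the feature list above -- a user-supplied value for a GENERATED ALWAYS column is rejected unless OVERRIDING SYSTEM VALUE is given, while GENERATED BY DEFAULT accepts user values -- can be sketched as a toy model. The class and method names below are invented for illustration, the "sequence" is a plain counter, and the owner-only check and COPY path described in the patch are deliberately omitted.

```python
class IdentityColumn:
    """Toy model of GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY value
    resolution as described in the patch mail. Not the patch's actual code."""
    def __init__(self, always):
        self.always = always     # True: GENERATED ALWAYS; False: BY DEFAULT
        self.last_value = 0      # stand-in for the supporting sequence

    def resolve(self, user_value=None, overriding_system_value=False):
        if user_value is None:
            # No explicit value (or DEFAULT): take the next sequence value.
            self.last_value += 1
            return self.last_value
        if self.always and not overriding_system_value:
            # GENERATED ALWAYS rejects user values without the override.
            raise ValueError("cannot insert a non-DEFAULT value into a "
                             "GENERATED ALWAYS column without "
                             "OVERRIDING SYSTEM VALUE")
        return user_value
```

Under this model a GENERATED ALWAYS column hands out 1, 2, ... for DEFAULT inserts, accepts an explicit value only with overriding_system_value=True, and a BY DEFAULT column simply prefers the user value when one is given.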