Re: [HACKERS] ECPG FETCH readahead
2014-04-16 10:54 keltezéssel, Boszormenyi Zoltan írta: 2014-01-18 18:08 keltezéssel, Boszormenyi Zoltan írta: Hi, Alvaro Herrera wrote: Boszormenyi Zoltan escribió: Rebased patches after the regression test and other details were fixed in the infrastructure part. This thread started in 2010, and various pieces have been applied already and some others have changed in nature. Would you please post a new patchset, containing rebased patches that still need application, in a new email thread to be linked in the commitfest entry? I hope Thunderbird did the right thing and didn't include the reference message ID when I told it to edit as new. So supposedly this is a new thread but with all the cc: addresses kept. I have rebased all remaining patches and kept the numbering. All the patches from 18 to 25 that were previously in the ECPG infrastructure CF link are included here. There is no functional change. Because of the recent bugfixes in the ECPG area, the patchset didn't apply cleanly anymore. Rebased with no functional change. The patches themselves are a little bigger though. It's because the patches are generated with 10 lines of context, this was set in my $HOME/.gitconfig: [diff] context = 10 Best regards, Zoltán Böszörményi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead, was: Re: ECPG fixes
2014-01-16 23:42 keltezéssel, Alvaro Herrera írta: Boszormenyi Zoltan escribió: Rebased patches after the regression test and other details were fixed in the infrastructure part. This thread started in 2010, and various pieces have been applied already and some others have changed in nature. Would you please post a new patchset, containing rebased patches that still need application, in a new email thread to be linked in the commitfest entry? I will do that. Best regards, Zoltán Böszörményi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead, was: Re: ECPG fixes
Boszormenyi Zoltan escribió: Rebased patches after the regression test and other details were fixed in the infrastructure part. This thread started in 2010, and various pieces have been applied already and some others have changed in nature. Would you please post a new patchset, containing rebased patches that still need application, in a new email thread to be linked in the commitfest entry? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead, was: Re: ECPG fixes
2013-12-21 14:56 keltezéssel, Peter Eisentraut írta: This patch didn't make it out of the 2013-11 commit fest. You should move it to the next commit fest (probably with an updated patch) before January 15th. Done. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead, was: Re: ECPG fixes
This patch didn't make it out of the 2013-11 commit fest. You should move it to the next commit fest (probably with an updated patch) before January 15th. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
ECPG fixes, was: Re: [HACKERS] ECPG FETCH readahead
2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: 2013-11-12 07:01 keltezéssel, Noah Misch írta: On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: The old contents of my GIT repository was removed so you need to clone it fresh. https://github.com/zboszor/ecpg-readahead.git I won't post the humongous patch again, since sending a 90KB compressed file to everyone on the list is rude. Patches of that weight show up on a regular basis. I don't think it's rude. OK, here it is. I have rebased the patchset after ecpg: Split off mmfatal() from mmerror() since it caused merge conflicts. It's at the usual place again, you need to clone it from scratch if you are interested in looking at git diff/log I have removed some previous ecpg_log() debug output and the total patch size is not so huge any more but I am going to post the split-up set in parts. Attached is the first few patches that are strictly generic ECPG fixes. They can be applied independently and obvious enough. Subsequent patches will come as reply to this email. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ commit f167aaa9693305e08cd6b2946af8528dada799b4 Author: Böszörményi Zoltán z...@cybertec.at Date: Wed Nov 20 10:31:21 2013 +0100 ECPG: Make the preprocessor emit ';' if the variable type for a list of variables is varchar. This fixes this test case: int main(void) { exec sql begin declare section; varchar a[50], b[50]; exec sql end declare section; return 0; } Since varchars are internally turned into custom structs and the type name is emitted for these variable declarations, the preprocessed code previously had: struct varchar_1 { ... } a _,_ struct varchar_2 { ... } b ; The comma in the generated C file was a syntax error. There are no regression test changes since it's not exercised. diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer b/src/interfaces/ecpg/preproc/ecpg.trailer index 342b7bc..fd35dfc 100644 --- a/src/interfaces/ecpg/preproc/ecpg.trailer +++ b/src/interfaces/ecpg/preproc/ecpg.trailer @@ -837,7 +837,12 @@ opt_signed: SQL_SIGNED variable_list: variable { $$ = $1; } | variable_list ',' variable - { $$ = cat_str(3, $1, mm_strdup(,), $3); } + { + if (actual_type[struct_level].type_enum == ECPGt_varchar) +$$ = cat_str(3, $1, mm_strdup(;), $3); + else +$$ = cat_str(3, $1, mm_strdup(,), $3); + } ; variable: opt_pointer ECPGColLabel opt_array_bounds opt_bit_field opt_initializer commit b6d57b3fa709757769eb27083d7231602f2d806c Author: Böszörményi Zoltán z...@cybertec.at Date: Wed Nov 20 10:33:40 2013 +0100 ECPG: Free the malloc()'ed variables in the test so it comes out clean on Valgrind runs. diff --git a/src/interfaces/ecpg/test/expected/preproc-outofscope.c b/src/interfaces/ecpg/test/expected/preproc-outofscope.c index 125d7d8..2438911 100644 --- a/src/interfaces/ecpg/test/expected/preproc-outofscope.c +++ b/src/interfaces/ecpg/test/expected/preproc-outofscope.c @@ -347,28 +347,31 @@ if (sqlca.sqlcode 0) exit (1);} close_cur1(); + free(myvar); + free(mynullvar); + strcpy(msg, drop); { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, drop table a1, ECPGt_EOIT, ECPGt_EORT); -#line 115 outofscope.pgc +#line 118 outofscope.pgc if (sqlca.sqlcode 0) exit (1);} -#line 115 outofscope.pgc +#line 118 outofscope.pgc strcpy(msg, commit); { ECPGtrans(__LINE__, NULL, commit); -#line 118 outofscope.pgc +#line 121 outofscope.pgc if (sqlca.sqlcode 0) exit (1);} -#line 118 outofscope.pgc +#line 121 outofscope.pgc strcpy(msg, disconnect); { ECPGdisconnect(__LINE__, CURRENT); -#line 121 outofscope.pgc +#line 124 outofscope.pgc if (sqlca.sqlcode 0) exit (1);} -#line 121 outofscope.pgc +#line 124 outofscope.pgc return (0); diff --git a/src/interfaces/ecpg/test/expected/preproc-outofscope.stderr b/src/interfaces/ecpg/test/expected/preproc-outofscope.stderr index 91d3505..c7f8771 100644 --- a/src/interfaces/ecpg/test/expected/preproc-outofscope.stderr +++ b/src/interfaces/ecpg/test/expected/preproc-outofscope.stderr @@ -102,13 +102,13 @@ [NO_PID]: sqlca: code: 0, state: 0 [NO_PID]: ecpg_execute on line 58: OK: CLOSE CURSOR [NO_PID]: sqlca: code: 0, state: 0 -[NO_PID]: ecpg_execute on line 115: query: drop table a1; with 0 parameter(s) on connection regress1 +[NO_PID]: ecpg_execute on line 118: query: drop table a1; with 0 parameter(s) on connection regress1 [NO_PID]: sqlca: code: 0, state: 0 -[NO_PID]: ecpg_execute on line 115: using PQexec +[NO_PID]: ecpg_execute on line 118: using PQexec [NO_PID]: sqlca: code: 0, state: 0 -[NO_PID]: ecpg_execute on line 115: OK: DROP TABLE +[NO_PID]: ecpg_execute on line
Re: [HACKERS] ECPG FETCH readahead
2013-10-11 00:16 keltezéssel, Alvaro Herrera írta: Boszormenyi Zoltan escribió: 2013-09-10 03:04 keltezéssel, Peter Eisentraut írta: You need to update the dblink regression tests. Done. Dude, this is an humongous patch. I was shocked by it initially, but on further reading, I observed that it's only a huge patch which also does some mechanical changes to test output. I think it'd be better to split the part that's responsible for the changed lines in test output mentioning ecpg_process_output. That should be a reasonably small patch which changes ecpg_execute slightly and adds the new function, is followed by the enormous resulting mechanical changes in test output. It should be possible to commit that relatively quickly. Then there's the rest of the patch, which would adds a huge pile of new code. I think there are some very minor changes to backend code as well -- would it make sense to post that as a separate piece? I had to rebase the patch against current (today morning's) GIT, since there were a few changes against ECPG in the meantime. The old contents of my GIT repository was removed so you need to clone it fresh. https://github.com/zboszor/ecpg-readahead.git I won't post the humongous patch again, since sending a 90KB compressed file to everyone on the list is rude. You can pull the commits individually from the above repository. For the same reason, I won't add the DECLARE CURSOR command tag change separately since this is also part of this big feature. I have reordered some patches, like some independent bug fixes against the ECPG parser and regression tests. The backend change is also added early. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: The old contents of my GIT repository was removed so you need to clone it fresh. https://github.com/zboszor/ecpg-readahead.git I won't post the humongous patch again, since sending a 90KB compressed file to everyone on the list is rude. Patches of that weight show up on a regular basis. I don't think it's rude. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2013-10-11 00:16 keltezéssel, Alvaro Herrera írta: Boszormenyi Zoltan escribió: 2013-09-10 03:04 keltezéssel, Peter Eisentraut írta: You need to update the dblink regression tests. Done. Dude, this is an humongous patch. You *know* that the patch is available in pieces at https://github.com/zboszor/ecpg-readahead right? You must have read the new initial announcement at http://archives.postgresql.org/message-id/520f4b90.2010...@cybertec.at You can review and merge them one by one. The transformation of ecpg_execute() and friends starts at 2d04ff83984c63c34e55175317e3e431eb58e00c I was shocked by it initially, but on further reading, I observed that it's only a huge patch which also does some mechanical changes to test output. I think it'd be better to split the part that's responsible for the changed lines in test output mentioning ecpg_process_output. It's 1e0747576e96aae3ec8c60c46baea130aaf8916e in the above repository. Also, another huge regression test patch is where the cursor readahead is enabled unconditionally: 27e43069082b29cb6fa4d3414e6484ec7fb80cbe That should be a reasonably small patch which changes ecpg_execute slightly and adds the new function, is followed by the enormous resulting mechanical changes in test output. It should be possible to commit that relatively quickly. Then there's the rest of the patch, which would adds a huge pile of new code. I think there are some very minor changes to backend code as well -- would it make sense to post that as a separate piece? It's 2ad207e6371a33d6a985c76ac066dd51ed5681cb Also, cd881f0363b1aff1508cfa347e8df6b4981f0ee7 to fix the dblink regression test. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
Boszormenyi Zoltan escribió: 2013-09-10 03:04 keltezéssel, Peter Eisentraut írta: You need to update the dblink regression tests. Done. Dude, this is an humongous patch. I was shocked by it initially, but on further reading, I observed that it's only a huge patch which also does some mechanical changes to test output. I think it'd be better to split the part that's responsible for the changed lines in test output mentioning ecpg_process_output. That should be a reasonably small patch which changes ecpg_execute slightly and adds the new function, is followed by the enormous resulting mechanical changes in test output. It should be possible to commit that relatively quickly. Then there's the rest of the patch, which would adds a huge pile of new code. I think there are some very minor changes to backend code as well -- would it make sense to post that as a separate piece? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Proposal: UPDATE/DELETE ... WHERE OFFSET n OF cursor_name, was: Re: New ECPG idea, was: Re: [HACKERS] ECPG FETCH readahead
Hi, 2013-08-17 13:02 keltezéssel, Boszormenyi Zoltan írta: [snip, discussion of WHERE CURRENT OF in the ECPG client lib] I had a second thought about it and the client side caching and parser behind the application's back seems to be an overkill. Instead, I propose a different solution, which is a logical extension of FETCH { FORWARD | BACKWARD } N, which is a PostgreSQL extension. The proposed solution would be: UPDATE / DELETE ... WHERE OFFSET SignedIconst OF cursor_name I imagine that FETCH would keep the array of TIDs/ItemPointerDatas of the last FETCH statement. The argument to OFFSET would be mostly in negative terms, with 0 being equivalent of WHERE CURRENT OF. E.g.: FETCH 2 FROM mycur; -- fetches two rows UPDATE mytab SET ... WHERE OFFSET -1 OF mycur; -- updates the first row UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- updates current row or FETCH 3 FROM mycur; -- fetches two rows, reaches end of the cursor UPDATE mytab SET ... WHERE OFFSET -2 OF mycur; -- updates the first row UPDATE mytab SET ... WHERE OFFSET -1 OF mycur; -- updates the second row UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- throws an error like WHERE CURRENT OF or FETCH 3 FROM mycur; -- fetches two rows, reaches end of the cursor MOVE BACKWARD 2 IN mycur; UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- updates the first row (now current) UPDATE mytab SET ... WHERE OFFSET 1 OF mycur; -- updates the second row The cached array can be kept valid until the next FETCH statement, even if moves out of the interval of the array, except in case the application changes the sign of the cursor position, e.g. previously it used MOVE ABSOLUTE with positive numbers and suddenly it switches to backward scanning with MOVE ABSOLUTE negative number or vice-versa. This would solve the only source of slowdown in the client side cursor caching in ECPG present in my current ECPG cursor readahead patch, there would be no more MOVE + UPDATE/DELETE WHERE CURRENT OF. On the other hand, exploiting this proposed feature in ECPG would make it incompatible with older servers unless it detects the server version it connects to and uses the current method. Comments? Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
You need to update the dblink regression tests. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Wed, 2013-09-04 at 10:06 +0200, Boszormenyi Zoltan wrote: 2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta: I have put the broken up patchset into a GIT tree of mine at GitHub: https://github.com/zboszor/ecpg-readahead/ but the huge compressed patch is also attached for reference. I merged current PG GIT HEAD in the above tree and fixed a merge conflict caused by commit 673b527534893a4a8adb3cdef52fc645c13598ce The huge patch is attached for reference. The documentation doesn't build: openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was specified openjade:ecpg.sgml:477:40: start tag was here openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was specified openjade:ecpg.sgml:477:20: start tag was here openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was specified openjade:ecpg.sgml:473:81: start tag was here openjade:ecpg.sgml:478:8:E: end tag for LITERAL omitted, but OMITTAG NO was specified openjade:ecpg.sgml:473:56: start tag was here -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
New ECPG idea, was: Re: [HACKERS] ECPG FETCH readahead
2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta: Hi, I am restarting this old thread... :-) 2012-04-24 10:17 keltezéssel, Michael Meskes írta: OK, I will implement #2. Another question popped up: what to do with FETCH ALL? The current readahead window size or temporarily bumping it to say some tens of thousands can be used. We may not know how much is the all records. This, although lowers performance, saves memory. I would say doing a large fetch in two or three batches won't cost too much in terms of performance. Please, don't apply this patch yet. I discovered a rather big hole that can confuse the cursor position tracking if you do this: ... That will also need a new round of review. Sorry for that. No problem, better to find it now instead of after release. Anyway, I moved the patch to 2012-next (I hope I did it correctly) so 2012-1 can be closed. Let's try to get this patch done in the next commit fest. Michael I had time to look into this patch of mine again after about 1.5 years. Frankly, this time was too long to remember every detail of the patch and looking at parts of the patch as a big entity was confusing. So I started fresh and to make review easier, I broke the patch up into small pieces that all build on each other. I have also fixed quite a few bugs, mostly in my code, but some in the ECPG parser and the regression tests as well. I have put the broken up patchset into a GIT tree of mine at GitHub: https://github.com/zboszor/ecpg-readahead/ but the huge compressed patch is also attached for reference. It was generated with $ git diff 221e92f64c6e136e550ec2592aac3ae0d4623209..870922676e6ae0faa4ebbf94b92e0b97ec418e16 ECPG regression tests are now Valgrind-clean except two of them but both are pre-existing bugs. 1. ecpg/test/compat_informix/rfmtlong.pgc points out a problem in ecpg/compatlib/informix.c ==5036== 1 errors in context 1 of 4: ==5036== Invalid read of size 4 ==5036==at 0x4E3453C: rfmtlong (informix.c:941) ==5036==by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) ==5036==by 0x4006BE: main (rfmtlong.pgc:45) ==5036== Address 0x60677d8 is 24 bytes inside a block of size 25 alloc'd ==5036==at 0x4C28409: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==5036==by 0x4E34268: rfmtlong (informix.c:783) ==5036==by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) ==5036==by 0x4006BE: main (rfmtlong.pgc:45) The same error is reported 4 times. 2. ecpg_add_mem() seems to leak memory: ==5463== 256 bytes in 16 blocks are definitely lost in loss record 1 of 1 ==5463==at 0x4C2A121: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==5463==by 0x4E3E153: ecpg_alloc (memory.c:21) ==5463==by 0x4E3E212: ecpg_add_mem (memory.c:110) ==5463==by 0x4E3542B: ecpg_store_result (execute.c:409) ==5463==by 0x4E37E5A: ecpg_process_output (execute.c:1777) ==5463==by 0x4E38CCA: ecpg_do (execute.c:2137) ==5463==by 0x4E38D8A: ECPGdo (execute.c:2159) ==5463==by 0x400A82: fn (alloc.pgc:51) ==5463==by 0x5152C52: start_thread (pthread_create.c:308) ==5463==by 0x545C13C: clone (clone.S:113) The last two issue we talked about in this thread are also implemented: - permanently raise the readahead window if the application sends a bigger FETCH command, and - temporarily raise the readahead window for FETCH ALL commands The cursor position tracking was completely rewritten, so the client side properly follows the cursor position known by the backend and doesn't skip MOVE statements where it shouldn't. The previously known bug is completely eliminated this way. Please, review that patch. I have another idea to make ECPG building on this huge patch. Currently, UPDATE/DELETE WHERE CURRENT OF has to issue a MOVE before the command in case the cursor positions known by the application and the backend are different. My idea builds on the fact that UPDATE/DELETE RETURNING is present in all supported back branches. A mini-parser only understanding SELECT, UPDATE and DELETE should be added to ecpglib, so DECLARE cursor CURSOR FOR SELECT ... and PREPARE prepared_stmt FROM :query; DECLARE cursor CURSOR FOR prepared_stmt; can be analyzed and tweaked behind the application's back. This is needed to detect whether a query is a simple updatable scan of a table, and returning errors early to the application if it's not, without actually sending the UPDATE/DELETE WHERE CURRENT OF query to the backend. For the purpose of WHERE CURRENT OF, I would add a ctid column at the end of the targelist that is treated like resjunk in the backend when returning data to the application. So, SELECTs would return the ctid information of the tuples. The cursor query was a FETCH N with abs(N)1 because of the readahead. For this reason, the cursor positions known by the application and the backend are different. The extra MOVE can be eliminated by replacing UPDATE table SET ... WHERE CURRENT
Re: [HACKERS] ECPG FETCH readahead
OK, I will implement #2. Another question popped up: what to do with FETCH ALL? The current readahead window size or temporarily bumping it to say some tens of thousands can be used. We may not know how much is the all records. This, although lowers performance, saves memory. I would say doing a large fetch in two or three batches won't cost too much in terms of performance. Please, don't apply this patch yet. I discovered a rather big hole that can confuse the cursor position tracking if you do this: ... That will also need a new round of review. Sorry for that. No problem, better to find it now instead of after release. Anyway, I moved the patch to 2012-next (I hope I did it correctly) so 2012-1 can be closed. Let's try to get this patch done in the next commit fest. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
Hi, 2012-04-17 06:48 keltezéssel, Michael Meskes írta: On Tue, Apr 17, 2012 at 06:02:34AM +0200, Boszormenyi Zoltan wrote: I listed two scenarios. 1. occasional bump of the readahead window for large requests, for smaller requests it uses the originally set size 2. permanent bump of the readahead window for large requests (larger than previously seen), all subsequent requests use the new size Both can be implemented easily, which one do you prefer? If you always use very large requests, 1) behaves like 2) I'd say let's go for #2. #1 is probably more efficient but not what the programmer asked us to do. After all it's easy to increase the window size accordingly if you want so as a programmer. Michael OK, I will implement #2. Another question popped up: what to do with FETCH ALL? The current readahead window size or temporarily bumping it to say some tens of thousands can be used. We may not know how much is the all records. This, although lowers performance, saves memory. Please, don't apply this patch yet. I discovered a rather big hole that can confuse the cursor position tracking if you do this: DECLARE mycur; MOVE ABSOLUTE n IN mycur; MOVE BACKWARD m IN mycur; If (n+m) is greater, but (n-m) is smaller than the number of rows in the cursor, the backend's and the caching code's ideas about where the cursor is will differ. I need to fix this before it can be applied. That will also need a new round of review. Sorry for that. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Mon, Apr 16, 2012 at 06:24:57AM +0200, Boszormenyi Zoltan wrote: Yes, just like when the readahead window set to 256, FETCH 1024 will iterate through 4 windows or FETCH 64 iterates through the same window 4 times. This is the idea behind the readahead window. Really? It's definitely not the idea behind FETCH 1024. Using the same window 4 times for FETCH 64 is the idea though, I agree. How about allowing the readahead window to be resized for the non-decorated case if the runtime encounters FETCH N and N is greater than the previous window? To be resized by what? IMO a FETCH N should always be a FETCH N, no matter what, i.e. if the readahead window is larger, use it, but even if it's smaller we should still fetch N at the same time. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-04-16 18:04 keltezéssel, Michael Meskes írta: On Mon, Apr 16, 2012 at 06:24:57AM +0200, Boszormenyi Zoltan wrote: Yes, just like when the readahead window set to 256, FETCH 1024 will iterate through 4 windows or FETCH 64 iterates through the same window 4 times. This is the idea behind the readahead window. Really? It's definitely not the idea behind FETCH 1024. Using the same window 4 times for FETCH 64 is the idea though, I agree. OK. I would like to stretch your agreement a little. :-) Can we state that caching means that if the cache should serve the incoming request(s) until the request spills out of it? If your answer to the above is yes, then please consider this case: - readahead window is 256 (via ECPGFETCHSZ) - FETCH 64 was executed twice, so you are in the middle of the cache - FETCH 1024 is requested So, if I understand you correctly, you expect this scenario: - set a one-time readahead window size ( N - # of rows that can be served = 1024 - 128 = 768) so the next FETCH by the runtime will fullfill this request fully - serve the request's first 128 rows from the current cache - for the 129th row, FETCH 768 will be executed - all subsequent requests use the old readahead size How about allowing the readahead window to be resized for the non-decorated case if the runtime encounters FETCH N and N is greater than the previous window? To be resized by what? By the new FETCH request. Instead of the above, I imagined this: - the runtime notices that the new request is larger than the current readahead window size, modifies the readahead window size upwards, so the next FETCH will use it - serve the request's first 128 rows from the current cache - for the 129th row, FETCH 1024 will be executed and the remaining 768 rows will be served from the new cache - all subsequent requests use the new readahead size, 1024 IMO a FETCH N should always be a FETCH N, no matter what This part of your statement contradicts with caching. :-) , i.e. if the readahead window is larger, use it, but even if it's smaller we should still fetch N at the same time. So, there can be occasional one-time larger requests but smaller ones should apply the set window size, right? Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Mon, Apr 16, 2012 at 07:18:07PM +0200, Boszormenyi Zoltan wrote: OK. I would like to stretch your agreement a little. :-) ... Yeah, you got a point here. By the new FETCH request. Instead of the above, I imagined this: - the runtime notices that the new request is larger than the current readahead window size, modifies the readahead window size upwards, so the next FETCH will use it - serve the request's first 128 rows from the current cache - for the 129th row, FETCH 1024 will be executed and the remaining 768 rows will be served from the new cache That means window size goes up to 1024-128 for that one case? - all subsequent requests use the new readahead size, 1024 Sounds reasonable to me. So, there can be occasional one-time larger requests but smaller ones should apply the set window size, right? Yes. I do agree that FETCH N cannot fetch N all the time, but please make it work like what you suggested to make sure people don't have to recompile. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-04-17 05:52 keltezéssel, Michael Meskes írta: On Mon, Apr 16, 2012 at 07:18:07PM +0200, Boszormenyi Zoltan wrote: OK. I would like to stretch your agreement a little. :-) ... Yeah, you got a point here. By the new FETCH request. Instead of the above, I imagined this: - the runtime notices that the new request is larger than the current readahead window size, modifies the readahead window size upwards, so the next FETCH will use it - serve the request's first 128 rows from the current cache - for the 129th row, FETCH 1024 will be executed and the remaining 768 rows will be served from the new cache That means window size goes up to 1024-128 for that one case? I listed two scenarios. 1. occasional bump of the readahead window for large requests, for smaller requests it uses the originally set size 2. permanent bump of the readahead window for large requests (larger than previously seen), all subsequent requests use the new size Both can be implemented easily, which one do you prefer? If you always use very large requests, 1) behaves like 2) - all subsequent requests use the new readahead size, 1024 Sounds reasonable to me. So, there can be occasional one-time larger requests but smaller ones should apply the set window size, right? Yes. I do agree that FETCH N cannot fetch N all the time, but please make it work like what you suggested to make sure people don't have to recompile. Michael -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Tue, Apr 17, 2012 at 06:02:34AM +0200, Boszormenyi Zoltan wrote: I listed two scenarios. 1. occasional bump of the readahead window for large requests, for smaller requests it uses the originally set size 2. permanent bump of the readahead window for large requests (larger than previously seen), all subsequent requests use the new size Both can be implemented easily, which one do you prefer? If you always use very large requests, 1) behaves like 2) I'd say let's go for #2. #1 is probably more efficient but not what the programmer asked us to do. After all it's easy to increase the window size accordingly if you want so as a programmer. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Tue, Apr 10, 2012 at 07:56:35PM +0200, Boszormenyi Zoltan wrote: With the above, it would be possible to use a comma separated list of -r suboptions, e.g. -r prepare,questionmarks,readahead=16 in one option. Yes, that sounds like a good plan. But of course it's outside the scope of this patch, so we can add this later on. - Also added a note to the documentation about a possible performance trap if a previously written ECPG application uses its own custom readahead via multi-row FETCH statements. I didn't know that before you send this patch. Noah, did you? Frankly, I don't like this at all. If I got it right that means a FETCH N is essantially computed as N times FETCH 1 unless you either add a non-standard option to the DECLARE statement or you add a command-line option to ecpg. Did I get that right? If so we would deliberately make ecpglib work incorrectly and remove performance. Why is that? I'm interested in what others think, but to me that sounds like a show-stopper. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-04-16 04:46 keltezéssel, Michael Meskes írta: On Tue, Apr 10, 2012 at 07:56:35PM +0200, Boszormenyi Zoltan wrote: With the above, it would be possible to use a comma separated list of -r suboptions, e.g. -r prepare,questionmarks,readahead=16 in one option. Yes, that sounds like a good plan. But of course it's outside the scope of this patch, so we can add this later on. - Also added a note to the documentation about a possible performance trap if a previously written ECPG application uses its own custom readahead via multi-row FETCH statements. I didn't know that before you send this patch. Noah, did you? Frankly, I don't like this at all. If I got it right that means a FETCH N is essantially computed as N times FETCH 1 unless you either add a non-standard option to the DECLARE statement or you add a command-line option to ecpg. Did I get that right? Yes, just like when the readahead window set to 256, FETCH 1024 will iterate through 4 windows or FETCH 64 iterates through the same window 4 times. This is the idea behind the readahead window. If so we would deliberately make ecpglib work incorrectly and remove performance. Why is that? I'm interested in what others think, but to me that sounds like a show-stopper. How about allowing the readahead window to be resized for the non-decorated case if the runtime encounters FETCH N and N is greater than the previous window? Michael -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-04-08 19:38 keltezéssel, Michael Meskes írta: On Sun, Apr 08, 2012 at 06:35:33PM +0200, Boszormenyi Zoltan wrote: Do you want me to change this or will you do it? I am on holiday and will be back to work on wednesday. I don't think waiting till later this week is a real problem. OK. The possibility to test different readahead window sizes without modifying the source and recompiling was useful. Sure, but you can still do that when not defining a fixed number in the statement. OK. The -R option simply provides a default without ornamenting the DECLARE statement. Could you please incorporate these changes, too, when you're back from vacation? Sure. So, it's established that a specified READAHEAD N should not be overridden. Even an explicit READAHEAD 1. Only a non-decorated cursor can be overridden, even if a different default readahead window size is specified with e.g. ecpg -R 8. If ECPGFETCHSZ is not present, 8 will be used, if ECPGFETCHSZ is present, its value will be used. ECPGopen() will need an extra bool argument to distinguish this. Is this acceptable? Noah, Michael? I cannot find a test that tests the environment variable giving the fetch size. Could you please point me to that? I didn't write such a test. The reason is that while variables are exported by make from the Makefile to the binaries run by make e.g. CFLAGS et.al. for $(CC), make check simply runs pg_regress once which uses its own configuration file that doesn't have a way to set or unset an environment variable. This could be a useful extension to pg_regress though. How about calling setenv() from the test program itself? Sure, I didn't think about it. It should be done before the first EXEC SQL OPEN cursor. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Tue, Apr 10, 2012 at 09:35:21AM +0200, Boszormenyi Zoltan wrote: So, it's established that a specified READAHEAD N should not be overridden. Even an explicit READAHEAD 1. Only a non-decorated cursor can be overridden, even if a different default readahead window size is specified with e.g. ecpg -R 8. If ECPGFETCHSZ is not present, 8 will be used, if ECPGFETCHSZ is present, its value will be used. ECPGopen() will need an extra bool argument to distinguish this. Is this acceptable? Noah, Michael? Sounds perfect. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Tue, Apr 10, 2012 at 10:37:22AM -0400, Noah Misch wrote: Only a non-decorated cursor can be overridden, even if a different default readahead window size is specified with e.g. ecpg -R 8. If ECPGFETCHSZ is not present, 8 will be used, if ECPGFETCHSZ is present, its value will be used. ECPGopen() will need an extra bool argument to distinguish this. Is this acceptable? Noah, Michael? Sounds perfect. Fine by me. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-04-10 16:55 keltezéssel, Michael Meskes írta: On Tue, Apr 10, 2012 at 10:37:22AM -0400, Noah Misch wrote: Only a non-decorated cursor can be overridden, even if a different default readahead window size is specified with e.g. ecpg -R 8. If ECPGFETCHSZ is not present, 8 will be used, if ECPGFETCHSZ is present, its value will be used. ECPGopen() will need an extra bool argument to distinguish this. Is this acceptable? Noah, Michael? Sounds perfect. Fine by me. Michael OK. Next question: now that both patches are intended to be applied, should I send a unified single patch that contains the previous functionality and the required fixes or a new one that only contains the last required fixes? Thanks in advance, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Tue, Apr 10, 2012 at 05:24:55PM +0200, Boszormenyi Zoltan wrote: OK. Next question: now that both patches are intended to be applied, should I send a unified single patch that contains the previous functionality and the required fixes or a new one that only contains the last required fixes? I'm fine with whatever is easier for you. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-04-10 17:34 keltezéssel, Michael Meskes írta: On Tue, Apr 10, 2012 at 05:24:55PM +0200, Boszormenyi Zoltan wrote: OK. Next question: now that both patches are intended to be applied, should I send a unified single patch that contains the previous functionality and the required fixes or a new one that only contains the last required fixes? I'm fine with whatever is easier for you. Michael I guess the second option is easier for all of us because reviewing it doesn't invalidate the previous ones. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
Hi, 2012-04-10 16:55 keltezéssel, Michael Meskes írta: On Tue, Apr 10, 2012 at 10:37:22AM -0400, Noah Misch wrote: Only a non-decorated cursor can be overridden, even if a different default readahead window size is specified with e.g. ecpg -R 8. If ECPGFETCHSZ is not present, 8 will be used, if ECPGFETCHSZ is present, its value will be used. ECPGopen() will need an extra bool argument to distinguish this. Is this acceptable? Noah, Michael? Sounds perfect. Fine by me. Michael you commented on two new options were added and they should be suboptions to -r. I looked at man getopt_long to see what I can do about the -R option and there seems to be a getsubopt() call which is an extension to getopt_long. My manpage under Fedora 16 says this: NAME getsubopt - parse suboption arguments from a string SYNOPSIS #include stdlib.h int getsubopt(char **optionp, char * const *tokens, char **valuep); Feature Test Macro Requirements for glibc (see feature_test_macros(7)): getsubopt(): _XOPEN_SOURCE = 500 || _XOPEN_SOURCE _XOPEN_SOURCE_EXTENDED || /* Since glibc 2.12: */ _POSIX_C_SOURCE = 200809L I wonder whether the manual parsing of -r suboptions may be rewritten using this function or PostgreSQL supports systems without the above X/Open or POSIX support levels. Anyway, to make it possible to rewrite using the above call, I modified -R and it's now -r readahead=number. Documentation is adjusted. With the above, it would be possible to use a comma separated list of -r suboptions, e.g. -r prepare,questionmarks,readahead=16 in one option. Summary of other changes: - The result set size detection is a suboption of -r, documentation is adjusted. - Only undecorated cursors use ECPGFETCHSZ, documentation is adjusted - ecpg --help says ...default 0 (disabled)... fixed. - Comment in cursor-readahead.pgc is fixed. - New regression test that exercises ECPGFETCHSZ=8 and a non-readahead cursor. The stderr file shows the fetch forward 8 executed by the runtime. - Also added a note to the documentation about a possible performance trap if a previously written ECPG application uses its own custom readahead via multi-row FETCH statements. This patch should be applied over the two patches I last sent. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ ecpg-cursor-readahead-fixes-v3.patch.gz Description: Unix tar archive -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sat, Apr 07, 2012 at 11:50:42AM -0400, Noah Misch wrote: Both. The second patch appeared after my first review, based on a comment in that review. I looked at it during my re-review before marking the overall project Ready for Committer. Thanks. I do call your attention to a question I raised in my second review: if a program contains DECLARE foo READAHEAD 5 CURSOR FOR ... and the user runs the program with ECPGFETCHSZ=10 in the environment, should that cursor use a readahead window of 5 or of 10? Original commentary: http://archives.postgresql.org/message-id/20120329004323.ga17...@tornado.leadboat.com I'd say it should be 5. I don't like an environment variable overwriting a hard-coded setting. I think this is what you, Noah, thought, too, right? I'd say let's change this. Is it possible to allow just READAHEAD without a number? In that case I would accept the environment variable. And some comments mostly directed at Zoltan: ecpg --help says ...default 0 (disabled)..., but options less than 1 are not accepted and the default setting of 1 has a comment Disabled by default. I guess this needs to be adjusted. Is there a reason why two new options for ecpg were invented? Normally ecpg options define how the preprocessor works but not the resulting binary. Well, different preprocessor behaviour might result in different binary behaviour of course. The only option that only effects the resulting binary is -r for runtime. Again, this is not completely true as the option has to make its way into the binary, but that's it. Now I wonder whether it would make more sense to add the two options as runtime options instead. The --detect-cursor-resultset-size option should work there without a problem. I haven't delved into the source code enough to find out if -R changes something in the compiler stage. The test case cursor-readahead.pgc has a comment saying test automatic prepare for all statements. Copy/Paste error? I cannot find a test that tests the environment variable giving the fetch size. Could you please point me to that? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-04-08 16:25 keltezéssel, Michael Meskes írta: On Sat, Apr 07, 2012 at 11:50:42AM -0400, Noah Misch wrote: Both. The second patch appeared after my first review, based on a comment in that review. I looked at it during my re-review before marking the overall project Ready for Committer. Thanks. I do call your attention to a question I raised in my second review: if a program contains DECLARE foo READAHEAD 5 CURSOR FOR ... and the user runs the program with ECPGFETCHSZ=10 in the environment, should that cursor use a readahead window of 5 or of 10? Original commentary: http://archives.postgresql.org/message-id/20120329004323.ga17...@tornado.leadboat.com I'd say it should be 5. I don't like an environment variable overwriting a hard-coded setting. I think this is what you, Noah, thought, too, right? I'd say let's change this. Do you want me to change this or will you do it? I am on holiday and will be back to work on wednesday. The possibility to test different readahead window sizes without modifying the source and recompiling was useful. Is it possible to allow just READAHEAD without a number? In that case I would accept the environment variable. After the 2nd patch is applied, this is exactly the case. The cursors are driven and accounted using the new functions but with the readahead window being a single row as the default value for fetch_readahead is 1. And some comments mostly directed at Zoltan: ecpg --help says ...default 0 (disabled)..., but options less than 1 are not accepted and the default setting of 1 has a comment Disabled by default. I guess this needs to be adjusted. Yes, the help text was not changed in the 2nd patch, I missed that. Is there a reason why two new options for ecpg were invented? Normally ecpg options define how the preprocessor works but not the resulting binary. The -R option simply provides a default without ornamenting the DECLARE statement. Well, different preprocessor behaviour might result in different binary behaviour of course. The only option that only effects the resulting binary is -r for runtime. Again, this is not completely true as the option has to make its way into the binary, but that's it. Now I wonder whether it would make more sense to add the two options as runtime options instead. The --detect-cursor-resultset-size option should work there without a problem. You are right. This can be a suboption to -r. I haven't delved into the source code enough to find out if -R changes something in the compiler stage. -R works just like -r in the sense that a value gets passed to the runtime. -R simply changes the default value that gets passed if no READAHEAD N clause is specified for a cursor. This is true only if you intend to apply both paches. Without the 2nd patch and fetch_readahead=0 (no -R option given) or NO READAHEAD is specified for a cursor, the compiler makes a distinction between uncachable cursors driven by ECPGdo() and cachable cursors driven by the new runtime functions. With the 2nd patch applied, this distinction is no more. The test case cursor-readahead.pgc has a comment saying test automatic prepare for all statements. Copy/Paste error? It must be, yes. I cannot find a test that tests the environment variable giving the fetch size. Could you please point me to that? I didn't write such a test. The reason is that while variables are exported by make from the Makefile to the binaries run by make e.g. CFLAGS et.al. for $(CC), make check simply runs pg_regress once which uses its own configuration file that doesn't have a way to set or unset an environment variable. This could be a useful extension to pg_regress though. Best regards, Zoltán Böszörményi Michael -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sun, Apr 08, 2012 at 06:35:33PM +0200, Boszormenyi Zoltan wrote: Do you want me to change this or will you do it? I am on holiday and will be back to work on wednesday. I don't think waiting till later this week is a real problem. The possibility to test different readahead window sizes without modifying the source and recompiling was useful. Sure, but you can still do that when not defining a fixed number in the statement. The -R option simply provides a default without ornamenting the DECLARE statement. Could you please incorporate these changes, too, when you're back from vacation? I cannot find a test that tests the environment variable giving the fetch size. Could you please point me to that? I didn't write such a test. The reason is that while variables are exported by make from the Makefile to the binaries run by make e.g. CFLAGS et.al. for $(CC), make check simply runs pg_regress once which uses its own configuration file that doesn't have a way to set or unset an environment variable. This could be a useful extension to pg_regress though. How about calling setenv() from the test program itself? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sun, Apr 08, 2012 at 04:25:01PM +0200, Michael Meskes wrote: On Sat, Apr 07, 2012 at 11:50:42AM -0400, Noah Misch wrote: I do call your attention to a question I raised in my second review: if a program contains DECLARE foo READAHEAD 5 CURSOR FOR ... and the user runs the program with ECPGFETCHSZ=10 in the environment, should that cursor use a readahead window of 5 or of 10? Original commentary: http://archives.postgresql.org/message-id/20120329004323.ga17...@tornado.leadboat.com I'd say it should be 5. I don't like an environment variable overwriting a hard-coded setting. I think this is what you, Noah, thought, too, right? Yes. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Fri, Mar 30, 2012 at 12:48:07AM +0200, Boszormenyi Zoltan wrote: Attached is the new core feature patch. Summary of changes: ... I also refreshed the second patch that drives all cursors with the new ... I'm slightly confused here. It seems Zoltan added a second patch *after* Noah marked this patch as ready for committer. That second patch seems to apply cleanly after the first one got applied. Now, which one was reviewed and is considered ready for commit? The first one? Or both? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sat, Apr 07, 2012 at 01:20:08PM +0200, Michael Meskes wrote: On Fri, Mar 30, 2012 at 12:48:07AM +0200, Boszormenyi Zoltan wrote: Attached is the new core feature patch. Summary of changes: ... I also refreshed the second patch that drives all cursors with the new ... I'm slightly confused here. It seems Zoltan added a second patch *after* Noah marked this patch as ready for committer. That second patch seems to apply cleanly after the first one got applied. Now, which one was reviewed and is considered ready for commit? The first one? Or both? Both. The second patch appeared after my first review, based on a comment in that review. I looked at it during my re-review before marking the overall project Ready for Committer. I do call your attention to a question I raised in my second review: if a program contains DECLARE foo READAHEAD 5 CURSOR FOR ... and the user runs the program with ECPGFETCHSZ=10 in the environment, should that cursor use a readahead window of 5 or of 10? Original commentary: http://archives.postgresql.org/message-id/20120329004323.ga17...@tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Fri, Mar 30, 2012 at 12:48:07AM +0200, Boszormenyi Zoltan wrote: 2012-03-29 19:03 keltez?ssel, Noah Misch ?rta: one of the new sections about readahead should somehow reference the hazard around volatile functions. Done. I don't see the mention in your latest patch. You do mention it for the sqlerrd[2] compatibility stuff. sqlerrd[2] compatibility stuff? I mentioned it in section ecpg-sqlca, this is the main documentation section, not the compatibility one AFAIK. Anyway, I now reference the volatile function hazard in the first paragraphs added to section ecpg-cursors. This patch adds two features, and those features are independent from a user perspective. The primary feature is cursor readahead, and the secondary feature is ecpg --detect-cursor-resultset-size (the referent of my above sqlerrd[2] compatibility stuff reference). Each feature has independent semantic implications when the application uses cursors on queries that call volatile functions. Under --detect-cursor-resultset-size, we will execute functions for all rows at OPEN time and again for each row at FETCH time. When you declare a cursor with READAHEAD n and do not FETCH it to the end, up to n unFETCHed rows will nonetheless have their functions executed. If the volatile function is something like clock_timestamp(), the application will observe the executions to have happened in clusters of n rather than in step with the application's FETCH calls. Your latest patch revision hints at the semantic implications for ecpg --detect-cursor-resultset-size, but it does not mention them for readahead. Then again, perhaps it's sufficiently obvious to not warrant mention. Without knowing internals, I would not expect users to guess the consequence of ecpg --detect-cursor-resultset-size. With readahead, it may be guessable enough. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-03-29 12:59 keltezéssel, Boszormenyi Zoltan írta: 2012-03-29 02:43 keltezéssel, Noah Misch írta: On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote: +the window size may be modified by setting theliteralECPGFETCHSZ/literal +environment variable to a different value. I had in mind that DECLARE statements adorned with READAHEAD syntax would always behave precisely as written, independent of ECPGFETCHSZ or ecpg -R. Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value passed to ecpg -R if provided, and finally a value of 1 (no readahead) in the absence of both ECPGFETCHSZ and ecpg -R. Did you do it differently for a particular reason? I don't particularly object to what you've implemented, but I'd be curious to hear your thinking. What I had in mind was: NO READAHEAD == READAHEAD 0. This will translate into 1 tuple sized readahead window that cannot be modified by ECPGFETCHSZ. After the core patch, this is not totally true yet. The NO READAHEAD cursors don't go through the new cursor functions at all. But cursor.c is prepared for the second patch that makes all cursors use the caching code. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Thu, Mar 29, 2012 at 12:59:40PM +0200, Boszormenyi Zoltan wrote: 2012-03-29 02:43 keltez?ssel, Noah Misch ?rta: On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote: +toliteralREADAHEAD number/literal. ExplicitliteralREADAHEAD number/literal or +literalNO READAHEAD/literal turns cursor readahead on (withliteralnumber/literal +as the size of the readahead window) or off for the specified cursor, +respectively. For cursors using a non-0 readahead window size is 256 rows, The number 256 is no longer present in your implementation. Indeed. Oversight, that part of the sentence is deleted. +the window size may be modified by setting theliteralECPGFETCHSZ/literal +environment variable to a different value. I had in mind that DECLARE statements adorned with READAHEAD syntax would always behave precisely as written, independent of ECPGFETCHSZ or ecpg -R. Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value passed to ecpg -R if provided, and finally a value of 1 (no readahead) in the absence of both ECPGFETCHSZ and ecpg -R. Did you do it differently for a particular reason? I don't particularly object to what you've implemented, but I'd be curious to hear your thinking. What I had in mind was: NO READAHEAD == READAHEAD 0. This will translate into 1 tuple sized readahead window that cannot be modified by ECPGFETCHSZ. READAHEAD 1 also means uncached by default but ECPGFETCHSZ may modify the readahead window size. To me, it feels too magical to make READAHEAD 1 and READAHEAD 0 differ only in whether ECPGFETCHSZ can override. I'm willing to go with whatever consensus arises on this topic, though. This part is policy that can be debated and modified accordingly, it doesn't affect the internals of the caching code. Agreed. +/para + +para +commandUPDATE/command orcommandDELETE/command with the +literalWHERE CURRENT OF/literal clause, cursor readahead may or may not +actually improve performance, ascommandMOVE/command statement has to be +sent to the backend before the DML statement to ensure correct cursor position +in the backend. +/para This sentence seems to be missing a word near its beginning. Sounds like a corner case to me, but I am not a native english speaker. Which one sounds better: UPDATE or DELETE with the WHERE CURRENT OF clause... or UPDATE or DELETE statements with the WHERE CURRENT OF clause... ? Here is the simplest change to make the original grammatical: Given an UPDATE or DELETE with the WHERE CURRENT OF clause, ... I might write the whole thing this way: To execute an UPDATE or DELETE statement bearing the WHERE CURRENT OF clause, ECPG may first execute an implicit MOVE to synchronize the server cursor position with the local cursor position. This can reduce or eliminate the performance benefit of readahead for affected cursors. one of the new sections about readahead should somehow reference the hazard around volatile functions. Done. I don't see the mention in your latest patch. You do mention it for the sqlerrd[2] compatibility stuff. +varlistentry +termliteralOPEN RETURNS LAST ROW POSITION/literal/term +termliteralOPEN RETURNS 0 FOR LAST ROW POSITION/literal/term +listitem +para + When the cursor is opened, it's possible to discover the size of the result set + usingcommandMOVE ALL/command which traverses the result set and move + the cursor back to the beginning usingcommandMOVE ABSOLUTE 0/command. + The size of the result set is returned in sqlca.sqlerrd[2]. +/para + +para + This slows down opening the cursor from the application point of view + but may also have side effects. If the cursor query contains volatile function + calls with side effects, they will be evaluated twice because of traversing + the result set this way duringcommandOPEN/command. +/para + +para + The default is not to discover. +/para I mildly oppose having syntax to enable this per-cursor. Readahead is a generally handy feature that I might use in new programs, but this feature is more of a hack for compatibility with some old Informix version. For new code, authors should do their own MOVEs instead of using this option. The patch also adds an undocumented ECPGOPENRETURNSRESULTSETSIZE environment variable to control this. I see no use for such a thing, because a program's dependency on sqlerrd[2] is fixed at build time. If a program does not reference sqlerrd[2] for a cursor, there's no benefit from choosing at runtime to populate that value anyway. If a program does reference it, any option needed to have it set correctly had better be set at build time and apply to every run of the final program. IOW, I suggest having only the ecpg-time option to enable this behavior. Thoughts? I wanted to make it available
Re: [HACKERS] ECPG FETCH readahead
On Thu, Mar 29, 2012 at 01:03:41PM -0400, Noah Misch wrote: Still, we're looking at dedicated ECPG syntax, quite visible even to folks with no interest in Informix. We have eschewed littering our syntax with compatibility aids, and I like it that way. IMO, an option to the ecpg preprocessor is an acceptable level of muddle to provide this aid. A DECLARE CURSOR syntax extension goes too far. +1 from me. Let's not add special ecpg sql syntax if we don't absolutely have to. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-03-29 20:34 keltezéssel, Michael Meskes írta: On Thu, Mar 29, 2012 at 01:03:41PM -0400, Noah Misch wrote: Still, we're looking at dedicated ECPG syntax, quite visible even to folks with no interest in Informix. We have eschewed littering our syntax with compatibility aids, and I like it that way. IMO, an option to the ecpg preprocessor is an acceptable level of muddle to provide this aid. A DECLARE CURSOR syntax extension goes too far. +1 from me. Let's not add special ecpg sql syntax if we don't absolutely have to. Michael This is what I waited for, finally another opinion. I will delete this from the grammar. What about the other question about the 0 vs 1 distinction? Without the second patch, 0 drives the cursor with the old ECPGdo() code. All other values drive the cursor via the new ECPGopen() et.al. Should we keep this behaviour? In this case the 0 vs 1 distinction is not needed in add_cursor() as the 0 value doesn't even reach ECPGopen(). But if we consider including the second patch as well, should we keep the NO READAHEAD option in the grammar and in cursor.c? After solving the problem with WHERE CURRENT OF, this and the 0-vs-1 distinction starts to feel pointless. And what about the override via the environment variable? Should it work only upwards? Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote: Sorry for the delay, I had been busy with other tasks and I rewrote this code to better cope with unknown result size, scrollable cursors and negative cursor positions. I think all points raised by Noah is addressed: per-cursor readahead window size, extensive comments, documentation and not enabling result set size discovery. The new comments are a nice improvement; thanks. The problem with WHERE CURRENT OF is solved by a little more grammar and ecpglib code, which effectively does a MOVE ABSOLUTE N before executing the DML with WHERE CURRENT OF clause. No patching of the backend. This way, the new ECPG caching code is compatible with older servers but obviously reduces the efficiency of caching. Good plan. diff -dcrpN postgresql.orig/doc/src/sgml/ecpg.sgml postgresql/doc/src/sgml/ecpg.sgml *** postgresql.orig/doc/src/sgml/ecpg.sgml2012-03-12 09:24:31.699560098 +0100 --- postgresql/doc/src/sgml/ecpg.sgml 2012-03-24 10:15:00.538924601 +0100 *** EXEC SQL COMMIT; *** 454,459 --- 454,479 details. /para + para +ECPG may use cursor readahead to improve performance of programs +that use single-row FETCH statements. Option literal-R number/literal +option for ECPG modifies the default for all cursors from literalNO READAHEAD/literal This sentence duplicates the word option. +to literalREADAHEAD number/literal. Explicit literalREADAHEAD number/literal or +literalNO READAHEAD/literal turns cursor readahead on (with literalnumber/literal +as the size of the readahead window) or off for the specified cursor, +respectively. For cursors using a non-0 readahead window size is 256 rows, The number 256 is no longer present in your implementation. +the window size may be modified by setting the literalECPGFETCHSZ/literal +environment variable to a different value. I had in mind that DECLARE statements adorned with READAHEAD syntax would always behave precisely as written, independent of ECPGFETCHSZ or ecpg -R. Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value passed to ecpg -R if provided, and finally a value of 1 (no readahead) in the absence of both ECPGFETCHSZ and ecpg -R. Did you do it differently for a particular reason? I don't particularly object to what you've implemented, but I'd be curious to hear your thinking. + /para + + para +commandUPDATE/command or commandDELETE/command with the +literalWHERE CURRENT OF/literal clause, cursor readahead may or may not +actually improve performance, as commandMOVE/command statement has to be +sent to the backend before the DML statement to ensure correct cursor position +in the backend. + /para This sentence seems to be missing a word near its beginning. *** DECLARE replaceable class=PARAMETERc *** 6639,6649 /listitem /varlistentry /variablelist para ! For the meaning of the cursor options, ! see xref linkend=sql-declare. /para /refsect1 refsect1 --- 6669,6728 /listitem /varlistentry /variablelist +/refsect1 + +refsect1 + titleCursor options/title para ! For the meaning of other cursor options, see xref linkend=sql-declare. /para + + variablelist + varlistentry + termliteralREADAHEAD number/literal/term + termliteralNO READAHEAD/literal/term +listitem + para + literalREADAHEAD number/literal makes the ECPG preprocessor and + runtime library use a client-side cursor accounting and data readahead + during commandFETCH/command. This improves performance for programs + that use single-row commandFETCH/command statements. + /para + + para + literalNO READAHEAD/literal disables data readahead in case + parameter-R number/parameter is used for compiling the file. + /para +/listitem + /varlistentry One of the new sections about readahead should somehow reference the hazard around volatile functions. + + varlistentry + termliteralOPEN RETURNS LAST ROW POSITION/literal/term + termliteralOPEN RETURNS 0 FOR LAST ROW POSITION/literal/term +listitem + para + When the cursor is opened, it's possible to discover the size of the result set + using commandMOVE ALL/command which traverses the result set and move + the cursor back to the beginning using commandMOVE ABSOLUTE 0/command. + The size of the result set is returned in sqlca.sqlerrd[2]. + /para + + para + This slows down opening the cursor from the application point of view + but may also have side effects. If the cursor query
Re: [HACKERS] ECPG FETCH readahead
On Tue, Mar 6, 2012 at 6:06 AM, Noah Misch n...@leadboat.com wrote: On Tue, Mar 06, 2012 at 07:07:41AM +0100, Boszormenyi Zoltan wrote: 2012-03-05 19:56 keltez?ssel, Noah Misch ?rta: Or how about a new feature in the backend, so ECPG can do UPDATE/DELETE ... WHERE OFFSET N OF cursor and the offset of computed from the actual cursor position and the position known by the application? This way an app can do readahead and do work on rows collected by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF behind the scenes. That's a neat idea, but I would expect obstacles threatening our ability to use it automatically for readahead. You would have to make the cursor a SCROLL cursor. We'll often pass a negative offset, making the operation fail if the cursor query used FOR UPDATE. Volatile functions in the query will get more calls. That's assuming the operation will map internally to something like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with innovations to mitigate those obstacles, but those innovations would probably also apply to MOVE/FETCH. In any event, this would constitute a substantive patch in its own right. I was thinking along the lines of a Portal keeping the ItemPointerData for each tuple in the last FETCH statement. The WHERE OFFSET N OF cursor would treat the offset value relative to the tuple order returned by FETCH. So, OFFSET 0 OF == CURRENT OF and other values of N are negative. This way, it doesn't matter if the cursor is SCROLL, NO SCROLL or have the default behaviour with SCROLL in some cases. Then ECPGopen() doesn't have to play games with the DECLARE statement. Only ECPGfetch() needs to play with MOVE statements, passing different offsets to the backend, not what the application passed. That broad approach sounds promising. The main other consideration that comes to mind is a plan to limit resource usage for a cursor that reads, say, 1B rows. However, I think attempting to implement this now will significantly decrease the chance of getting the core patch features committed now. One way out of trouble here is to make WHERE CURRENT OF imply READHEAD 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the affected cursor. If the cursor has some other readahead quantity declared explicitly, throw an error during preprocessing. I played with this idea a while ago, from a different point of view. If the ECPG code had the DECLARE mycur, DML ... WHERE CURRENT OF mycur and OPEN mycur in exactly this order, i.e. WHERE CURRENT OF appears in a standalone function between DECLARE and the first OPEN for the cursor, then ECPG disabled readahead automatically for that cursor and for that cursor only. But this requires effort on the user of ECPG and can be very fragile. Code cleanup with reordering functions can break previously working code. Don't the same challenges apply to accurately reporting an error when the user specifies WHERE CURRENT OF for a readahead cursor? I think we need either an updated version of this patch that's ready for commit real soon now, or we need to postpone it to 9.3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Tue, Mar 06, 2012 at 07:07:41AM +0100, Boszormenyi Zoltan wrote: 2012-03-05 19:56 keltez?ssel, Noah Misch ?rta: Or how about a new feature in the backend, so ECPG can do UPDATE/DELETE ... WHERE OFFSET N OF cursor and the offset of computed from the actual cursor position and the position known by the application? This way an app can do readahead and do work on rows collected by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF behind the scenes. That's a neat idea, but I would expect obstacles threatening our ability to use it automatically for readahead. You would have to make the cursor a SCROLL cursor. We'll often pass a negative offset, making the operation fail if the cursor query used FOR UPDATE. Volatile functions in the query will get more calls. That's assuming the operation will map internally to something like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with innovations to mitigate those obstacles, but those innovations would probably also apply to MOVE/FETCH. In any event, this would constitute a substantive patch in its own right. I was thinking along the lines of a Portal keeping the ItemPointerData for each tuple in the last FETCH statement. The WHERE OFFSET N OF cursor would treat the offset value relative to the tuple order returned by FETCH. So, OFFSET 0 OF == CURRENT OF and other values of N are negative. This way, it doesn't matter if the cursor is SCROLL, NO SCROLL or have the default behaviour with SCROLL in some cases. Then ECPGopen() doesn't have to play games with the DECLARE statement. Only ECPGfetch() needs to play with MOVE statements, passing different offsets to the backend, not what the application passed. That broad approach sounds promising. The main other consideration that comes to mind is a plan to limit resource usage for a cursor that reads, say, 1B rows. However, I think attempting to implement this now will significantly decrease the chance of getting the core patch features committed now. One way out of trouble here is to make WHERE CURRENT OF imply READHEAD 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the affected cursor. If the cursor has some other readahead quantity declared explicitly, throw an error during preprocessing. I played with this idea a while ago, from a different point of view. If the ECPG code had the DECLARE mycur, DML ... WHERE CURRENT OF mycur and OPEN mycur in exactly this order, i.e. WHERE CURRENT OF appears in a standalone function between DECLARE and the first OPEN for the cursor, then ECPG disabled readahead automatically for that cursor and for that cursor only. But this requires effort on the user of ECPG and can be very fragile. Code cleanup with reordering functions can break previously working code. Don't the same challenges apply to accurately reporting an error when the user specifies WHERE CURRENT OF for a readahead cursor? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sun, Mar 04, 2012 at 05:34:50PM +0100, Boszormenyi Zoltan wrote: The program logic shouldn't change at all. He meant that extra coding effort is needed if you want manual caching. It requires 2 loops instead of 1 if you use FETCH N (N1). Ah, thanks for the explanation. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sun, Mar 04, 2012 at 05:16:06PM +0100, Michael Meskes wrote: On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote: I suggest enabling the feature by default but drastically reducing the default readahead chunk size from 256 to, say, 5. That still reduces the FETCH round trip overhead by 80%, but it's small enough not to attract pathological behavior on a workload where each row is a 10 MiB document. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. Using 1 to effectively disable the feature is fine with me, but I strongly object any default enabling this feature. It's farily easy to create cases with pathological behaviour and this features is not standard by any means. I figure a normal programmer would expect only one row being transfered when fetching one. On further reflection, I agree with you here. The prospect for queries that call volatile functions changed my mind; they would exhibit different functional behavior under readahead. We mustn't silently give affected programs different semantics. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Sun, Mar 04, 2012 at 04:33:32PM +0100, Boszormenyi Zoltan wrote: 2012-03-02 17:41 keltez?ssel, Noah Misch ?rta: On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote: I suggest enabling the feature by default but drastically reducing the default readahead chunk size from 256 to, say, 5. That still reduces the FETCH round trip overhead by 80%, but it's small enough not to attract pathological behavior on a workload where each row is a 10 MiB document. I see. How about 8? Nice round power of 2 value, still small and avoids 87.5% of overhead. Having pondered the matter further, I now agree with Michael that the feature should stay disabled by default. See my response to him for rationale. Assuming that conclusion holds, we can recommended a higher value to users who enable the feature at all. Your former proposal of 256 seems fine. BTW, the default disabled behaviour was to avoid make check breakage, see below. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. This means all code previously going through ECPGdo() would go through ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive and all regression tests that were only testing certain features would also test the readahead feature, too. It's a good sort of intrusiveness, reducing the likelihood of introducing bugs basically unrelated to readahead that happen to afflict only ECPGdo() or only the cursor.c interfaces. Let's indeed not have any preexisting test cases use readahead per se, but having them use the cursor.c interfaces anyway will build confidence in the new code. The churn in expected debug output isn't ideal, but I don't prefer the alternative of segmenting the implementation for the sake of the test cases. Also, the test for WHERE CURRENT OF at ecpg time would have to be done at runtime, possibly making previously working code fail if ECPGFETCHSZ is enabled. Good point. How about still allowing NO READAHEAD cursors that compile into plain ECPGdo()? This way, ECPGFETCHSZ don't interfere with WHERE CURRENT OF. But this would mean code changes everywhere where WHERE CURRENT OF is used. ECPGFETCHSZ should only affect cursors that make no explicit mention of READAHEAD. I'm not sure whether that should mean actually routing READHEAD 1 cursors through ECPGdo() or simply making sure that cursor.c achieves the same outcome; see later for a possible reason to still do the latter. Or how about a new feature in the backend, so ECPG can do UPDATE/DELETE ... WHERE OFFSET N OF cursor and the offset of computed from the actual cursor position and the position known by the application? This way an app can do readahead and do work on rows collected by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF behind the scenes. That's a neat idea, but I would expect obstacles threatening our ability to use it automatically for readahead. You would have to make the cursor a SCROLL cursor. We'll often pass a negative offset, making the operation fail if the cursor query used FOR UPDATE. Volatile functions in the query will get more calls. That's assuming the operation will map internally to something like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with innovations to mitigate those obstacles, but those innovations would probably also apply to MOVE/FETCH. In any event, this would constitute a substantive patch in its own right. One way out of trouble here is to make WHERE CURRENT OF imply READHEAD 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the affected cursor. If the cursor has some other readahead quantity declared explicitly, throw an error during preprocessing. Failing a reasonable resolution, I'm prepared to withdraw my suggestion of making ECPGFETCHSZ always-usable. It's nice to have, not critical. +bool +ECPGopen(const int lineno, const int compat, const int force_indicator, + const char *connection_name, const bool questionmarks, + const char *curname, const int st, const char *query, ...) +{ + va_list args; + boolret, scrollable; + char *new_query, *ptr, *whold, *noscroll, *scroll, *dollar0; + struct sqlca_t *sqlca = ECPGget_sqlca(); + + if (!query) + { + ecpg_raise(lineno, ECPG_EMPTY, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); + return false; + } + ptr = strstr(query, for ); + if (!ptr) + { + ecpg_raise(lineno, ECPG_INVALID_STMT, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); + return false; + } + whold = strstr(query, with hold ); + dollar0 = strstr(query, $0); + + noscroll = strstr(query, no scroll ); + scroll = strstr(query, scroll ); A
Re: [HACKERS] ECPG FETCH readahead
2012-03-05 19:56 keltezéssel, Noah Misch írta: Having pondered the matter further, I now agree with Michael that the feature should stay disabled by default. See my response to him for rationale. Assuming that conclusion holds, we can recommended a higher value to users who enable the feature at all. Your former proposal of 256 seems fine. OK. BTW, the default disabled behaviour was to avoid make check breakage, see below. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. This means all code previously going through ECPGdo() would go through ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive and all regression tests that were only testing certain features would also test the readahead feature, too. It's a good sort of intrusiveness, reducing the likelihood of introducing bugs basically unrelated to readahead that happen to afflict only ECPGdo() or only the cursor.c interfaces. Let's indeed not have any preexisting test cases use readahead per se, but having them use the cursor.c interfaces anyway will build confidence in the new code. The churn in expected debug output isn't ideal, but I don't prefer the alternative of segmenting the implementation for the sake of the test cases. I see. Also, the test for WHERE CURRENT OF at ecpg time would have to be done at runtime, possibly making previously working code fail if ECPGFETCHSZ is enabled. Good point. How about still allowing NO READAHEAD cursors that compile into plain ECPGdo()? This way, ECPGFETCHSZ don't interfere with WHERE CURRENT OF. But this would mean code changes everywhere where WHERE CURRENT OF is used. ECPGFETCHSZ should only affect cursors that make no explicit mention of READAHEAD. I'm not sure whether that should mean actually routing READHEAD 1 cursors through ECPGdo() or simply making sure that cursor.c achieves the same outcome; see later for a possible reason to still do the latter. Or how about a new feature in the backend, so ECPG can do UPDATE/DELETE ... WHERE OFFSET N OF cursor and the offset of computed from the actual cursor position and the position known by the application? This way an app can do readahead and do work on rows collected by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF behind the scenes. That's a neat idea, but I would expect obstacles threatening our ability to use it automatically for readahead. You would have to make the cursor a SCROLL cursor. We'll often pass a negative offset, making the operation fail if the cursor query used FOR UPDATE. Volatile functions in the query will get more calls. That's assuming the operation will map internally to something like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with innovations to mitigate those obstacles, but those innovations would probably also apply to MOVE/FETCH. In any event, this would constitute a substantive patch in its own right. I was thinking along the lines of a Portal keeping the ItemPointerData for each tuple in the last FETCH statement. The WHERE OFFSET N OF cursor would treat the offset value relative to the tuple order returned by FETCH. So, OFFSET 0 OF == CURRENT OF and other values of N are negative. This way, it doesn't matter if the cursor is SCROLL, NO SCROLL or have the default behaviour with SCROLL in some cases. Then ECPGopen() doesn't have to play games with the DECLARE statement. Only ECPGfetch() needs to play with MOVE statements, passing different offsets to the backend, not what the application passed. One way out of trouble here is to make WHERE CURRENT OF imply READHEAD 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the affected cursor. If the cursor has some other readahead quantity declared explicitly, throw an error during preprocessing. I played with this idea a while ago, from a different point of view. If the ECPG code had the DECLARE mycur, DML ... WHERE CURRENT OF mycur and OPEN mycur in exactly this order, i.e. WHERE CURRENT OF appears in a standalone function between DECLARE and the first OPEN for the cursor, then ECPG disabled readahead automatically for that cursor and for that cursor only. But this requires effort on the user of ECPG and can be very fragile. Code cleanup with reordering functions can break previously working code. Failing a reasonable resolution, I'm prepared to withdraw my suggestion of making ECPGFETCHSZ always-usable. It's nice to have, not critical. +bool +ECPGopen(const int lineno, const int compat, const int force_indicator, + const char *connection_name, const bool questionmarks, + const char *curname, const int st, const char *query, ...) +{ + va_list args; + boolret, scrollable; + char
Re: [HACKERS] ECPG FETCH readahead
Hi, first, thank you for answering and for the review. 2012-03-02 17:41 keltezéssel, Noah Misch írta: On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote: 2011-11-16 20:51 keltez?ssel, Boszormenyi Zoltan ?rta: 2010-10-14 11:56 keltez?ssel, Boszormenyi Zoltan ?rta: On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes mes...@postgresql.org wrote: On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. Yes, there is a reason, namely that you don't need it in native mode at all. ECPG can read as many records as you want in one go, something ESQL/C apparently cannot do. This patch is a workaround for that restriction. I still do not really see that this feature gives us an advantage in native mode though. We yet lack a consensus on whether native ECPG apps should have access to the feature. I don't even remember about any opinion on this matter. So, at this point don't know whether it's lack of interest. We also have a saying silence means agreement. :-) My 2c is to make it available, because it's useful syntactic sugar. Thanks, we thought the same. If your program independently processes each row of an arbitrary-length result set, current facilities force you to add an extra outer loop to batch the FETCHes for every such code site. Applications could define macros to abstract that pattern, but this seems common-enough to justify bespoke handling. We have similar opinions. Besides, minimalists already use libpq directly. Indeed. On the other hand, ECPG provides a safety net with syntax checking so it's useful for not minimalist types. :-) I suggest enabling the feature by default but drastically reducing the default readahead chunk size from 256 to, say, 5. That still reduces the FETCH round trip overhead by 80%, but it's small enough not to attract pathological behavior on a workload where each row is a 10 MiB document. I see. How about 8? Nice round power of 2 value, still small and avoids 87.5% of overhead. BTW, the default disabled behaviour was to avoid make check breakage, see below. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. This means all code previously going through ECPGdo() would go through ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive and all regression tests that were only testing certain features would also test the readahead feature, too. Also, the test for WHERE CURRENT OF at ecpg time would have to be done at runtime, possibly making previously working code fail if ECPGFETCHSZ is enabled. How about still allowing NO READAHEAD cursors that compile into plain ECPGdo()? This way, ECPGFETCHSZ don't interfere with WHERE CURRENT OF. But this would mean code changes everywhere where WHERE CURRENT OF is used. Or how about a new feature in the backend, so ECPG can do UPDATE/DELETE ... WHERE OFFSET N OF cursor and the offset of computed from the actual cursor position and the position known by the application? This way an app can do readahead and do work on rows collected by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF behind the scenes. The ASAP took a little long. The attached patch is in git diff format, because (1) the regression test intentionally doesn't do ECPGdebug() so the patch isn't dominated by a 2MB stderr file, so this file is empty and (2) regular diff cannot cope with empty new files. Avoid the empty file with fprintf(STDERR, Dummy non-empty error output\n); Fixed. - *NEW FEATURE* Readahead can be individually enabled or disabled by ECPG-side grammar: DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ... Without [ NO ] READAHEAD, the default behaviour is used for cursors. Likewise, this may as well take a chunk size rather than a yes/no. Done. The patch adds warnings: error.c: In function `ecpg_raise': error.c:281: warning: format not a string literal and no format arguments error.c:281: warning: format not a string literal and no format arguments Fixed. The patch adds few comments and no larger comments explaining its higher-level ideas. That makes it much harder to review. In this regard it follows the tradition of the ECPG code, but let's depart from that tradition for new work. I mention a few cases below where the need for commentary is acute. Understood. Adding comments as I go over that code again. I tested full reads of various SELECT * FROM generate_series(1, $1) commands over a 50ms link, and the patch gives a sound performance improvement. With no readahead, a mere N=100 takes 5s to drain. With readahead enabled, N=1 takes only 3s. Performance was quite similar to that of
Re: [HACKERS] ECPG FETCH readahead
On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote: We yet lack a consensus on whether native ECPG apps should have access to the feature. My 2c is to make it available, because it's useful syntactic sugar. If your program independently processes each row of an arbitrary-length result set, current facilities force you to add an extra outer loop to batch the FETCHes for every such code site. Applications could define macros to abstract that pattern, but this seems common-enough to justify bespoke handling. Besides, minimalists already use libpq directly. Sorry, I don't really understand what you're saying here. The program logic won't change at all when using this feature or what do I misunderstand? I suggest enabling the feature by default but drastically reducing the default readahead chunk size from 256 to, say, 5. That still reduces the FETCH round trip overhead by 80%, but it's small enough not to attract pathological behavior on a workload where each row is a 10 MiB document. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. Using 1 to effectively disable the feature is fine with me, but I strongly object any default enabling this feature. It's farily easy to create cases with pathological behaviour and this features is not standard by any means. I figure a normal programmer would expect only one row being transfered when fetching one. Other than that, thanks for the great review. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2012-03-04 17:16 keltezéssel, Michael Meskes írta: On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote: We yet lack a consensus on whether native ECPG apps should have access to the feature. My 2c is to make it available, because it's useful syntactic sugar. If your program independently processes each row of an arbitrary-length result set, current facilities force you to add an extra outer loop to batch the FETCHes for every such code site. Applications could define macros to abstract that pattern, but this seems common-enough to justify bespoke handling. Besides, minimalists already use libpq directly. Sorry, I don't really understand what you're saying here. The program logic won't change at all when using this feature or what do I misunderstand? The program logic shouldn't change at all. He meant that extra coding effort is needed if you want manual caching. It requires 2 loops instead of 1 if you use FETCH N (N1). I suggest enabling the feature by default but drastically reducing the default readahead chunk size from 256 to, say, 5. That still reduces the FETCH round trip overhead by 80%, but it's small enough not to attract pathological behavior on a workload where each row is a 10 MiB document. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. Using 1 to effectively disable the feature is fine with me, Something else would be needed. For DML with WHERE CURRENT OF cursor, the feature should stay disabled even with the environment variable is set without adding any decoration to the DECLARE statement. The safe thing would be to distinguish between uncached (but cachable) and uncachable cases. A single value cannot work. but I strongly object any default enabling this feature. It's farily easy to create cases with pathological behaviour and this features is not standard by any means. I figure a normal programmer would expect only one row being transfered when fetching one. Other than that, thanks for the great review. Michael Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote: 2011-11-16 20:51 keltez?ssel, Boszormenyi Zoltan ?rta: 2010-10-14 11:56 keltez?ssel, Boszormenyi Zoltan ?rta: On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes mes...@postgresql.org wrote: On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. Yes, there is a reason, namely that you don't need it in native mode at all. ECPG can read as many records as you want in one go, something ESQL/C apparently cannot do. This patch is a workaround for that restriction. I still do not really see that this feature gives us an advantage in native mode though. We yet lack a consensus on whether native ECPG apps should have access to the feature. My 2c is to make it available, because it's useful syntactic sugar. If your program independently processes each row of an arbitrary-length result set, current facilities force you to add an extra outer loop to batch the FETCHes for every such code site. Applications could define macros to abstract that pattern, but this seems common-enough to justify bespoke handling. Besides, minimalists already use libpq directly. I suggest enabling the feature by default but drastically reducing the default readahead chunk size from 256 to, say, 5. That still reduces the FETCH round trip overhead by 80%, but it's small enough not to attract pathological behavior on a workload where each row is a 10 MiB document. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. The ASAP took a little long. The attached patch is in git diff format, because (1) the regression test intentionally doesn't do ECPGdebug() so the patch isn't dominated by a 2MB stderr file, so this file is empty and (2) regular diff cannot cope with empty new files. Avoid the empty file with fprintf(STDERR, Dummy non-empty error output\n); - *NEW FEATURE* Readahead can be individually enabled or disabled by ECPG-side grammar: DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ... Without [ NO ] READAHEAD, the default behaviour is used for cursors. Likewise, this may as well take a chunk size rather than a yes/no. The patch adds warnings: error.c: In function `ecpg_raise': error.c:281: warning: format not a string literal and no format arguments error.c:281: warning: format not a string literal and no format arguments The patch adds few comments and no larger comments explaining its higher-level ideas. That makes it much harder to review. In this regard it follows the tradition of the ECPG code, but let's depart from that tradition for new work. I mention a few cases below where the need for commentary is acute. I tested full reads of various SELECT * FROM generate_series(1, $1) commands over a 50ms link, and the patch gives a sound performance improvement. With no readahead, a mere N=100 takes 5s to drain. With readahead enabled, N=1 takes only 3s. Performance was quite similar to that of implementing my own batching with FETCH 256 FROM cur. When I kicked up ECPGFETCHSZ (after fixing its implementation -- see below) past the result set size, performance was comparable to that of simply passing the query through psql. --- a/doc/src/sgml/ecpg.sgml +++ b/doc/src/sgml/ecpg.sgml @@ -5289,6 +5315,17 @@ while (1) /varlistentry varlistentry + term-231 (symbolECPG_INVALID_CURSOR/symbol)/term + listitem + para + The cursor you are trying to use with readahead has not been opened yet (SQLSTATE 34000), + invalid values were passed to libecpg (SQLSTATE 42P11) hor not in prerequisite state, i.e. Typo. --- /dev/null +++ b/src/interfaces/ecpg/ecpglib/cursor.c @@ -0,0 +1,730 @@ cursor.c contains various 78-col lines. pgindent has limited ability to improve those, so please split them. +static struct cursor_descriptor * +add_cursor(int lineno, struct connection *con, const char *name, bool scrollable, int64 n_tuples, bool *existing) +{ + struct cursor_descriptor *desc, + *ptr, *prev = NULL; + boolfound = false; + + if (!name || name[0] == '\0') + { + if (existing) + *existing = false; + return NULL; + } + + ptr = con-cursor_desc; + while (ptr) + { + int ret = strcasecmp(ptr-name, name); + + if (ret == 0) + { + found = true; + break; + } + if (ret 0) + break; + + prev = ptr; + ptr = ptr-next; + } Any reason not to use find_cursor() here?
Re: [HACKERS] ECPG FETCH readahead
On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes mes...@postgresql.org wrote: On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. Yes, there is a reason, namely that you don't need it in native mode at all. ECPG can read as many records as you want in one go, something ESQL/C apparently cannot do. This patch is a workaround for that restriction. I still do not really see that this feature gives us an advantage in native mode though. So, who has the next action on this patch? Does Zoltan need to revise it, or does Michael need to review it, or where are we? Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
Hi, Robert Haas írta: On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes mes...@postgresql.org wrote: On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. Yes, there is a reason, namely that you don't need it in native mode at all. ECPG can read as many records as you want in one go, something ESQL/C apparently cannot do. This patch is a workaround for that restriction. I still do not really see that this feature gives us an advantage in native mode though. So, who has the next action on this patch? Does Zoltan need to revise it, or does Michael need to review it, or where are we? Michael reviewed it shortly in private and I need to send a new revision anyway, regardless of his comments. I will refresh it ASAP. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
Hi, 2010-06-23 22:42 keltezéssel, Bruce Momjian írta: Boszormenyi Zoltan wrote: Hi, we improved ECPG quite a lot in 9.0 because we worked and still working with an Informix to PostgreSQL migration project. We came across a pretty big performance problem that can be seen in every naive application that uses only FETCH 1, FETCH RELATIVE or FETCH ABSOLUTE. These are almost the only FETCH variations usable in Informix, i.e. it doesn't have the grammar for fetching N rows at once. Instead, the Client SDK libraries do caching themselves behind the scenes to reduce network turnaround time. I assume our ecpg version supports1 fetch values, even in Informix mode. Does it make sense to add lots of code to our ecpg then? I think, yes, it does make sense. Because we are talking about porting a whole lot of COBOL applications. The ESQL/C or ECPG connector was already written the Informix quirks in mind, so it fetches only one record at a time passing it to the application. And similar performance is expected from ECPG - which excpectation is not fulfilled currently because libecpg doesn't do the same caching as ESQL/C does. And FYI, I haven't added a whole lot of code, most of the code changes in the patch is execute.c refactoring. ECPGdo() was split into several functions, the new parts are still doing the same things. I can make the test case much smaller, I only needed to test crossing the readahead window boundary. This would also make the patch much smaller. And this readahead is not on by default, it's only activated by ecpg -r fetch_readahead. Best regards, Zoltán Böszörményi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On 24/06/10 10:27, Böszörményi Zoltán wrote: And this readahead is not on by default, it's only activated by ecpg -r fetch_readahead. Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2010-06-24 11:04 keltezéssel, Heikki Linnakangas írta: On 24/06/10 10:27, Böszörményi Zoltán wrote: And this readahead is not on by default, it's only activated by ecpg -r fetch_readahead. Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. Because in the first step I wanted to minimize the impact on regression test stderr results. This is what I mentioned in the initial mail, I stuck to the original wording of ecpg_log() messages in the split-up parts of the original ECPGdo() and ecpg_execute() exactly for this reason. The usual policy for ecpg_log() is to report the function name where it was issued. I was also thinking about a new feature for pg_regress, to compare stdout results of two regression tests automatically so a difference can be reported as an error. It would be good for automated testing of features in ECPG that can be toggled, like auto-prepare and fetch readahead. It might come in handy in other subsystems, too. Best regards, Zoltán Böszörményi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Wed, Jun 23, 2010 at 04:42:37PM -0400, Bruce Momjian wrote: I assume our ecpg version supports 1 fetch values, even in Informix mode. Does it make sense to add lots of code to our ecpg then? Yes, it does. The big question that zoltan and I haven't figured out yet, is whether it makes sense to add all this even for native ecpg mode. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
I think, yes, it does make sense. Because we are talking about porting a whole lot of COBOL applications. COBOL??? The ESQL/C or ECPG connector was already written the Informix quirks in mind, so it fetches only one record at a time passing it to the application. And similar performance is expected from ECPG - which excpectation is not fulfilled currently because libecpg doesn't do the same caching as ESQL/C does. Eh, you are talking about a program you wrote for your customer or they wrote themselves, right? I simply refuse to add this stuff only to fix this situation for that one customer of yours if it only hits them. Now the thing to discuss is how common is this situation. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. Yes, there is a reason, namely that you don't need it in native mode at all. ECPG can read as many records as you want in one go, something ESQL/C apparently cannot do. This patch is a workaround for that restriction. I still do not really see that this feature gives us an advantage in native mode though. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
On Jun 24, 2010, at 2:13 PM, Michael Meskes wrote: I think, yes, it does make sense. Because we are talking about porting a whole lot of COBOL applications. COBOL??? yes, COBOL :). it is much more common than people think. it is not the first COBOL request for PostgreSQL hitting my desk. in our concrete example we are using a C module written with ECPG which is magically attached to tons of COBOL code ... The ESQL/C or ECPG connector was already written the Informix quirks in mind, so it fetches only one record at a time passing it to the application. And similar performance is expected from ECPG - which excpectation is not fulfilled currently because libecpg doesn't do the same caching as ESQL/C does. Eh, you are talking about a program you wrote for your customer or they wrote themselves, right? I simply refuse to add this stuff only to fix this situation for that one customer of yours if it only hits them. Now the thing to discuss is how common is this situation. Michael i think that this cursor issue is a pretty common thing for many codes. people are usually not aware of the fact that network round trips and parsing which are naturally related to FETCH 1 are a lot more expensive than fetching one row somewhere deep inside the DB engine. out there there are many applications which fetch data row by row. if an app fetches data row by row in PostgreSQL it will be A LOT slower than in, say, Informix because most commercial database clients will cache data inside a cursor behind the scenes to avoid the problem we try to solve. currently we are talking about a performance penalty of factor 5 or so. so - it is not a small thing; it is a big difference. regards, hans -- Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
2010-06-24 14:13 keltezéssel, Michael Meskes írta: I think, yes, it does make sense. Because we are talking about porting a whole lot of COBOL applications. COBOL??? Yes, OpenCOBOL... The ESQL/C or ECPG connector was already written the Informix quirks in mind, so it fetches only one record at a time passing it to the application. And similar performance is expected from ECPG - which excpectation is not fulfilled currently because libecpg doesn't do the same caching as ESQL/C does. Eh, you are talking about a program you wrote for your customer or they wrote themselves, right? I simply refuse to add this stuff only to fix this situation for that one customer of yours if it only hits them. Now the thing to discuss is how common is this situation. The OpenCOBOL database connector was written by them but the problem is more generic. There are many naive applications (elsewhere, too) using cursors but fetching one record at a time perhaps for portability reasons. This patch provides a big performance boost for those. Best regards, Zoltán Böszörményi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ECPG FETCH readahead
Boszormenyi Zoltan wrote: Hi, we improved ECPG quite a lot in 9.0 because we worked and still working with an Informix to PostgreSQL migration project. We came across a pretty big performance problem that can be seen in every naive application that uses only FETCH 1, FETCH RELATIVE or FETCH ABSOLUTE. These are almost the only FETCH variations usable in Informix, i.e. it doesn't have the grammar for fetching N rows at once. Instead, the Client SDK libraries do caching themselves behind the scenes to reduce network turnaround time. I assume our ecpg version supports 1 fetch values, even in Informix mode. Does it make sense to add lots of code to our ecpg then? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers