Re: [HACKERS] PoC: Partial sort
On Sat, Dec 28, 2013 at 9:28 PM, Alexander Korotkov aekorot...@gmail.com wrote:
> On Tue, Dec 24, 2013 at 6:02 AM, Andreas Karlsson andr...@proxel.se wrote:
> Attached revision of patch implements partial sort usage in merge joins.

I'm looking forward to doing a bit of testing on this patch. I think it is a really useful feature to get a bit more out of existing indexes. I was about to test it tonight, but I'm having trouble getting the patch to compile. I'm really wondering which compiler you are using, as it seems you're declaring your variables in some strange places. See nodeSort.c line 101: these variables are declared after an if statement in the same scope, which is not valid C89. (The patch did, however, apply without any complaints.)

Here's a list of the errors I get when compiling with Visual Studio on Windows:

D:\Postgres\c\pgsql.sln (default target) (1) ->
D:\Postgres\c\postgres.vcxproj (default target) (2) ->
(ClCompile target) ->
src\backend\executor\nodeSort.c(101): error C2275: 'Sort' : illegal use of this type as an expression [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(101): error C2065: 'plannode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(102): error C2275: 'PlanState' : illegal use of this type as an expression [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(102): error C2065: 'outerNode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(103): error C2275: 'TupleDesc' : illegal use of this type as an expression [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(103): error C2146: syntax error : missing ';' before identifier 'tupDesc' [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(103): error C2065: 'tupDesc' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(120): error C2065: 'outerNode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(121): error C2065: 'tupDesc' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(121): error C2065: 'outerNode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(125): error C2065: 'tupDesc' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(126): error C2065: 'plannode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(126): error C2223: left of '->numCols' must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(127): error C2065: 'plannode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(127): error C2223: left of '->sortColIdx' must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(128): error C2065: 'plannode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(128): error C2223: left of '->sortOperators' must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(129): error C2065: 'plannode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(129): error C2223: left of '->collations' must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(130): error C2065: 'plannode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(130): error C2223: left of '->nullsFirst' must point to struct/union [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(132): error C2198: 'tuplesort_begin_heap' : too few arguments for call [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(143): error C2065: 'outerNode' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]
src\backend\executor\nodeSort.c(167): error C2065: 'tupDesc' : undeclared identifier [D:\Postgres\c\postgres.vcxproj]

13 Warning(s)
24 Error(s)

Regards

David Rowley
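For anyone puzzled by the error flood: MSVC compiles .c files as C89, where every declaration in a block must precede the first statement. A minimal sketch of the failing pattern and its fix (the names are illustrative, not the actual nodeSort.c code):

```c
#include <assert.h>

/* Broken (C99-style mixing, rejected by MSVC as C89):
 *
 *     if (vals == NULL)
 *         return 0;
 *     int total = 0;      <- C2275/C2065 here: declaration after statement
 *
 * Fixed: hoist all declarations above the first statement of the block. */
static int
sum_first_even(const int *vals, int n)
{
    int total = 0;          /* declarations first ... */
    int i;

    for (i = 0; i < n; i++) /* ... statements after */
    {
        if (vals[i] % 2 == 0)
            total += vals[i];
    }
    return total;
}
```

Once a declaration follows a statement, MSVC loses track of the identifier, which is why every later use of plannode, outerNode, and tupDesc cascades into a C2065.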
[HACKERS] preventing TOAST for a new type
I have created a type based on varlena, and I want to keep it from being toasted. How do I control the length? Must I account for the total varlena size or only the data size? Thanks.
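As a rough answer (the struct and macro below are a toy model, not the real ones from postgres.h): the varlena length word counts the 4-byte header plus the payload, so the size you set must be the total size, and a type can be kept out of TOAST by declaring it with STORAGE = plain in CREATE TYPE. A self-contained sketch of the size accounting:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of a varlena datum: the length word counts the 4-byte
 * header PLUS the payload, i.e. the equivalent of
 * SET_VARSIZE(v, VARHDRSZ + datalen).  Sketch only. */
#define MY_VARHDRSZ 4

typedef struct
{
    int  vl_len;        /* total size, header included */
    char vl_dat[];      /* payload follows */
} my_varlena;

static my_varlena *
make_varlena(const char *data, int datalen)
{
    my_varlena *v = malloc(MY_VARHDRSZ + datalen);

    v->vl_len = MY_VARHDRSZ + datalen;  /* total size, not payload size */
    memcpy(v->vl_dat, data, datalen);
    return v;
}

static int
varlena_data_size(const my_varlena *v)
{
    return v->vl_len - MY_VARHDRSZ;     /* payload size back out */
}
```

So: the length you store covers the whole datum, header included; the data size is derived by subtracting the header.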
Re: [HACKERS] [bug fix] connection service file doesn't take effect with ECPG apps
On Sat, Dec 28, 2013 at 04:37:44PM +0900, MauMau wrote:
> You can confirm it by adding the following code fragment to ecpglib/connect.c. The attached file contains this. ...

Sorry for not being precise enough. I did run some tests and it works like a charm for me. Or in other words, I used the connect command you had in your email with a services file pointing to a local database, and it connected to that database. Instead of adding additional output I checked the log output, which suggested that host was NULL.

While I cannot see how your patch could do any harm, I'd still like to figure out why I don't see the same behaviour.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bug fix] ECPG app crashes due to SIGBUS on SPARC Solaris
On Sat, Dec 28, 2013 at 08:04:09AM +0900, MauMau wrote:
> OK, I'll run the ECPG regression test on Solaris without the patch. Please wait until Jan 6 2014 or so, because we've just entered the new year holidays here in Japan.

Sure, we're not in any particular hurry.

Michael
Re: [HACKERS] WITHIN GROUP patch
>>>>> "Tom" == Tom Lane t...@sss.pgh.pa.us writes:

 Tom> I've committed this after significant editorialization --- most
 Tom> notably, I pushed control of the sort step into the aggregate
 Tom> support functions.

Initial tests suggest that your version is ~40% slower than ours on some workloads. On my system, this query takes ~950ms using our dev branch of the code, and ~1050ms on git master (using \timing in psql for timings, and taking the best of many consecutive runs):

select count(*) from
  (select percentile_disc(0.5) within group (order by i)
     from generate_series(1,3) i, generate_series(1,10) j
    group by j) s;

About ~700ms of that is overhead, as tested by running this query with enable_hashagg=false:

select count(*) from
  (select j from generate_series(1,3) i, generate_series(1,10) j group by j) s;

So your version is taking 350ms for the percentile calculations compared to 250ms for ours.

--
Andrew (irc:RhodiumToad)
Re: [HACKERS] [PATCH] Regression tests in windows ignore white space
On Thu, Dec 26, 2013 at 3:02 PM, David Rowley dgrowle...@gmail.com wrote:
> In the following thread I discovered that my new regression tests worked perfectly on Windows, but when they were run on Linux they failed.
>
> http://www.postgresql.org/message-id/CAApHDvo_YCiPYGDz07MpX9o6EGg=3mmyJTb0ysPTwoTg3c=t...@mail.gmail.com
>
> After looking at pg_regress to see which options it passes to diff, I discovered that it passes -w on Windows to ignore ALL white space. The attached simple patch changes this so that it only ignores carriage returns. It does this by passing --strip-trailing-cr to diff instead of -w. This should help us few developers who use Windows to get our white space correct in our expected results so that the tests also pass on non-Windows platforms.

I also faced a similar issue a few days back and tried to search a bit for some simpler option, but couldn't spend too much time on it. I think it would be good if we can make it work without the -w option. When I tried with your patch on Windows, it results in the following:

== running regression test queries ==
test tablespace ... diff: unrecognized option `--strip-trailing-cr '
diff: Try `diff --help' for more information.
diff command failed with status 2: diff --strip-trailing-cr e:/../postgresql/src/test/regress/expected/tablespace.out e:/../postgresql/src/test/regress/results/tablespace.out e:/../postgresql/src/test/regress/results/tablespace.out.diff

Which version of diff are you using? The version of diff on my m/c is:
diff - GNU diffutils version 2.7

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [bug fix] connection service file doesn't take effect with ECPG apps
From: Michael Meskes mes...@postgresql.org
> Or in other words, I used the connect command you had in your email with a services file pointing to a local database and it connected to that database. Instead of adding an additional output I checked the log output which suggested that host was NULL.

Your test case seems different from my original mail. In my test case, I want to connect to a database on another machine, not the local one. For example:

1. The ECPG app runs on a machine called client-host.
2. The database server to connect to runs on a machine called server-host.
3. There's no database server running on client-host.
4. The ECPG app uses the connection service file whose contents are:

[my_service]
host=server-host
... other necessary parameters like port, dbname, etc.

The app mistakenly tries to connect to the database server on the local machine (client-host), instead of the desired server-host, and fails to connect because no database server is running on client-host. This is because ECPGconnect() passes an empty string (""), not NULL, to PQconnectdbParams().

Regards
MauMau
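The distinction MauMau is pointing at can be illustrated with a toy precedence function (the function and its arguments are hypothetical, not libpq internals): a NULL host means "not specified", which lets the service-file value through, while an empty string is an explicit setting that shadows it:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of the precedence rule: a NULL host means "fall back to the
 * service file / defaults", while an empty string counts as an
 * explicit (and here, bogus) setting that wins.  Illustrative only. */
static const char *
resolve_host(const char *explicit_host, const char *service_host,
             const char *default_host)
{
    if (explicit_host != NULL)      /* "" still counts as specified! */
        return explicit_host;
    if (service_host != NULL)
        return service_host;
    return default_host;
}
```

With "" passed down, the service file's host=server-host never gets a chance to apply, which matches the misbehaviour described above.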
Re: [HACKERS] [PATCH] Regression tests in windows ignore white space
On Sun, Dec 29, 2013 at 2:29 AM, Amit Kapila amit.kapil...@gmail.com wrote:
> On Thu, Dec 26, 2013 at 3:02 PM, David Rowley dgrowle...@gmail.com wrote:
>> In the following thread I discovered that my new regression tests worked perfectly on Windows, but when they were run on Linux they failed.
>> http://www.postgresql.org/message-id/CAApHDvo_YCiPYGDz07MpX9o6EGg=3mmyJTb0ysPTwoTg3c=t...@mail.gmail.com
>> After looking at pg_regress to see which options it passes to diff, I discovered that it passes -w on Windows to ignore ALL white space. The attached simple patch changes this so that it only ignores carriage returns. It does this by passing --strip-trailing-cr to diff instead of -w. This should help us few developers who use Windows to get our white space correct in our expected results so that the tests also pass on non-Windows platforms.
>
> I also faced a similar issue a few days back and tried to search a bit for some simpler option, but couldn't spend too much time on it. I think it would be good if we can make it work without the -w option. When I tried with your patch on Windows, it results in the following:
>
> == running regression test queries ==
> test tablespace ... diff: unrecognized option `--strip-trailing-cr '
> diff: Try `diff --help' for more information.
> diff command failed with status 2: diff --strip-trailing-cr e:/../postgresql/src/test/regress/expected/tablespace.out e:/../postgresql/src/test/regress/results/tablespace.out e:/../postgresql/src/test/regress/results/tablespace.out.diff
>
> Which version of diff are you using? The version of diff on my m/c is:
> diff - GNU diffutils version 2.7

I had a bit of a look around the git repository for diffutils and I've found at least part of the commit which introduced --strip-trailing-cr:

http://git.savannah.gnu.org/cgit/diffutils.git/commit/?id=eefb9adae1642dcb0e2ac523c79998f466e94e77

Although here I can only see that they've added the command line args and not the actual code which implements stripping the carriage returns. Going by that, it seems the option was added back in 2001, but the version you're using is a bit older than that; it seems 2.7 is 19 years old!

http://git.savannah.gnu.org/cgit/diffutils.git/refs/tags

I'm on 2.8.7, which is only 4 years old. I know we don't normally go with bleeding edge, but I'm wondering if we could make the minimum supported diff version at least 2.8 (at least for Windows), which was released in 2002.

Regards

David Rowley
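For clarity, what --strip-trailing-cr does per line can be sketched in C (a simplified model, not diffutils' actual implementation): only a '\r' sitting immediately before the terminating '\n' is dropped, so interior whitespace differences still count, unlike with -w:

```c
#include <assert.h>
#include <string.h>

/* Per-line model of diff's --strip-trailing-cr: drop a '\r' that sits
 * immediately before the terminating '\n', and nothing else.
 * Simplified sketch, not diffutils code. */
static void
strip_trailing_cr(char *line)
{
    size_t len = strlen(line);

    if (len >= 2 && line[len - 2] == '\r' && line[len - 1] == '\n')
    {
        line[len - 2] = '\n';
        line[len - 1] = '\0';
    }
}
```

This is exactly the CRLF-vs-LF difference between Windows and Linux expected files, with none of -w's blanket whitespace forgiveness.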
Re: [HACKERS] truncating pg_multixact/members
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> 1. slru.c doesn't consider file names longer than 4 hexadecimal chars.
>
> Fixing (1) is simple: we can have each SLRU user declare how many digits to have in file names. All existing users but pg_multixact/members should declare 4 digits; that one should declare 5. That way, the correct number of zeroes are allocated at the start point and we get nice, equal-width file names. Eventually, predicate.c can change to wider file names and get rid of some strange code it has to deal with overrun.

That would be nice. There would be the issue of how to deal with pg_upgrade, though. If I remember correctly, there is no strong reason not to blow away any existing files in the pg_serial subdirectory at startup (the way the NOTIFY code does), and at one point I had code to do that. I think we took that code out because the files would be deleted soon enough anyway. Barring objection, deleting them at startup seems like a sane way to handle pg_upgrade issues when we do increase the filename size.

--
Kevin Grittner
EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] planner missing a trick for foreign tables w/OR conditions
Robert Haas robertmh...@gmail.com writes: On Tue, Dec 17, 2013 at 12:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: So at this point I'm pretty much talked into it. We could eliminate the dependence on indexes entirely, and replace this code with a step that simply tries to pull single-base-relation quals out of ORs wherever it can find one. You could argue that the produced quals would sometimes not be worth testing for, but we could apply a heuristic that says to forget it unless the estimated selectivity of the extracted qual is less than, I dunno, 0.5 maybe. This is an area where I think things are very different from local and remote tables. For a local table, the cost of transmitting a row from one plan node to another is basically zero. For a remote table, it's potentially far higher, although unfortunately it's hard to know exactly. But if the qual is cheap to evaluate, and we're getting back a lot of rows, I suspect even eliminating 5-10% of them could work out to a win. With a local table, 50% sounds like a reasonable number. I used 90% as a cutoff in the attached draft patch. Not sure if it's worth expending a lot of sweat on the point. Preliminary testing of this makes it look like a good idea. In particular, it will now extract restriction clauses from ORs even if those clauses have nothing to do with indexes, as exhibited in the second new regression test case. (I was unhappy to find that there were no regression tests covering this behavior at all; though I guess that's not so surprising given that orindxpath.c dates from before we could put EXPLAIN output into the regression tests.) A cosmetic issue that has to be resolved is where to put the new code. orindxpath.c is clearly now a misnomer; in fact, I don't think that the code in this form belongs in optimizer/path/ at all. There's some argument for putting it in initsplan.c, but that file is pretty big already. Maybe a new file in optimizer/util/? Any thoughts on that? 
Also, there's some janitorial cleanup work that remains to be done. I believe that make_restrictinfo_from_bitmapqual is now dead code, and generate_bitmap_or_paths could now be static-ified and probably simplified by dropping its restriction_only option. I've not investigated in detail what we could get rid of once create_or_index_quals is gone.

			regards, tom lane

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 96fe50f..88a566e 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** set_plain_rel_size(PlannerInfo *root, Re
*** 362,378 ****
  	/* Mark rel with estimated output rows, width, etc */
  	set_baserel_size_estimates(root, rel);
- 
- 	/*
- 	 * Check to see if we can extract any restriction conditions from join
- 	 * quals that are OR-of-AND structures.  If so, add them to the rel's
- 	 * restriction list, and redo the above steps.
- 	 */
- 	if (create_or_index_quals(root, rel))
- 	{
- 		check_partial_indexes(root, rel);
- 		set_baserel_size_estimates(root, rel);
- 	}
  }
--- 362,367 ----

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 606734a..0e05107 100644
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
*** choose_bitmap_and(PlannerInfo *root, Rel
*** 1354,1360 ****
  	 * we can remove this limitation.  (But note that this also defends
  	 * against flat-out duplicate input paths, which can happen because
  	 * match_join_clauses_to_index will find the same OR join clauses that
! 	 * create_or_index_quals has pulled OR restriction clauses out of.)
  	 *
  	 * For the same reason, we reject AND combinations in which an index
  	 * predicate clause duplicates another clause.  Here we find it necessary
--- 1354,1361 ----
  	 * we can remove this limitation.  (But note that this also defends
  	 * against flat-out duplicate input paths, which can happen because
  	 * match_join_clauses_to_index will find the same OR join clauses that
! 	 * extract_restriction_or_clauses has pulled OR restriction clauses out
! 	 * of.)
  	 *
  	 * For the same reason, we reject AND combinations in which an index
  	 * predicate clause duplicates another clause.  Here we find it necessary

diff --git a/src/backend/optimizer/path/orindxpath.c b/src/backend/optimizer/path/orindxpath.c
index 16f29d3..5a8d73b 100644
*** a/src/backend/optimizer/path/orindxpath.c
--- b/src/backend/optimizer/path/orindxpath.c
***
*** 15,187 ****
  #include "postgres.h"
  
  #include "optimizer/cost.h"
  #include "optimizer/paths.h"
  #include "optimizer/restrictinfo.h"
! 
! /*----------
!  * create_or_index_quals
   *		Examine join OR-of-AND quals to see if any useful restriction OR
   *		clauses can be extracted.  If so, add them to the query.
   *
!  * Although a join clause must reference other relations overall,
!  * an OR of ANDs
Re: [HACKERS] [BUG FIX] Version number expressed in octal form by mistake
Kevin Grittner kgri...@ymail.com writes:
> Oh, I just noticed that this is for the *pg_restore* code, not the pg_dump code, so there isn't necessarily a conflict with the docs. The pg_dump code does match the docs on its version check. The question becomes, for each supported version, what do we want to set into AHX->minRemoteVersion before opening the connection to the target database? Do we really want a 9.4 executable to be attempting to restore to a 7.1 database cluster? What about backpatching?

On reflection, I'm not sure that pg_restore as such should be applying any server version check at all. pg_restore itself has precious little to do with whether there will be a compatibility problem; that's mostly down to the DDL that pg_dump put into the archive file. And we don't have enough information to be very sure about whether it will work, short of actually trying it. So why should the code arbitrarily refuse to try?

So I'm inclined to propose that we set min/max to 0 and 99 here.

			regards, tom lane
Re: [HACKERS] truncating pg_multixact/members
Kevin Grittner wrote:
> Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> 1. slru.c doesn't consider file names longer than 4 hexadecimal chars.
>>
>> Fixing (1) is simple: we can have each SLRU user declare how many digits to have in file names. All existing users but pg_multixact/members should declare 4 digits; that one should declare 5. That way, the correct number of zeroes are allocated at the start point and we get nice, equal-width file names. Eventually, predicate.c can change to wider file names and get rid of some strange code it has to deal with overrun.
>
> That would be nice. There would be the issue of how to deal with pg_upgrade, though. If I remember correctly, there is no strong reason not to blow away any existing files in the pg_serial subdirectory at startup (the way the NOTIFY code does), and at one point I had code to do that. I think we took that code out because the files would be deleted soon enough anyway. Barring objection, deleting them at startup seems like a sane way to handle pg_upgrade issues when we do increase the filename size.

Agreed. It's easy to have the files deleted at startup now that the truncation stuff uses a callback. There is already a callback that's used to delete all files, so you won't need to write any code to make it behave that way.

FWIW, for pg_multixact/members during pg_upgrade from 9.3 to 9.4 we will need to rename existing files, prepending a zero to each file whose name is four chars in length.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
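The declared-digits idea can be sketched with a variable-width format string (illustrative only, not the actual slru.c code): each SLRU declares its filename width, segment names are zero-padded to that width, and pg_multixact/members gets 5 hex digits while everyone else keeps 4:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch of per-SLRU filename widths: format the segment number as
 * zero-padded uppercase hex of the declared width.  Hypothetical
 * helper, not PostgreSQL source. */
static void
slru_file_name(char *buf, size_t buflen, int ndigits, unsigned int segno)
{
    snprintf(buf, buflen, "%0*X", ndigits, segno);
}
```

This is also why the pg_upgrade rename mentioned above amounts to prepending one zero: the 9.3-era 4-digit names and the new 5-digit names differ only in padding.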
Re: [HACKERS] trailing comment ghost-timing
Andres Freund and...@2ndquadrant.com writes:
> On 2013-12-24 12:27:59 -0500, Tom Lane wrote:
>> What I was proposing was that we do include comments in what we send, as long as those comments are embedded in the statement text, not on lines before it.
>
> The common way I've seen what I've described above done is something like:
>
> /* File:path/to/some/file.whatever Function:foo::something()
>  * Comment: Expensive, but that's ok! */
> SELECT here_comes FROM my WHERE some_sql;
>
> If I understood what you propose correctly, we'd now drop that comment, where we sent it before?

Yeah. But I just noticed that this would cause the regression test results to change dramatically --- right now, most "-- comment" comments get echoed to the output, and that would stop. So maybe it's not such a great idea after all.

			regards, tom lane
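A toy model of the proposal under discussion (hypothetical helper, not psql's real lexer; it ignores /* */ comments and quoting entirely): lines containing only a leading "--" comment before the statement are skipped, while anything after statement text begins, embedded comments included, is kept verbatim:

```c
#include <assert.h>
#include <string.h>

/* Skip whole lines that hold only a leading "--" comment (and blank
 * lines) until real statement text starts; everything from there on,
 * comments included, would be sent to the server.  Returns a pointer
 * into the input buffer.  Toy sketch only. */
static const char *
first_statement_text(const char *buf)
{
    const char *p = buf;

    while (*p)
    {
        const char *line = p;

        while (*line == ' ' || *line == '\t')
            line++;
        if (line[0] == '-' && line[1] == '-')
        {
            /* comment-only line: skip to the next line */
            p = strchr(line, '\n');
            if (p == NULL)
                return line + strlen(line);
            p++;
        }
        else if (*line == '\n')
            p = line + 1;       /* blank line before the statement */
        else
            return p;           /* statement (embedded comments kept) */
    }
    return p;
}
```

Under this model Andres's block comment above the SELECT would be dropped, while a trailing "-- kept" on the statement line would survive, which is exactly the behaviour change being debated.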
Re: [HACKERS] Fix typo in src/backend/utils/mmgr/README
On Wed, 2013-12-25 at 17:50 +0900, Etsuro Fujita wrote:
> Peter Eisentraut wrote:
>> On 12/24/13, 1:33 AM, Etsuro Fujita wrote:
>>> This is a small patch to fix a typo in src/backend/utils/mmgr/README.
>>
>> I don't think that change is correct.
>
> I've fixed the patch, though that might be still wrong. I'm not a native English speaker ;). Please find attached an updated version of the patch.

Committed.
Re: [HACKERS] [PATCH] work_mem calculation possible overflow
David Rowley dgrowle...@gmail.com writes:
> I was just looking through a few of the warnings flagged up by PVS-Studio. I found some warnings around some calculations that were doing work_mem * 1024L and comparing that to a double. On 64-bit Windows, sizeof(long) is 4 bytes. Currently work_mem's maximum value is INT_MAX / 1024, so this should not overflow on 64-bit Windows at the moment, but perhaps it will if work_mem's maximum is raised in the future. In any case, the L suffix on 1024 to widen the type here just seems a bit wrong, given that PostgreSQL supports platforms where sizeof(int) and sizeof(long) are the same.

This is not a bug. Note the limitation on the allowed range of work_mem in guc.c; that's designed precisely to ensure that work_mem * 1024L will not overflow a long. That's perhaps a bit klugy, but if we wanted to remove that assumption we'd have to touch far more places than what you have done here, and most of them would require much uglier changes.

			regards, tom lane
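To make the guard concrete, here is a small sketch (the helper is illustrative; the cap mirrors the guc.c limit on work_mem): with work_mem bounded by INT_MAX / 1024, work_mem * 1024 stays within INT_MAX, so even a 32-bit long, as on 64-bit Windows, cannot overflow:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Mirror of the guc.c-style cap: work_mem (in kB) may not exceed
 * INT_MAX / 1024, so the "work_mem * 1024L" idiom cannot overflow
 * even where long is only 32 bits.  Illustrative sketch. */
#define MAX_KILOBYTES (INT_MAX / 1024)

static int64_t
work_mem_bytes(int work_mem_kb)
{
    /* the 1024L idiom, widened explicitly for clarity */
    return (int64_t) work_mem_kb * 1024;
}
```

The worst case, MAX_KILOBYTES * 1024 = 2147482624 bytes, still fits in a signed 32-bit value, which is the invariant Tom is pointing at.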
[HACKERS] pgcrypto missing header file inclusions
While playing around with the pginclude tools, I noticed that pgcrypto header files are failing to include some header files whose symbols they use. This change would fix it:

diff --git a/contrib/pgcrypto/pgp.h b/contrib/pgcrypto/pgp.h
index 3022abf..f856e07 100644
--- a/contrib/pgcrypto/pgp.h
+++ b/contrib/pgcrypto/pgp.h
@@ -29,6 +29,9 @@
  * contrib/pgcrypto/pgp.h
  */
 
+#include "mbuf.h"
+#include "px.h"
+
 enum PGP_S2K_TYPE
 {
 	PGP_S2K_SIMPLE = 0,

Does that look reasonable?
Re: [HACKERS] Planning time in explain/explain analyze
New version of the patch with updated documentation and which does not display the planning time when COSTS is off. I will add it to the next commitfest.

--
Andreas Karlsson

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
new file mode 100644
index 2af1738..240a3d5
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*** EXPLAIN SELECT * FROM tenk1;
*** 89,94 ****
--- 89,95 ----
                            QUERY PLAN
  -------------------------------------------------------------
   Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
+  Planning time: 0.113 ms
  </screen>
  </para>
*** EXPLAIN SELECT * FROM tenk1;
*** 162,167 ****
--- 163,175 ----
  </para>
  
+ <para>
+  The <literal>Planning time</literal> shown is the time it took to generate
+  the query plan from the parsed query and optimize it.  It does not include
+  rewriting and parsing.  We will not include this line in later examples in
+  this section.
+ </para>
+ 
  <para>
   Returning to our example:
  <screen>
*** WHERE t1.unique1 &lt; 10 AND t1.unique2
*** 564,569 ****
--- 572,578 ----
           Index Cond: (unique1 &lt; 10)
     -&gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10)
           Index Cond: (unique2 = t1.unique2)
+  Planning time: 0.181 ms
   Total runtime: 0.501 ms
  </screen>
*** WHERE t1.unique1 &lt; 100 AND t1.unique2
*** 612,617 ****
--- 621,627 ----
           Recheck Cond: (unique1 &lt; 100)
     -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0) (actual time=0.049..0.049 rows=100 loops=1)
           Index Cond: (unique1 &lt; 100)
+  Planning time: 0.194 ms
   Total runtime: 8.008 ms
  </screen>
*** EXPLAIN ANALYZE SELECT * FROM tenk1 WHER
*** 635,640 ****
--- 645,651 ----
   Seq Scan on tenk1  (cost=0.00..483.00 rows=7000 width=244) (actual time=0.016..5.107 rows=7000 loops=1)
     Filter: (ten &lt; 7)
     Rows Removed by Filter: 3000
+  Planning time: 0.083 ms
   Total runtime: 5.905 ms
  </screen>
*** EXPLAIN ANALYZE SELECT * FROM polygon_tb
*** 657,662 ****
--- 668,674 ----
   Seq Scan on polygon_tbl  (cost=0.00..1.05 rows=1 width=32) (actual time=0.044..0.044 rows=0 loops=1)
     Filter: (f1 @&gt; '((0.5,2))'::polygon)
     Rows Removed by Filter: 4
+  Planning time: 0.040 ms
   Total runtime: 0.083 ms
  </screen>
*** EXPLAIN ANALYZE SELECT * FROM polygon_tb
*** 675,680 ****
--- 687,693 ----
   Index Scan using gpolygonind on polygon_tbl  (cost=0.13..8.15 rows=1 width=32) (actual time=0.062..0.062 rows=0 loops=1)
     Index Cond: (f1 @&gt; '((0.5,2))'::polygon)
     Rows Removed by Index Recheck: 1
+  Planning time: 0.034 ms
   Total runtime: 0.144 ms
  </screen>
*** EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM
*** 705,710 ****
--- 718,724 ----
     -&gt;  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0) (actual time=0.227..0.227 rows=999 loops=1)
           Index Cond: (unique2 &gt; 9000)
           Buffers: shared hit=5
+  Planning time: 0.088 ms
   Total runtime: 0.423 ms
  </screen>
*** EXPLAIN ANALYZE UPDATE tenk1 SET hundred
*** 732,737 ****
--- 746,752 ----
           Recheck Cond: (unique1 &lt; 100)
     -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0) (actual time=0.043..0.043 rows=100 loops=1)
           Index Cond: (unique1 &lt; 100)
+  Planning time: 0.079 ms
   Total runtime: 14.727 ms
  ROLLBACK;
*** EXPLAIN ANALYZE SELECT * FROM tenk1 WHER
*** 817,822 ****
--- 832,838 ----
     Index Cond: (unique2 &gt; 9000)
     Filter: (unique1 &lt; 100)
     Rows Removed by Filter: 287
+  Planning time: 0.096 ms
   Total runtime: 0.336 ms
  </screen>

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
new file mode 100644
index 35264dc..4e3f571
*** a/doc/src/sgml/ref/explain.sgml
--- b/doc/src/sgml/ref/explain.sgml
*** ROLLBACK;
*** 145,151 ****
  <para>
   Include information on the estimated startup and total cost of each
   plan node, as well as the estimated number of rows and the estimated
!  width of each row.  This parameter defaults to <literal>TRUE</literal>.
  </para>
  </listitem>
  </varlistentry>
--- 145,152 ----
  <para>
   Include information on the estimated startup and total cost of each
   plan node, as well as the estimated number of rows and the estimated
!  width of each row.  Additionally it controls if the planning time is
!  displayed.  This parameter defaults to <literal>TRUE</literal>.
  </para>
  </listitem>
  </varlistentry>
*** EXPLAIN SELECT * FROM foo;
*** 289,295 ****
                            QUERY PLAN
  ---
[HACKERS] Bogus error handling in pg_upgrade
A credulous person might suppose that this chunk of code is designed to abort if pg_resetxlog fails:

	prep_status("Setting next transaction ID for new cluster");
	exec_prog(UTILITY_LOG_FILE, NULL, true,
			  "\"%s/pg_resetxlog\" -f -x %u \"%s\"",
			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
			  new_cluster.pgdata);
	check_ok();

In point of fact, it does no such thing, but blithely continues (even though pg_resetxlog has corrupted things horribly before failing). check_ok() is particularly badly named, since it contains not one iota of error checking. misleadingly_claim_ok() would be a better name. If this isn't broken-by-design, I'd like an explanation why not.

In case you're wondering, I'm investigating the problem mentioned at 1387636762.30013.13.ca...@vanquo.pezone.net. I see this output:

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows on the new cluster                        ok
Deleting files from new pg_clog                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID for new cluster                 *failure*

Consult the last few lines of pg_upgrade_utility.log for
the probable cause of the failure.
                                                            ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      *failure*

Consult the last few lines of pg_upgrade_utility.log for
the probable cause of the failure.
                                                            ok
                                                            *failure*

Consult the last few lines of pg_upgrade_server_start.log or
pg_upgrade_server.log for the probable cause of the failure.

connection to database failed: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket /Users/tgl/pgsql/contrib/pg_upgrade/.s.PGSQL.57632?

could not connect to new postmaster started with the command:
/Users/tgl/pgsql/contrib/pg_upgrade/tmp_check/install//Users/tgl/testversion/bin/pg_ctl -w -l pg_upgrade_server.log -D /Users/tgl/pgsql/contrib/pg_upgrade/tmp_check/data -o "-p 57632 -b -c synchronous_commit=off -c fsync=off -c full_page_writes=off -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/Users/tgl/pgsql/contrib/pg_upgrade'" start
Failure, exiting
make: *** [check] Error 1

I think the actual problem is that pg_resetxlog rewrites pg_control, zaps everything in pg_xlog/, and then fails before writing a new initial xlog segment. However, pg_upgrade isn't making this any easier to investigate by failing to stop at the first sign of trouble.

			regards, tom lane
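The fail-fast behaviour being asked for can be sketched like this (a POSIX-only toy, not pg_upgrade's actual exec_prog()): inspect the child's exit status and report failure to the caller, so the caller can abort instead of printing "ok" and marching on:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Run a command and report whether it actually succeeded: 0 on a
 * clean zero exit, -1 otherwise.  A checked exec wrapper would abort
 * on -1 rather than let a later check_ok() claim success.
 * Hypothetical sketch, not pg_upgrade source. */
static int
run_checked(const char *cmd)
{
    int rc = system(cmd);

    if (rc == -1 || !WIFEXITED(rc) || WEXITSTATUS(rc) != 0)
        return -1;              /* caller must stop, not print "ok" */
    return 0;
}
```

With this shape, the pg_resetxlog failure above would have stopped the upgrade at "Setting next transaction ID for new cluster" instead of cascading into the later connection failure.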
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Tue, Dec 24, 2013 at 11:38 PM, Fujii Masao masao.fu...@gmail.com wrote: On Sun, Dec 22, 2013 at 8:32 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Sun, Dec 22, 2013 at 3:01 PM, Fujii Masao masao.fu...@gmail.com wrote: I found a bug in the ALTER SYSTEM SET patch. The procedure to reproduce it is here.

$ psql
=# ALTER SYSTEM SET shared_buffers = '512MB';
ALTER SYSTEM
=# \q
$ pg_ctl -D data reload
server signaled
2013-12-22 18:24:13 JST LOG: received SIGHUP, reloading configuration files
2013-12-22 18:24:13 JST LOG: parameter shared_buffers cannot be changed without restarting the server
2013-12-22 18:24:13 JST LOG: configuration file X?? contains errors; unaffected changes were applied

Your analysis is absolutely right. The reason for this issue is that we want to display the filename to which the erroneous parameter belongs, but in this case we free the parameter info before the actual error is reported. To fix it, we need to save the filename value before freeing the parameter list. Please find attached a patch that fixes this issue.

When I read ProcessConfigFile() more carefully, I found another related problem. The procedure to reproduce the problem is here.

$ psql
=# ALTER SYSTEM SET shared_buffers = '256MB';
=# \q
$ echo shared_buffers = '256MB' >> $PGDATA/postgresql.conf
$ pg_ctl -D $PGDATA reload
2013-12-25 03:05:44 JST LOG: parameter shared_buffers cannot be changed without restarting the server
2013-12-25 03:05:44 JST LOG: parameter shared_buffers cannot be changed without restarting the server
2013-12-25 03:05:44 JST LOG: configuration file /dav/alter-system/data/postgresql.auto.conf contains errors; unaffected changes were applied

Both postgresql.conf and postgresql.auto.conf contain a setting of the parameter that cannot be changed without restarting the server, but only postgresql.auto.conf was logged, which would be confusing, I'm afraid.
The reason for the above behaviour is that while applying configuration parameters, we just note down the last file that has an error and ignore the others. If we want to change the behaviour for the above case so that it displays both files, then during processing of parameters we can save the erroneous conf file whenever it differs from the previous one. However, if the user uses the include mechanism of conf files and specifies some of the changed parameters in a new file, then a similar thing can again be observed. I think the basic idea here is that the user should avoid using multiple methods to change configuration parameters; however, we cannot stop them from doing so. So in cases like the above, where the user mixes multiple methods of changing configuration, it can take more work to understand and detect the error. A similar behaviour can also be observed if the user uses the include mechanism to specify some config parameters without any ALTER SYSTEM changes at all.

I think that this problem should be fixed together with the problem that I reported before. Thought?

Here I think you might be worried that if we fix the behaviour reported in this mail, it might overlap with the changes of the previous fix. So let us first decide whether we want to keep the current behaviour as it is or would like to change it, and if it needs change, what the new behaviour should be.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Fri, Dec 27, 2013 at 5:01 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Dec 24, 2013 at 10:08 AM, Fujii Masao masao.fu...@gmail.com wrote: When I read ProcessConfigFile() more carefully, I found another related problem. The procedure to reproduce the problem is here.

$ psql
=# ALTER SYSTEM SET shared_buffers = '256MB';
=# \q
$ echo shared_buffers = '256MB' >> $PGDATA/postgresql.conf
$ pg_ctl -D $PGDATA reload
2013-12-25 03:05:44 JST LOG: parameter shared_buffers cannot be changed without restarting the server
2013-12-25 03:05:44 JST LOG: parameter shared_buffers cannot be changed without restarting the server
2013-12-25 03:05:44 JST LOG: configuration file /dav/alter-system/data/postgresql.auto.conf contains errors; unaffected changes were applied

Both postgresql.conf and postgresql.auto.conf contain a setting of the parameter that cannot be changed without restarting the server, but only postgresql.auto.conf was logged, which would be confusing, I'm afraid. I think that this problem should be fixed together with the problem that I reported before. Thought?

I have run into this problem on many occasions because my benchmark scripts usually append a hunk of stuff to postgresql.conf, and any parameters that were already set generate this warning every time you reload, because postgresql.conf now mentions that parameter twice. I'm not quite sure what other problem you want to fix it together with,

The 2 problems are:

1. If the user changes a config parameter (for example, shared_buffers) both by manually editing postgresql.conf and by using the ALTER SYSTEM command, and then reloads the config, the message for parameter 'shared_buffers' is shown twice, but only the last file where it exists is named. The detailed description of the problem is in the current mail.

2. In some cases the name of the file displayed is junk; the problem and a fix can be found at this link:
http://www.postgresql.org/message-id/CAA4eK1+-yD2p=+6tdj_xpruhn4m_ckvfr51ah0gv3sxlgoa...@mail.gmail.com

and I'm not sure what the right fix is either, but +1 for coming up with some solution that's better than what we have now.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Attached revision only uses heavyweight page locks across complex operations. I haven't benchmarked it, but it appears to perform reasonably well. I haven't attempted to measure a regression for regular insertion, but offhand it seems likely that any regression would be well within the noise - more or less immeasurably small.

I won't repeat too much of what is already well commented in the patch. For those who would like a relatively quick summary of what I've done, I include inline a new section that I've added to the nbtree README:

Notes about speculative insertion
---------------------------------

As an amcanunique AM, the btree implementation is required to support speculative insertion. This means that the value locking method through which unique index enforcement conventionally occurs is extended and generalized, such that insertion is staggered: the core code attempts to get full consensus on whether values proposed for insertion will not cause duplicate key violations. Speculative insertion is only possible for unique index insertion without deferred uniqueness checking (since speculative insertion into a deferred unique constraint's index is a contradiction in terms).

For conventional unique index insertion, the btree implementation exclusive-locks a buffer holding the first page that the value to be inserted could possibly be on, though only for an instant, during and shortly after uniqueness verification. It would not be acceptable to hold this lock across complex operations for the duration of the remainder of the first phase of speculative insertion. Therefore, we convert this exclusive buffer lock to an exclusive page lock managed by the lock manager, thereby greatly ameliorating the consequences of undiscovered deadlocking implementation bugs (though deadlocks are not expected), and minimizing the impact on system interruptibility, while not affecting index scans.

It may be useful to informally think of the page lock type acquired by speculative insertion as similar to an intention exclusive lock, a type of lock found in some legacy 2PL database systems that use multi-granularity locking. A session establishes the exclusive right to subsequently establish a full write lock, without actually blocking reads of the page unless and until a lock conversion actually occurs, at which point both reads and writes are blocked. Under this mental model, buffer shared locks can be thought of as intention shared locks.

As implemented, these heavyweight locks are only relevant to the insertion case; at no other point are they actually considered, since insertion is the only way through which new values are introduced. The first page a value proposed for insertion into an index could be on represents a natural choke point for our extended, though still rather limited, system of value locking. Naturally, when we perform a lock escalation and acquire an exclusive buffer lock, all other buffer locks on the same buffer are blocked, which is how the implementation localizes knowledge about the heavyweight lock to insertion-related routines. Apart from deletion, which is concomitantly prevented by holding a pin on the buffer throughout, all exclusive locking of btree buffers happens as a direct or indirect result of insertion, so this approach is sufficient. (Actually, an exclusive lock may still be acquired without insertion to initialize a root page, but that hardly matters.)

Note that all value locks (including buffer pins) are dropped immediately as speculative insertion is aborted, as the implementation waits on the outcome of another xact, or as insertion proper occurs. These page-level locks are not intended to last more than an instant. In general, the protocol for heavyweight locking of btree pages is that heavyweight locks are acquired before any buffer locks are held, while the locks are only released after all buffer locks are released. While not a hard and fast rule, presently we avoid heavyweight-locking more than one page per unique index concurrently.

Happy new year

--
Peter Geoghegan

btreelock_insert_on_dup.v5.2013_12_28.patch.gz
Description: GNU Zip compressed data
Re: [HACKERS] PoC: Partial sort
On Sun, Dec 29, 2013 at 4:51 AM, Alexander Korotkov aekorot...@gmail.com wrote: I've compiled it with clang. Yeah, there were mixed declarations. I've rechecked it with gcc, and now it gives no warnings. I didn't try it with Visual Studio, but I hope it will be OK.

Thanks for the patch. It now compiles without any problems. I've been doing a bit of testing with the patch, trying a few different workloads. One thing that I've found is that in my test case, when the table contains only 1 tuple for any given presort group, the query is actually slower than when there are, say, 100 tuples to sort per presort group. Here is my test case:

DROP TABLE IF EXISTS temperature_readings;

CREATE TABLE temperature_readings (
  readingid SERIAL NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  locationid INT NOT NULL,
  temperature INT NOT NULL,
  PRIMARY KEY (readingid)
);

INSERT INTO temperature_readings (timestamp, locationid, temperature)
SELECT ts.timestamp, loc.locationid, -10 + random() * 40
FROM generate_series('1900-04-01', '2000-04-01', '1 day'::interval) ts(timestamp)
CROSS JOIN generate_series(1,1) loc(locationid);

VACUUM ANALYZE temperature_readings;

-- Warm buffers
SELECT AVG(temperature) FROM temperature_readings;

explain (buffers, analyze)
select * from temperature_readings order by timestamp, locationid;
-- (seqscan -> sort) 70.805 ms

-- create an index to allow presorting on timestamp.
CREATE INDEX temperature_readings_timestamp_idx
  ON temperature_readings (timestamp);

-- warm index buffers
SELECT COUNT(*) FROM (SELECT timestamp FROM temperature_readings ORDER BY timestamp) c;

explain (buffers, analyze)
select * from temperature_readings order by timestamp, locationid;
-- (index scan -> partial sort) 253.032 ms

The first query, without the index to presort on, runs in 70.805 ms; the 2nd query uses the index to presort and runs in 253.032 ms. I ran the code through a performance profiler and found that about 80% of the time is spent in tuplesort_end and tuplesort_begin_heap.

If it were possible to devise some way to reuse the previous tuplesortstate, perhaps by inventing a reset method which clears out the tuples, then we could see performance exceed that of the standard seqscan -> sort plan. The code as it stands seems to look up the sort functions from the syscache for each group and then allocate some sort space, so quite a bit of time is also spent in palloc0() and pfree(). If that is not possible, then maybe adding a cost per sort group would be better, so that the optimization is skipped when there are too many sort groups.

Regards

David Rowley

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] [COMMITTERS] pgsql: Upgrade to Autoconf 2.69
Peter Eisentraut pete...@gmx.net writes: On Fri, 2013-12-20 at 10:54 -0300, Alvaro Herrera wrote: Evidently something is not going well in ReadRecord. It should have reported the read failure, but didn't. That seems a separate bug that needs to be fixed. This is enabling large-file support on OS X, so that seems kind of important. It's not failing with newer versions of OS X, so that leaves the following possibilities, I think:

[ gets out ancient PPC laptop ... ]
[ man, this thing is slow ... ]
[ hours later ... ]

There are three or four different bugs here, but the key points are:

1. pg_resetxlog is diligently trashing every single WAL file in pg_xlog/, and then failing (by virtue of some ancient OS X bug in readdir()), so that it doesn't get to the point of recreating a WAL file with a checkpoint record.

2. pg_resetxlog already rewrote pg_control, which might not be such a hot idea; but whether it had or not, pg_control's last-checkpoint pointer would be pointing to a WAL file that is not in pg_xlog/ now.

3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going.

4. The server tries to start, and fails because it can't find a WAL file containing the last checkpoint record. This is pretty unsurprising given the facts above. The reason you don't see any "no such file" report is that XLogFileRead() will report any BasicOpenFile() failure *other than* ENOENT. And nothing else makes up for that.

Re point 4: the logic, if you can call it that, in xlog.c and xlogreader.c is making my head spin. There are about four levels of overcomplicated and undercommented code before you ever get down to XLogFileRead, so I have no idea which level to blame for the lack of error reporting in this specific case. But there are pretty clearly some cases in which ignoring ENOENT in XLogFileRead isn't such a good idea, and XLogFileRead isn't being told when to do that or not.

Re point 3: I already bitched about that.

Re point 2: I wonder whether pg_resetxlog shouldn't do things in a different order. Rewriting pg_control to point to a file that doesn't exist yet doesn't sound great. I'm not sure, though, whether there's any great improvement to be had here; basically, if we fail partway through, we're probably screwed no matter what.

Re point 1: there are threads in our archives suggesting that EINVAL from readdir() after unlinking a directory entry is a known problem on some versions of OS X, e.g.
http://www.postgresql.org/message-id/flat/47c45b07-8459-48d8-8fbe-291864019...@me.com
and I also find stuff on the intertubes suggesting that renaming an entry can cause it too, e.g.
http://www.dovecot.org/list/dovecot/2009-July/041009.html

We thought at the time that the readdir bug was new in OS X 10.6, but I'll bet it was there in 10.5's 64-bit-inode support and was only exposed to default builds when 10.6 turned on 64-bit inodes in userland by default. So Apple fixed it in 10.6.2 but never back-patched the fix into 10.5.x --- shame on them. I'm disinclined to contort our stuff to any great extent to fix it, because all the OS X versions mentioned are several years obsolete. Perhaps, though, we should override Autoconf's setting of _DARWIN_USE_64_BIT_INODE, if we can do that easily? It's clearly not nearly as problem-free on 10.5 as the Autoconf boys believe, and it's already enabled by default on the release series where it does work.

regards, tom lane