Re: [HACKERS] CREATE POLICY and RETURNING
On 10/17/2014 02:49 AM, Robert Haas wrote:
> I think you could probably make the DELETE policy control what can get
> deleted, but then have the SELECT policy further filter what gets
> returned.

That seems like the worst of both worlds to me. Suddenly DELETE ... RETURNING might delete more rows than it reports a result set for.

As well as being potentially dangerous for people using it in wCTEs, etc., to me that's the most astonishing possible outcome of all. I'd be much happier with even:

    ERROR: RETURNING not permitted with SELECT row-security policy

than this.

--
Craig Ringer    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
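Craig's astonishment scenario can be made concrete with a sketch (hypothetical table and policies, using the CREATE POLICY syntax under development in this thread; this shows the behavior being criticized, not committed behavior):

```sql
CREATE POLICY msgs_select ON msgs FOR SELECT
    USING (owner = current_user);   -- users can only see their own rows
CREATE POLICY msgs_delete ON msgs FOR DELETE
    USING (true);                   -- but anyone may (blindly) delete

-- Under the "SELECT policy filters RETURNING" proposal:
DELETE FROM msgs RETURNING *;
-- would delete ALL rows, yet report only those with owner = current_user;
-- a wCTE consuming this RETURNING set would silently miss the rest.
```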
Re: [HACKERS] printing table in asciidoc with psql
Hi Szymon

I found a small bug - it doesn't escape | well:

postgres=# select * from mytab;
  a    | numeric_b |     c
-------+-----------+------------
 Ahoj  |        10 | 2014-10-17
 Hello |        20 | 2014-10-18
 Hi    |        30 | 2014-10-19
 aaa|  |           | 2014-10-17
(4 rows)

result:

[options=header,cols=literal,literal,literal,frame=all,grid=all]
|====
^| +++a+++ ^| +++numeric_b+++ ^| +++c+++
| Ahoj | 10 | 2014-10-17
| Hello | 20 | 2014-10-18
| Hi | 30 | 2014-10-19
| aaa| | | 2014-10-17
|====

Next, I tested it with asciidoc and asciidoctor, and I have a problem with asciidoctor - it doesn't respect alignment, so numbers are aligned to the left instead of to the right.

When you use the header option, the +++ formatting is useless.

Regards

Pavel

2014-09-17 21:26 GMT+02:00 Szymon Guz <mabew...@gmail.com>:
> On 17 September 2014 19:55, Szymon Guz <mabew...@gmail.com> wrote:
>> On 17 September 2014 19:30, Peter Eisentraut <pete...@gmx.net> wrote:
>>> On 9/16/14 3:52 PM, Szymon Guz wrote:
>>>> It's not finished yet, I'm not sure if there is any sense in
>>>> supporting border types etc.
>>>
>>> AFAICT, Asciidoc doesn't support border types, so (if so) you should
>>> just ignore that setting.
>>
>> Too late, I've done something like this:
>> border=0 [frame=none,grid=none]
>> border=1 [frame=all,grid=none]
>> border=2 [frame=all,grid=all]
>>
>> thanks, Szymon
>
> Hi, thanks for all the remarks. I've attached another version of this
> patch. I think it's done.
>
> - This works: `\pset format asciidoc`
> - Output is formatted as asciidoc tables.
> - There is support for borders {0,1,2}. The attached html file was made
>   by running tests for psql, taking the asciidoc tables from it, and
>   converting to html with `asciidoc file`.
>   -- border = 0 - [frame=none,grid=none]
>   -- border = 1 - [frame=none,grid=all]
>   -- border = 2 - [frame=all,grid=all]
> - There are also tests.
>   -- For normal and extended mode combined with each of the border values.
>   -- With column names made of characters which need escaping.
>   -- With values (with characters that need escaping, the string '11' and
>      the integer 11 - they should have different right-left alignment).
> - Documentation for psql is updated.
> - According to Emanuel's advice: help.c is updated.
>
> The attached html file contains tables from the test in this order:
> normal, border 0
> normal, border 1
> normal, border 2
> expanded, border 0
> expanded, border 1
> expanded, border 2
>
> regards,
> Szymon
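The workflow described above can be sketched as a psql session (illustrative only; the table contents and the exact asciidoc attributes emitted by the patch are assumptions based on the descriptions in this thread):

```
postgres=# \pset format asciidoc
Output format is asciidoc.
postgres=# \pset border 2
Border style is 2.
postgres=# SELECT 'Ahoj' AS a, 10 AS numeric_b;
[frame=all,grid=all]
... asciidoc table output ...

$ psql -P format=asciidoc -c "SELECT 1" mydb > out.adoc
$ asciidoc out.adoc    # convert the asciidoc table to out.html
```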
Re: [HACKERS] proposal: plpgsql - Assert statement
On 16/10/14 13:29, Pavel Stehule wrote:
> Hi
>
> 2014-10-14 22:57 GMT+02:00 Petr Jelinek <p...@2ndquadrant.com>:
>> Short review of the patch.
>>
>> Note that this has nothing to do with actual assertions; at the moment
>> it's just an enhancement of the RAISE statement to make it optionally
>> conditional. As I was one of the people who voted for it, I do think we
>> want this, and I like the syntax.
>>
>> Code applies cleanly, seems formatted according to project standards -
>> there is unnecessary whitespace added in the variable declaration part
>> of exec_stmt_raise which should be removed.
>
> fixed
>
>> Passes make check. I would prefer to have a little more complex
>> expression than just "true" in the new regression test added for this
>> feature.
>
> fixed
>
>> Did some manual testing, seems to work as advertised. There are no docs
>> at the moment, which is the only show-stopper that I can see so far.
>
> fixed

Great, looks good to me, marking as ready for committer.

--
Petr Jelinek    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] CREATE POLICY and RETURNING
Robert Haas wrote
>>> That's an argument in favour of only applying a read-filtering policy
>>> where a RETURNING clause is present, but that introduces the "surprise!
>>> the effects of your DELETE changed based on an unrelated clause!" issue.
>>
>> No- if we were going to do this, I wouldn't want to change the existing
>> structure but rather provide either:
>> a) a way to simply disable RETURNING if the policy is in effect and the
>>    policy creator doesn't wish to allow it
>> b) allow the user to define another clause which would be applied to the
>>    rows in the RETURNING set
>
> I think you could probably make the DELETE policy control what can get
> deleted, but then have the SELECT policy further filter what gets
> returned.

Without commenting on the usefulness of blind deletes... how about returning a placeholder row, but with all the values replaced with NULL?

In the absence of RETURNING, does the delete count show the total number of rows deleted, or only the number of rows deleted that the user would be aware of if they issued a SELECT with the same criteria? Whatever the answer, the number of rows returned with RETURNING should match the row count normally noted.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/CREATE-POLICY-and-RETURNING-tp5823192p5823374.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] CREATE POLICY and RETURNING
On 17 October 2014 07:57, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 10/17/2014 02:49 AM, Robert Haas wrote:
>> I think you could probably make the DELETE policy control what can get
>> deleted, but then have the SELECT policy further filter what gets
>> returned.
>
> That seems like the worst of both worlds to me. Suddenly DELETE ...
> RETURNING might delete more rows than it reports a resultset for.
>
> As well as being potentially dangerous for people using it in wCTEs,
> etc, to me that's the most astonishing possible outcome of all. I'd be
> much happier with even:
>
>     ERROR: RETURNING not permitted with SELECT row-security policy
>
> than this.

+1

This suggestion is most in line with what I would expect to occur.

Thom
Re: [HACKERS] Materialized views don't show up in information_schema
2014-10-16 Stephen Frost <sfr...@snowman.net>:
> Alright, coming back to this, I have to ask- how are matviews different
> from views from the SQL standard's perspective?

Matviews that are always up to date when you access them are semantically exactly the same as normal views. Matviews that can get out of date, however, are not.

Nicolas

--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?
Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback
>>> In synchronous mode, pg_receivexlog should have similar logic as
>>> walreceiver does.
>>
>> OK. I understand that removing --fsync-interval has no problem.
>
> +1 for adding something like a --synchronous option. To me, it sounds
> like a walreceiver-compatible mode rather than synchronous.
>
>>> Better to add a new notify message type. And pg_receivexlog should be
>>> prepared to receive it at any time. The status might change on the
>>> fly, if the server's configuration is reloaded.
>>
>> OK. I'll consider it.
>
> I don't think that's a good idea because it prevents us from using
> pg_receivexlog as an async walreceiver (i.e., received WAL data is
> fsynced and feedback is sent back to the server soon, but transaction
> commit in the server doesn't wait for the feedback).
>
> Sync rep works by setting parameters on the master. Standby servers send
> replies by default, though you can turn replies off. pg_receivexlog
> should work the same, but can't do this because it doesn't know the
> fsync position unless it fsyncs. So it's not appropriate to have an
> option called --synchronous, in the same way that there is no parameter
> called "synchronous" on the standby, for good reason.
>
> A new parameter to send feedback should be called --feedback
>
> A second parameter to decide whether to fsync should be called --fsync
>
>     if (feedback && fsync)
>         send fsynced LSN;
>     else if (feedback)
>         send received LSN;
>     /* else send no feedback */

Thanks for the comment. The patch cannot be applied to HEAD cleanly, so I updated it.

> So it's not appropriate to have an option called --synchronous, in the
> same way that there is no parameter called "synchronous" on the standby,
> for good reason.

In the case of gathering the options into one option, would changing the name --synchronous to some other name solve the problem?

> A new parameter to send feedback should be called --feedback
>
> A second parameter to decide whether to fsync should be called --fsync

I think keeping --reply-fsync and --fsync-interval is better than making new options. Thoughts?

Regards,
--
Furuya Osamu

pg_receivexlog-fsync-feedback-v6.patch
Description: pg_receivexlog-fsync-feedback-v6.patch
Re: [HACKERS] Materialized views don't show up in information_schema
Nicolas Barbier wrote
> 2014-10-16 Stephen Frost <sfr...@snowman.net>:
>> Alright, coming back to this, I have to ask- how are matviews different
>> from views from the SQL standard's perspective?
>
> Matviews that are always up to date when you access them are
> semantically exactly the same as normal views. Matviews that can get out
> of date, however, are not.

Materialized views share features and properties of both tables and views - and omit capabilities available to both as well. The performance optimization spoken of is basically the table aspect of the feature, while the backing query makes it look like a view. But all the while it is a distinct feature, and one not described in the SQL standard.

From a read-only perspective I can see the value of having this particular row-source available in the standard information schema, but anything trying to manipulate a matview as either a view or a table will be surprised. Since the standard doesn't distinguish between the read and write aspects of the object types, there isn't a safe way to add matviews to the information schema that doesn't violate the intent of the provided view.

If the application/users want to support/use PostgreSQL-specific features, it/they have to be ready and able to use the catalog.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Materialized-views-don-t-show-up-in-information-schema-tp5822643p5823379.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] Improve automatic analyze messages for inheritance trees
(2014/10/16 17:17), Simon Riggs wrote:
> On 16 October 2014 06:49, Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> wrote:
>> How about this?
>> automatic analyze of table \"%s.%s.%s\" as inheritance tree

Thank you for the comment.

> Would it be useful to keep track of how many tables just got analyzed?
> i.e. analyze of foo (including N inheritance children)

I think that's a good idea. So, I'll update the patch.

Thanks,

Best regards,
Etsuro Fujita
Re: [HACKERS] detect custom-format dumps in psql and emit a useful error
Hi,

Regarding Loading Custom Format Dump:
===
When we supply a plain SQL file to pg_restore, we get the following error:

$ ./install/bin/pg_restore a.sql
pg_restore: [archiver] input file does not appear to be a valid archive

So I would expect a similar kind of message when we provide a non-plain SQL file to psql. Something like:

input file does not appear to be a valid sql script file (use pg_restore instead)

I have added the additional details in parentheses since we correctly identified it as a custom dump file and the user wanted to restore it. However, I do not see any issue with the patch.

Regarding Directory Error:
===
I am strongly against the proposal. This patch changes the error message to something like this:

psql:blah:0: Input path is a directory. Use pg_restore to restore directory-format database dumps.

So even when I accidentally provide a directory instead of a SQL script file and have NO intention of restoring a dump, the above message looks weird. Instead, the current message looks perfectly fine here, i.e.:

could not read from input file: Is a directory

psql always expects a file and NOT a directory. Also, it is not necessarily working on restoring a dump.

Thanks

On Fri, Oct 17, 2014 at 9:41 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 09/18/2014 05:58 PM, Andres Freund wrote:
>> I don't think we need to make any distinction between psql -f mydb.dump,
>> psql < mydb.dump, and whatever | psql. Just check, when noninteractively
>> reading the first line in mainloop.c:MainLoop(), whether it starts with
>> the magic header. That'd also trigger the warning on \i pg_restore_file,
>> but that's hardly a problem.
>
> Done, patch attached.
>
> If psql sees that the first line begins with PGDMP it'll emit:
>
>     The input is a PostgreSQL custom-format dump.
>     Use the pg_restore command-line client to restore this dump to a database.
>
> then discard the rest of the current input source.
>
> pg_restore already knows to tell you to use psql if it sees an SQL file
> as input.
>
>> Having something similar for pg_dump would be really useful.
>
> Agreed.
>
>> We could additionally write out a hint whenever a directory is fed to
>> psql -f that psql can't be used to read directory type dumps.
>
> Unlike the confusion between pg_restore and psql for custom file format
> dumps, I haven't seen people getting this one muddled. Perhaps directory
> format dumps are just a bit more niche, or perhaps it's just more
> obvious that:
>
>     psql:sometump:0: could not read from input file: Is a directory
>
> ... means psql is the wrong tool.
>
> Still, separate patch attached. psql will now emit:
>
>     psql:blah:0: Input path is a directory.
>     Use pg_restore to restore directory-format database dumps.
>
> I'm less sure that this is a worthwhile improvement than the check for
> PGDMP and custom format dumps though.
>
> --
> Craig Ringer    http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
[HACKERS] [PATCH] add ssl_protocols configuration option
The attached patches add an ssl_protocols configuration option which controls which versions of SSL or TLS the server will use. The syntax is similar to Apache's SSLProtocols directive, except that the list is colon-separated instead of whitespace-separated, although that is easy to change if it proves unpopular.

Summary of the patch:

- In src/backend/libpq/be-secure.c:
  - Add an SSLProtocols variable for the option.
  - Add a function, parse_SSL_protocols(), that parses an ssl_protocols
    string and returns a bitmask suitable for SSL_CTX_set_options().
  - Change initialize_SSL() to call parse_SSL_protocols() and pass the
    result to SSL_CTX_set_options().
- In src/backend/utils/misc/guc.c:
  - Add an extern declaration for SSLProtocols.
  - Add an entry in the ConfigureNamesString array for the ssl_protocols
    option.
- In src/backend/utils/misc/postgresql.conf.sample:
  - Add a sample ssl_protocols line.
- In doc/src/sgml/config.sgml:
  - Document the ssl_protocols option.

The file names are slightly different in 9.5, since be-secure.c was split in two and the declaration was moved into libpq.h.

The default is ALL:-SSLv2 in 9.0-9.3 and ALL:-SSL in 9.4 and up. This corresponds to the current hardcoded values, so the default behavior is unchanged, but the admin now has the option to select a different setting, e.g. if a serious vulnerability is found in TLS 1.0.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6ee17d8..7233a73 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1027,6 +1027,34 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-ssl-protocols" xreflabel="ssl_protocols">
+      <term><varname>ssl_protocols</varname> (<type>string</type>)</term>
+      <indexterm>
+       <primary><varname>ssl_protocols</> configuration parameter</primary>
+      </indexterm>
+      <listitem>
+       <para>
+        Specifies a colon-separated list of <acronym>SSL</> protocols that are
+        allowed to be used on secure connections. Each entry in the list can
+        be prefixed by a <literal>+</> (add to the current list)
+        or <literal>-</> (remove from the current list). If no prefix is used,
+        the list is replaced with the specified protocol.
+       </para>
+       <para>
+        The full list of supported protocols can be found in
+        the <application>openssl</> manual page. In addition to the names of
+        individual protocols, the following keywords can be
+        used: <literal>ALL</> (all protocols supported by the underlying
+        crypto library), <literal>SSL</> (all supported versions
+        of <acronym>SSL</>) and <literal>TLS</> (all supported versions
+        of <acronym>TLS</>).
+       </para>
+       <para>
+        The default is <literal>ALL:-SSL</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-ssl-ciphers" xreflabel="ssl_ciphers">
       <term><varname>ssl_ciphers</varname> (<type>string</type>)
        <indexterm>
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index b05364c..f440b77 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -87,6 +87,7 @@ static int verify_cb(int, X509_STORE_CTX *);
 static void info_cb(const SSL *ssl, int type, int args);
 static void initialize_ecdh(void);
 static const char *SSLerrmessage(void);
+static long parse_SSL_protocols(const char *str);
 
 /* are we in the middle of a renegotiation? */
 static bool in_ssl_renegotiation = false;
@@ -245,15 +246,16 @@ be_tls_init(void)
 				 SSLerrmessage())));
 	}
 
-	/* set up ephemeral DH keys, and disallow SSL v2/v3 while at it */
+	/* set up ephemeral DH keys */
 	SSL_CTX_set_tmp_dh_callback(SSL_context, tmp_dh_cb);
-	SSL_CTX_set_options(SSL_context,
-						SSL_OP_SINGLE_DH_USE |
-						SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
+	SSL_CTX_set_options(SSL_context, SSL_OP_SINGLE_DH_USE);
 
 	/* set up ephemeral ECDH keys */
 	initialize_ecdh();
 
+	/* set up the allowed protocol list */
+	SSL_CTX_set_options(SSL_context, parse_SSL_protocols(SSLProtocols));
+
 	/* set up the allowed cipher list */
 	if (SSL_CTX_set_cipher_list(SSL_context, SSLCipherSuites) != 1)
 		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
@@ -1053,3 +1055,106 @@ SSLerrmessage(void)
 	snprintf(errbuf, sizeof(errbuf), _("SSL error code %lu"), errcode);
 	return errbuf;
 }
+
+
+/*
+ * Parse the SSL allowed protocol list
+ *
+ * The logic here is inverted. OpenSSL does not take a list of
+ * protocols to use, but a list of protocols to avoid. We use the
+ * same bits with the opposite meaning, then invert the result.
+ */
+
+#define SSL_PROTO_SSLv2 SSL_OP_NO_SSLv2
+#define SSL_PROTO_SSLv3 SSL_OP_NO_SSLv3
+#define SSL_PROTO_SSL (SSL_PROTO_SSLv2 | SSL_PROTO_SSLv3)
+#define SSL_PROTO_TLSv1 SSL_OP_NO_TLSv1
+#ifdef SSL_OP_NO_TLSv1_1
+#define SSL_PROTO_TLSv1_1 SSL_OP_NO_TLSv1_1
+#else
+#define SSL_PROTO_TLSv1_1 0
+#endif
+#ifdef SSL_OP_NO_TLSv1_2
Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback
On 17 October 2014 09:55, <furu...@pm.nttdata.co.jp> wrote:
>> A new parameter to send feedback should be called --feedback
>>
>> A second parameter to decide whether to fsync should be called --fsync
>
> I think keeping --reply-fsync and --fsync-interval is better than making
> new options. Thoughts?

We already have hot_standby_feedback, so using the name "feedback" is the best idea. I am suggesting that we send feedback even if we do not fsync, to allow the master to track our progress. Hence the name of the second parameter was just "fsync".

So both names were suggested because of links to those terms already being used for similar reasons elsewhere in Postgres.

--
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] proposal: plpgsql - Assert statement
2014-10-17 9:14 GMT+02:00 Petr Jelinek <p...@2ndquadrant.com>:
> On 16/10/14 13:29, Pavel Stehule wrote:
>> Hi
>>
>> 2014-10-14 22:57 GMT+02:00 Petr Jelinek <p...@2ndquadrant.com>:
>>> Short review of the patch.
>>>
>>> Note that this has nothing to do with actual assertions; at the moment
>>> it's just an enhancement of the RAISE statement to make it optionally
>>> conditional. As I was one of the people who voted for it, I do think
>>> we want this, and I like the syntax.
>>>
>>> Code applies cleanly, seems formatted according to project standards -
>>> there is unnecessary whitespace added in the variable declaration part
>>> of exec_stmt_raise which should be removed.
>>
>> fixed
>>
>>> Passes make check. I would prefer to have a little more complex
>>> expression than just "true" in the new regression test added for this
>>> feature.
>>
>> fixed
>>
>>> Did some manual testing, seems to work as advertised. There are no
>>> docs at the moment, which is the only show-stopper that I can see so
>>> far.
>>
>> fixed
>
> Great, looks good to me, marking as ready for committer.

Thank you very much

Pavel

> --
> Petr Jelinek    http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] CREATE POLICY and RETURNING
On Fri, Oct 17, 2014 at 3:49 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>> That's an argument in favour of only applying a read-filtering policy
>>> where a RETURNING clause is present, but that introduces the "surprise!
>>> the effects of your DELETE changed based on an unrelated clause!" issue.
>>
>> No- if we were going to do this, I wouldn't want to change the existing
>> structure but rather provide either:
>> a) a way to simply disable RETURNING if the policy is in effect and the
>>    policy creator doesn't wish to allow it
>> b) allow the user to define another clause which would be applied to the
>>    rows in the RETURNING set
>
> I think you could probably make the DELETE policy control what can get
> deleted, but then have the SELECT policy further filter what gets
> returned.

+1

That's more intuitive to me because another security feature, privileges, works that way: the SELECT privilege, not DELETE, controls RETURNING.

Another minor problem that I found is that pg_dump always fails when there is a row-level policy for UPDATE. I think that the attached patch should be applied.

Regards,

--
Fujii Masao

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c56a4cb..16ebc12 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -2939,7 +2939,7 @@ dumpRowSecurity(Archive *fout, DumpOptions *dopt, RowSecurityInfo *rsinfo)
 		cmd = "SELECT";
 	else if (strcmp(rsinfo->rseccmd, "a") == 0)
 		cmd = "INSERT";
-	else if (strcmp(rsinfo->rseccmd, "u") == 0)
+	else if (strcmp(rsinfo->rseccmd, "w") == 0)
 		cmd = "UPDATE";
 	else if (strcmp(rsinfo->rseccmd, "d") == 0)
 		cmd = "DELETE";
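The privilege behaviour Fujii points to can be observed on a stock PostgreSQL instance: RETURNING requires SELECT privilege on the returned columns, so a role granted only DELETE may delete rows but not report them back (role and table names here are illustrative):

```sql
CREATE TABLE t (id int);
CREATE ROLE deleter LOGIN;
GRANT DELETE ON t TO deleter;   -- note: no SELECT granted

SET ROLE deleter;
DELETE FROM t;                  -- allowed: DELETE privilege suffices
DELETE FROM t RETURNING id;     -- fails: permission denied (needs SELECT on id)
```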
Re: [HACKERS] [PATCH] add ssl_protocols configuration option
On Fri, Oct 17, 2014 at 7:58 PM, Dag-Erling Smørgrav <d...@des.no> wrote:
> The default is ALL:-SSLv2 in 9.0-9.3 and ALL:-SSL in 9.4 and up. This
> corresponds to the current hardcoded values, so the default behavior is
> unchanged, but the admin now has the option to select a different
> setting, e.g. if a serious vulnerability is found in TLS 1.0.

Please note that new features can only be added to the version currently in development, aka 9.5. You should as well register your patch in the current commit fest; I think you are still in time:
https://commitfest.postgresql.org/action/commitfest_view?id=24

--
Michael
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.9
On Tue, Oct 14, 2014 at 11:34 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> I am not sure why we are seeing a difference even though running on the
> same m/c with the same configuration.

I have tried many times, but I could not get the numbers you have posted above with HEAD; however, now trying with the latest version [1] posted by you, everything seems to be fine at this workload. The data at higher client counts is as below:

HEAD - commit 494affb
Shared_buffers=8GB; Scale Factor = 3000

Client Count / Runs (tps)    64       128
Run-1                        271799   24
Run-2                        274341   245207
Run-3                        275019   252258

HEAD - commit 494affb + wait free lw_shared_v2
Shared_buffers=8GB; Scale Factor = 3000

Client Count / Runs (tps)    64       128
Run-1                        286209   274922
Run-2                        289101   274495
Run-3                        289639   273633

So I am planning to proceed further with the review/test of your latest patch. According to me, the below things are left from my side:
a. do some basic tpc-b tests with the patch
b. re-review the latest version posted by you

I know that you have posted an optimization into StrategyGetBuffer() in this thread; however, I feel we can evaluate it separately unless you are of the opinion that both patches should go together.

[1] http://www.postgresql.org/message-id/20141010111027.gc6...@alap3.anarazel.de

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] CREATE POLICY and RETURNING
* Fujii Masao (masao.fu...@gmail.com) wrote:
> Another minor problem that I found is that pg_dump always fails when
> there is a row-level policy for update. I think that the attached patch
> should be applied.

Urgh. That was tested before, but we switched the characters used and evidently missed that. :(

Will fix, thanks!

Stephen

signature.asc
Description: Digital signature
Re: [HACKERS] CREATE POLICY and RETURNING
* Thom Brown (t...@linux.com) wrote:
> On 17 October 2014 07:57, Craig Ringer <cr...@2ndquadrant.com> wrote:
>> On 10/17/2014 02:49 AM, Robert Haas wrote:
>>> I think you could probably make the DELETE policy control what can get
>>> deleted, but then have the SELECT policy further filter what gets
>>> returned.
>>
>> That seems like the worst of both worlds to me. Suddenly DELETE ...
>> RETURNING might delete more rows than it reports a resultset for.
>>
>> As well as being potentially dangerous for people using it in wCTEs,
>> etc, to me that's the most astonishing possible outcome of all. I'd be
>> much happier with even:
>>
>>     ERROR: RETURNING not permitted with SELECT row-security policy
>>
>> than this.
>
> +1
>
> This suggestion is most in line with what I would expect to occur.

This was along the lines that I've been thinking for how to address this also, and I think it's the least surprising - but I want it controllable. Thoughts on 'WITH RETURNING' / 'WITHOUT RETURNING' and what the default should be?

Thanks!

Stephen
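Stephen's proposed control could plausibly live in the policy definition itself. A purely hypothetical sketch of such syntax (it does not exist in PostgreSQL; the clause names come only from his question above):

```sql
-- Hypothetical syntax, for discussion only
CREATE POLICY acct_delete ON accounts
    FOR DELETE
    USING (owner = current_user)
    WITHOUT RETURNING;

DELETE FROM accounts WHERE id = 42 RETURNING *;
-- would fail with something like:
-- ERROR: RETURNING not permitted by row-security policy
```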
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On Fri, Oct 17, 2014 at 1:31 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 16 October 2014 15:09, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> I think doing anything on the server side can have higher complexity,
>> like:
>> a. Does this function return immediately after sending the request to
>>    autovacuum? If yes, then the behaviour of this new functionality
>>    will be different as compared to vacuumdb, which the user might not
>>    expect.
>> b. We need to have some way to handle errors that can occur in
>>    autovacuum (may be need to find a way to pass them back to the
>>    user); vacuumdb or VACUUM can report errors to the user.
>> c. How does the nworkers input relate to autovacuum_max_workers, which
>>    is needed at start for shared memory initialization and in the
>>    calculation of maxbackends?
>> d. How to handle database-wide vacuum, which is possible via vacuumdb?
>> e. What is the best UI (a command like above or via config parameters)?
>
> c) seems like the only issue that needs any thought. I don't think it's
> going to be that hard. I don't see any problems with the other points.
> You can make a function wait, if you wish.

It is quite possible, but still I think that to accomplish such a function, we need to have some mechanism where it can inform autovacuum, and then some changes in autovacuum to receive/read that information and reply back. I don't think any such mechanism exists. I think we can find a way for the above, and maybe for any other similar things that need to be taken care of, but still I think it is better that we have this feature via vacuumdb rather than adding complexity in server code. Also, the current algorithm used in the patch was discussed and agreed upon in this thread, and if we now want to go via some other method (autovacuum), it might take much more time to build consensus on all the points.

> Well, I read Alvaro's point from earlier in the thread and agreed with
> it. All we really need is an instruction to autovacuum to say "be
> aggressive". Just because somebody added something to the TODO list
> doesn't make it a good idea. I apologise to Dilip for saying this; it is
> not anything against him, just the idea. Perhaps we just accept the
> patch and change AV in the future.

So shall we move ahead with the review of this patch and make a note of changing AV in the future (maybe a TODO)?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] CREATE POLICY and RETURNING
* David G Johnston (david.g.johns...@gmail.com) wrote:
> How about returning a placeholder row but with all the values replaced
> with NULL?

I don't think that would be a good approach. A user actually looking at those rows would be highly confused.

> In the absence of returning does the delete count show the total number
> of rows deleted or only the number of rows deleted that the user would
> be aware of if they issued a select with the same criteria? Whatever the
> answer the number of rows returned with returning should match the row
> count normally noted.

Today, everything matches up, yes. Having rows which are deleted but which don't show up in RETURNING could certainly surprise people and applications, which is why I tend to favor the 'all-or-error' approach that others have also suggested. Adding that wouldn't be difficult, though we'd need to decide which should be the default.

Thanks!

Stephen
Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On Thu, Oct 9, 2014 at 3:26 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Wed, Oct 8, 2014 at 10:00 PM, Michael Paquier
> <michael.paqu...@gmail.com> wrote:
>> The additional process at promotion sounds like a good idea, I'll try
>> to get a patch done tomorrow. This would result as well in removing the
>> XLogArchiveForceDone stuff.
>
> Either way, not that I have been able to reproduce the problem manually,
> things can be clearly solved.
>
> Please find attached two patches aimed to fix this issue and to improve
> the situation:
> - 0001 prevents the apparition of those phantom WAL segment files by
>   ensuring that when a node is in recovery it will remove them whatever
>   their status in archive_status. This patch is the real fix, and
>   should be applied down to 9.2.
> - 0002 is a patch implementing Heikki's idea of enforcing all the
>   segment files present in pg_xlog to have their status set to .done,
>   marking them for removal. When looking at the code, I finally
>   concluded that Fujii-san's point, about marking in all cases as .done
>   segment files that have been fully streamed, actually makes more
>   sense. This patch would actually not be mandatory for back-patching,
>   but it makes the process more robust IMO.

Thanks for the patches! I found one problem in the 0002 patch.

The patch changes the recovery so that it creates .done files for every WAL file which exists in the pg_xlog directory at the end of recovery. But even WAL files which will have to be archived later can exist in pg_xlog at that moment - for example, the latest, recycled, and fully-written-but-not-archived-yet (i.e., maybe having .ready files) WAL files. The patch wrongly prevents them from being archived at all.

ISTM that the 0001 patch has a similar problem. Please imagine the following scenario:

1. There are many unarchived WAL files in pg_xlog because of, for example, the continuous failure of archive_command, and then the server unfortunately crashes because of the corruption of the database itself.
2. The DBA restores the backup onto the server and copies all the WAL files from the old pg_xlog to the new one. Then he or she prepares for archive recovery.
3. The DBA starts the server and the archive recovery starts.
4. After all the archived WAL files are replayed, the server tries to replay the WAL files in pg_xlog. Since there are many WAL files in pg_xlog, more than one restartpoint happens while they are being replayed.

In this case, the patch seems to make the restartpoint recycle even WAL files which have .ready files and will have to be archived later.

Thoughts?

Regards,

--
Fujii Masao
Re: [HACKERS] CREATE POLICY and RETURNING
On Fri, Oct 17, 2014 at 7:46 AM, Stephen Frost sfr...@snowman.net wrote: Thoughts on 'WITH RETURNING' / 'WITHOUT RETURNING' and what the default should be? That sounds like a horrendous pile of nasty complication for no good purpose. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Vitesse DB call for testing
Hi everyone, Vitesse DB 9.3.5.S is Postgres 9.3.5 with an LLVM-JIT query executor designed for compute-intensive OLAP workloads. We have gotten it to a reasonable state and would like to open it up to the pg hackers community for testing and suggestions. Vitesse DB offers:

-- JIT Compilation for compute-intensive queries
-- CSV parsing with SSE instructions
-- 100% binary compatibility with PG9.3.5.

Our results show CSV imports run up to 2X faster, and TPCH Q1 runs 8X faster. Our TPCH 1GB benchmark results are also available at http://vitessedata.com/benchmark/ . Please direct any questions by email to ck...@vitessedata.com . Thank you for your help. -- CK Tan Vitesse Data, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE POLICY and RETURNING
On 2014-10-17 14:57:03 +0800, Craig Ringer wrote: On 10/17/2014 02:49 AM, Robert Haas wrote: I think you could probably make the DELETE policy control what can get deleted, but then have the SELECT policy further filter what gets returned. That seems like the worst of both worlds to me. Suddenly DELETE ... RETURNING might delete more rows than it reports a resultset for. As well as being potentially dangerous for people using it in wCTEs, etc, to me that's the most astonishing possible outcome of all. I'd be much happier with even: ERROR: RETURNING not permitted with SELECT row-security policy FWIW, that doesn't sound acceptable to me. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] add ssl_protocols configuration option
Michael Paquier michael.paqu...@gmail.com writes: Please note that new features can only be added to the version currently in development, aka 9.5. I understand this policy. However, this new feature a) has absolutely no impact unless the admin makes a conscious decision to use it and b) will make life much easier for everyone if a POODLE-like vulnerability is discovered in TLS. You should as well register your patch to the current commit fest, I think you are still in time: https://commitfest.postgresql.org/action/commitfest_view?id=24 Thanks for reminding me. DES -- Dag-Erling Smørgrav - d...@des.no -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vitesse DB call for testing
Hi, On 2014-10-17 05:32:13 -0700, CK Tan wrote: Vitesse DB 9.3.5.S is Postgres 9.3.5 with a LLVM-JIT query executor designed for compute intensive OLAP workload. We have gotten it to a reasonable state and would like to open it up to the pg hackers community for testing and suggestions. Vitesse DB offers -- JIT Compilation for compute-intensive queries -- CSV parsing with SSE instructions -- 100% binary compatibility with PG9.3.5. Our results show CSV imports run up to 2X faster, and TPCH Q1 runs 8X faster. How are these modifications licensed? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 17 October 2014 12:52, Amit Kapila amit.kapil...@gmail.com wrote: It is quite possible, but still I think to accomplish such a function, we need to have some mechanism where it can inform auto vacuum and then some changes in auto vacuum to receive/read that information and reply back. I don't think any such mechanism exists. That's exactly how the CHECKPOINT command works, in conjunction with the checkpointer process. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Amit Kapila wrote: On Fri, Oct 17, 2014 at 1:31 AM, Simon Riggs si...@2ndquadrant.com wrote: On 16 October 2014 15:09, Amit Kapila amit.kapil...@gmail.com wrote: c) seems like the only issue that needs any thought. I don't think it's going to be that hard. I don't see any problems with the other points. You can make a function wait, if you wish. It is quite possible, but still I think to accomplish such a function, we need to have some mechanism where it can inform auto vacuum and then some changes in auto vacuum to receive/read that information and reply back. I don't think any such mechanism exists. You're right, it doesn't. I think we have plenty more infrastructure for that than we had when autovacuum was initially developed. It shouldn't be that hard. Of course, this is a task that requires much more thinking, design, and discussion than just adding multi-process capability to vacuumdb ... -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On Fri, Oct 17, 2014 at 9:23 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Oct 9, 2014 at 3:26 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Wed, Oct 8, 2014 at 10:00 PM, Michael Paquier michael.paqu...@gmail.com wrote: The additional process at promotion sounds like a good idea, I'll try to get a patch done tomorrow. This would result as well in removing the XLogArchiveForceDone stuff. Either way, not that I have been able to reproduce the problem manually, things can be clearly solved. Please find attached two patches aimed to fix this issue and to improve the situation: - 0001 prevents the apparition of those phantom WAL segment file by ensuring that when a node is in recovery it will remove it whatever its status in archive_status. This patch is the real fix, and should be applied down to 9.2. - 0002 is a patch implementing Heikki's idea of enforcing all the segment files present in pg_xlog to have their status to .done, marking them for removal. When looking at the code, I finally concluded that Fujii-san's point, about marking in all cases as .done segment files that have been fully streamed, actually makes more sense to not be backward. This patch would actually not be mandatory for back-patching, but it makes the process more robust IMO. Thanks for the patches!

While reviewing the patch, I found another bug related to WAL files in recovery mode. The problem is that exitArchiveRecovery() always creates a .ready file for the last WAL file of the old timeline, even when it's restored from the archive and has a .done file. So this causes the already-archived WAL file to be archived again. Attached patch fixes this bug. Regards, -- Fujii Masao

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo)
*** 5351,5368 ****
  	 * for the new timeline.
  	 *
  	 * Notify the archiver that the last WAL segment of the old timeline is
! 	 * ready to copy to archival storage. Otherwise, it is not archived for a
! 	 * while.
  	 */
  	if (endTLI != ThisTimeLineID)
  	{
  		XLogFileCopy(endLogSegNo, endTLI, endLogSegNo);
  
! 		if (XLogArchivingActive())
! 		{
! 			XLogFileName(xlogpath, endTLI, endLogSegNo);
! 			XLogArchiveNotify(xlogpath);
! 		}
  	}
  
  	/*
--- 5351,5367 ----
  	 * for the new timeline.
  	 *
  	 * Notify the archiver that the last WAL segment of the old timeline is
! 	 * ready to copy to archival storage if its .done file doesn't exist
! 	 * (e.g., if it's the restored WAL file, it's expected to have .done file).
! 	 * Otherwise, it is not archived for a while.
  	 */
  	if (endTLI != ThisTimeLineID)
  	{
  		XLogFileCopy(endLogSegNo, endTLI, endLogSegNo);
  
! 		/* Create .ready file only when neither .ready nor .done files exist */
! 		XLogFileName(xlogpath, endTLI, endLogSegNo);
! 		XLogArchiveCheckDone(xlogpath);
  	}
  
  	/*
*** a/src/backend/access/transam/xlogarchive.c
--- b/src/backend/access/transam/xlogarchive.c
*************** XLogArchiveNotify(const char *xlog)
*** 516,521 ****
--- 516,527 ----
  	char		archiveStatusPath[MAXPGPATH];
  	FILE	   *fd;
  
+ 	/*
+ 	 * We should not create .ready file for already archived WAL file to
+ 	 * prevent it from being archived again.
+ 	 */
+ 	Assert(XLogArchiveIsBusy(xlog));
+ 
  	/* insert an otherwise empty file called XLOG.ready */
  	StatusFilePath(archiveStatusPath, xlog, ".ready");
  	fd = AllocateFile(archiveStatusPath, "w");

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vitesse DB call for testing
On Fri, Oct 17, 2014 at 7:32 AM, CK Tan ck...@vitessedata.com wrote: Hi everyone, Vitesse DB 9.3.5.S is Postgres 9.3.5 with a LLVM-JIT query executor designed for compute intensive OLAP workload. We have gotten it to a reasonable state and would like to open it up to the pg hackers community for testing and suggestions. Vitesse DB offers -- JIT Compilation for compute-intensive queries -- CSV parsing with SSE instructions -- 100% binary compatibility with PG9.3.5. Our results show CSV imports run up to 2X faster, and TPCH Q1 runs 8X faster. Our TPCH 1GB benchmark results is also available at http://vitessedata.com/benchmark/ . Please direct any questions by email to ck...@vitessedata.com . You offer a binary with 32k block size...what's the rationale for that? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On Fri, Oct 17, 2014 at 9:23 PM, Fujii Masao masao.fu...@gmail.com wrote: In this case, the patch seems to make the restartpoint recycle even WAL files which have .ready files and will have to be archived later. Thought? The real problem currently is that it is possible to have a segment file not marked as .done during recovery when the stream connection is abruptly cut as this segment is switched: the segment is marked as .ready in archive_status and simply left in pg_xlog, where it will neither be recycled nor removed. I have not been able to look much at this code these days, so I am not sure how invasive it would be in back-branches, but perhaps we should try to improve the code so that when a segment file is switched and the connection is cut, we guarantee that this file is completed and marked as .done. -- Michael
Re: [HACKERS] Vitesse DB call for testing
On Fri, Oct 17, 2014 at 8:14 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Oct 17, 2014 at 7:32 AM, CK Tan ck...@vitessedata.com wrote: Hi everyone, Vitesse DB 9.3.5.S is Postgres 9.3.5 with a LLVM-JIT query executor designed for compute intensive OLAP workload. We have gotten it to a reasonable state and would like to open it up to the pg hackers community for testing and suggestions. Vitesse DB offers -- JIT Compilation for compute-intensive queries -- CSV parsing with SSE instructions -- 100% binary compatibility with PG9.3.5. Our results show CSV imports run up to 2X faster, and TPCH Q1 runs 8X faster. Our TPCH 1GB benchmark results is also available at http://vitessedata.com/benchmark/ . Please direct any questions by email to ck...@vitessedata.com . You offer a binary with 32k block size...what's the rationale for that? (sorry for the double post) OK, I downloaded the ubuntu binary and ran your benchmarks (after making some minor .conf tweaks like disabling SSL). I then ran your benchmark (after fixing the typo) of the count/sum/avg test -- *and noticed a 95% reduction in runtime* which is really quite amazing IMNSHO. I also ran a select-only test on small scale factor pgbench and didn't see any regression there -- in fact you beat stock by ~ 3% (although this could be measurement noise). So now you've got my attention. So, if you don't mind, quit being coy and explain how the software works and all the neat things it does and doesn't do. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma atri.j...@gmail.com wrote: On Wednesday, October 15, 2014, Marti Raudsepp ma...@juffo.org wrote: Hi On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma atri.j...@gmail.com wrote: Please find attached a patch which implements support for UPDATE table1 SET(*)=... I presume you haven't read Tom Lane's proposal and discussion about multiple column assignment in UPDATE: http://www.postgresql.org/message-id/1783.1399054...@sss.pgh.pa.us (Assigning all columns was also discussed there) And there's a WIP patch: http://www.postgresql.org/message-id/20930.1402931...@sss.pgh.pa.us Thanks for the links, but this patch only targets SET(*) case, which, if I understand correctly, the patch you mentioned doesn't directly handle (If I understand correctly, the target of the two patches is different). Yeah -- in fact, there was some discussion about this exact case. This patch solves a very important problem: when doing record operations to move data between databases with identical schema there's currently no way to 'update' in a generic way without building out the entire field list via complicated and nasty dynamic SQL. I'm not sure about the proposed syntax though; it seems a little weird to me. Any particular reason why you couldn't have just done: UPDATE table1 SET * = a,b,c, ... also, UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...); seems cleaner than the proposed syntax for row assignment. Tom objected though IIRC. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Fri, Oct 17, 2014 at 7:45 PM, Merlin Moncure mmonc...@gmail.com wrote: On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma atri.j...@gmail.com wrote: On Wednesday, October 15, 2014, Marti Raudsepp ma...@juffo.org wrote: Hi On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma atri.j...@gmail.com wrote: Please find attached a patch which implements support for UPDATE table1 SET(*)=... I presume you haven't read Tom Lane's proposal and discussion about multiple column assignment in UPDATE: http://www.postgresql.org/message-id/1783.1399054...@sss.pgh.pa.us (Assigning all columns was also discussed there) And there's a WIP patch: http://www.postgresql.org/message-id/20930.1402931...@sss.pgh.pa.us Thanks for the links, but this patch only targets SET(*) case, which, if I understand correctly, the patch you mentioned doesn't directly handle (If I understand correctly, the target of the two patches is different). Yeah -- in fact, there was some discussion about this exact case. This patch solves a very important problem: when doing record operations to move data between databases with identical schema there's currently no way to 'update' in a generic way without building out the entire field list via complicated and nasty dynamic SQL. Thanks! I'm not sure about the proposed syntax though; it seems a little weird to me. Any particular reason why you couldn't have just done: UPDATE table1 SET * = a,b,c, ... also, UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...); I honestly have not spent a lot of time thinking about the exact syntax that may be acceptable. If we have arguments for or against a specific syntax, I will be glad to incorporate them.
Re: [HACKERS] Support UPDATE table SET(*)=...
On 10/17/14 4:15 PM, Merlin Moncure wrote: Any particular reason why you couldn't have just done: UPDATE table1 SET * = a,b,c, ... That just looks wrong to me. I'd prefer (*) = .. over that any day. UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...); seems cleaner than the proposed syntax for row assignment. Tom objected though IIRC. I don't know about Tom, but I didn't like that: http://www.postgresql.org/message-id/5364c982.7060...@joh.to .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja ma...@joh.to wrote: On 10/17/14 4:15 PM, Merlin Moncure wrote: Any particular reason why you couldn't have just done: UPDATE table1 SET * = a,b,c, ... That just looks wrong to me. I'd prefer (*) = .. over that any day. UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...); seems cleaner than the proposed syntax for row assignment. Tom objected though IIRC. I don't know about Tom, but I didn't like that: http://www.postgresql.org/message-id/5364c982.7060...@joh.to Hm, I didn't understand your objection: quoting So e.g.: UPDATE foo f SET f = ..; would resolve to the table, despite there being a column called f? That would break backwards compatibility. /quoting That's not correct: it should work exactly as 'select' does; given a conflict resolve the field name, so no backwards compatibility issue. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On 10/17/14 5:03 PM, Merlin Moncure wrote: Hm, I didn't understand your objection: quoting So e.g.: UPDATE foo f SET f = ..; would resolve to the table, despite there being a column called f? That would break backwards compatibility. /quoting That's not correct: it should work exactly as 'select' does; given a conflict resolve the field name, so no backwards compatibility issue.

local:marko=# show server_version;
 server_version
----------------
 9.1.13
(1 row)

local:marko=#* create table foo(f int);
CREATE TABLE
local:marko=#* update foo f set f=1;
UPDATE 0

This query would change meaning with your suggestion. I'm not saying it would be a massive problem in practice, but I think we should first consider options which don't break backwards compatibility, even if some consider them less clean. .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 17 October 2014 14:05, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Of course, this is a task that requires much more thinking, design, and discussion than just adding multi-process capability to vacuumdb ... Yes, please proceed with this patch as originally envisaged. No more comments from me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
Merlin Moncure mmonc...@gmail.com writes: On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja ma...@joh.to wrote: I don't know about Tom, but I didn't like that: http://www.postgresql.org/message-id/5364c982.7060...@joh.to Hm, I didn't understand your objection: quoting So e.g.: UPDATE foo f SET f = ..; would resolve to the table, despite there being a column called f? That would break backwards compatibility. /quoting That's not correct: it should work exactly as 'select' does; given a conflict resolve the field name, so no backwards compatibility issue. The point is that it's fairly messy (and bug-prone) to have a syntax where we have to make an arbitrary choice between two reasonable interpretations. If you look back at the whole thread Marko's above-cited message is in, we'd considered a bunch of different possible syntaxes for this, and none of them had much support. The (*) idea actually is starting to look pretty good to me. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
Merlin Moncure mmonc...@gmail.com writes: On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma atri.j...@gmail.com wrote: Thanks for the links, but this patch only targets SET(*) case, which, if I understand correctly, the patch you mentioned doesn't directly handle (If I understand correctly, the target of the two patches is different). Yeah -- in fact, there was some discussion about this exact case. This patch solves a very important problem: when doing record operations to move data between databases with identical schema there's currently no way to 'update' in a generic way without building out the entire field list via complicated and nasty dynamic SQL. I'm not sure about the proposed syntax though; it seems a little weird to me. Any particular reason why you couldn't have just done: UPDATE table1 SET * = a,b,c, ... also, UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...); seems cleaner than the proposed syntax for row assignment. Tom objected though IIRC. That last proposal is no good because it would be ambiguous if the table contains a column by that name. The (*) idea actually seems not bad, since it's shorthand for a parenthesized column list. I'm not sure about the patch itself though --- in particular, it doesn't seem to me that it should be touching transformTargetList, since that doesn't have anything to do with expansion of multiassignments today. Probably this is a symptom of having chosen a poor representation of the construct in the raw grammar output. However, I've not exactly wrapped my head around what that representation is ... the lack of any comments explaining it doesn't help. A larger question is whether it's appropriate to do the *-expansion at parse time, or whether we'd need to postpone it to later in order to handle reasonable use-cases. Since we expand SELECT * at parse time (and are mandated to do so by the SQL spec, I believe), it seems consistent to do this at parse time as well; but perhaps there is a counter argument. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Fri, Oct 17, 2014 at 10:10 AM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja ma...@joh.to wrote: I don't know about Tom, but I didn't like that: http://www.postgresql.org/message-id/5364c982.7060...@joh.to Hm, I didn't understand your objection: quoting So e.g.: UPDATE foo f SET f = ..; would resolve to the table, despite there being a column called f? That would break backwards compatibility. /quoting That's not correct: it should work exactly as 'select' does; given a conflict resolve the field name, so no backwards compatibility issue. The point is that it's fairly messy (and bug-prone) to have a syntax where we have to make an arbitrary choice between two reasonable interpretations. If you look back at the whole thread Marko's above-cited message is in, we'd considered a bunch of different possible syntaxes for this, and none of them had much support. The (*) idea actually is starting to look pretty good to me. Hm, I'll take it then. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
Marko Tiikkaja ma...@joh.to writes:

local:marko=#* create table foo(f int);
CREATE TABLE
local:marko=#* update foo f set f=1;
UPDATE 0

This query would change meaning with your suggestion. I think it wouldn't; Merlin is proposing that f would be taken as the field name. A more realistic objection goes like this:

create table foo(f int, g int);
update foo x set x = (1,2);        -- works
alter table foo add column x int;
update foo x set x = (1,2,3);      -- no longer works

It's not a real good thing if a column addition or renaming can so fundamentally change the nature of a query. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane t...@sss.pgh.pa.us wrote: Marko Tiikkaja ma...@joh.to writes: local:marko=#* create table foo(f int); CREATE TABLE local:marko=#* update foo f set f=1; UPDATE 0 This query would change meaning with your suggestion. I think it wouldn't; Merlin is proposing that f would be taken as the field name. A more realistic objection goes like this: create table foo(f int, g int); update foo x set x = (1,2); -- works alter table foo add column x int; update foo x set x = (1,2,3); -- no longer works It's not a real good thing if a column addition or renaming can so fundamentally change the nature of a query. That's exactly how SELECT works. I also dispute that the user should be surprised in such cases. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Trailing comma support in SELECT statements
Pavel Stehule pavel.steh...@gmail.com wrote: do you know, so this feature is a proprietary and it is not based on ANSI/SQL? Any user, that use this feature and will to port to other database will hate it. I remember that Sybase ASE allowed a trailing comma within the parentheses of a table definition, which was handy. I checked on SQL Fiddle and found that MS SQL Server and MySQL both allow that, too; although Oracle does not. I'm not taking a position on whether we should allow this in PostgreSQL, but not having it is likely to annoy some users moving *to* PostgreSQL, while having it is likely to annoy some users moving *away* from PostgreSQL. None of the products I tried allowed a leading comma. I didn't test, and have no knowledge regarding, how other products treat extra commas elsewhere. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
Merlin Moncure mmonc...@gmail.com writes: On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think it wouldn't; Merlin is proposing that f would be taken as the field name. A more realistic objection goes like this: create table foo(f int, g int); update foo x set x = (1,2); -- works alter table foo add column x int; update foo x set x = (1,2,3); -- no longer works It's not a real good thing if a column addition or renaming can so fundamentally change the nature of a query. That's exactly how SELECT works. I also dispute that the user should be surprised in such cases. Well, the reason we have a problem in SELECT is that we support an unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that SELECT foo FROM foo could represent a whole-row selection is nowhere to be found in the SQL standard, for good reason IMO. But we've never had the courage to break cleanly with this Berkeley leftover and insist that people spell it SQL-style as SELECT ROW(foo.*) FROM foo. I'd just as soon not introduce the same kind of ambiguity into UPDATE if we have a reasonable alternative. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vitesse DB call for testing
Merlin, glad you tried it. We take the query plan exactly as given by the planner and decide whether to JIT or to punt depending on the cost. If we punt, it goes back to the pg executor. If we JIT and we cannot proceed (usually because some operators haven't been implemented yet), we again punt. Once we are able to generate the code, there is no going back; we call into LLVM to obtain the function entry point, and run it to completion.

The 3% improvement you see in OLTP tests is definitely noise. The bigint sum,avg,count case in the example you tried has some optimization: we use int128 to accumulate the bigint instead of numeric in pg, hence the big speedup. Try the same query on int4 for the improvement where both pg and Vitesse DB are using int4 in the execution. The speedup is really noticeable when the data type is non-varlena. In the varlena cases, we still call into pg routines most of the time. Again, try the sum,avg,count query on numeric, and you will see what I mean. Also, we don't support UDFs at the moment, so all queries involving UDFs get sent to the pg executor.

On your question of 32k page size, the rationale is that some of our customers could be interested in a data warehouse on pg. 32k page size is a big win when all you do is seqscan all day long.

We are looking for bug reports at this stage, and some stress tests done without our own prejudices. Some tests on real data in a non-prod setting, on queries that are highly CPU-bound, would be ideal. Thanks, -cktan

On Oct 17, 2014, at 6:43 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Oct 17, 2014 at 8:14 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Oct 17, 2014 at 7:32 AM, CK Tan ck...@vitessedata.com wrote: Hi everyone, Vitesse DB 9.3.5.S is Postgres 9.3.5 with a LLVM-JIT query executor designed for compute intensive OLAP workload. We have gotten it to a reasonable state and would like to open it up to the pg hackers community for testing and suggestions.
Vitesse DB offers -- JIT Compilation for compute-intensive queries -- CSV parsing with SSE instructions -- 100% binary compatibility with PG9.3.5. Our results show CSV imports run up to 2X faster, and TPCH Q1 runs 8X faster. Our TPCH 1GB benchmark results is also available at http://vitessedata.com/benchmark/ . Please direct any questions by email to ck...@vitessedata.com . You offer a binary with 32k block size...what's the rationale for that? (sorry for the double post) OK, I downloaded the ubuntu binary and ran your benchmarks (after making some minor .conf tweaks like disabling SSL). I then ran your benchmark (after fixing the typo) of the count/sum/avg test -- *and noticed a 95% reduction in runtime performance* which is really quite amazing IMNSHO. I also ran a select only test on small scale factor pgbench and didn't see any regression there -- in fact you beat stock by ~ 3% (although this could be measurement noise). So now you've got my attention. So, if you don't mind, quit being coy and explain how the software works and all the neat things it does and doesn't do. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Fri, Oct 17, 2014 at 10:30 AM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think it wouldn't; Merlin is proposing that f would be taken as the field name.

A more realistic objection goes like this:

create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works

It's not a real good thing if a column addition or renaming can so fundamentally change the nature of a query.

That's exactly how SELECT works. I also dispute that the user should be surprised in such cases.

Well, the reason we have a problem in SELECT is that we support an unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that SELECT foo FROM foo could represent a whole-row selection is nowhere to be found in the SQL standard, for good reason IMO. But we've never had the courage to break cleanly with this Berkeley leftover and insist that people spell it SQL-style as SELECT ROW(foo.*) FROM foo. I'd just as soon not introduce the same kind of ambiguity into UPDATE if we have a reasonable alternative.

Ah, interesting point (I happen to like the terse syntax and use it often). This is for posterity anyway, since you guys seem to like Atri's proposal, which surprised me. However, I think you're oversimplifying things here. Syntax aside, I think SELECT f FROM foo f; and a hypothetical SELECT row(f.*) FROM foo f; give different semantics. The former gives an object of type 'f' and the latter gives type 'row'. To get parity, you'd have to add an extra cast, which means you'd have to play tricky games to avoid extra performance overhead, besides being significantly more verbose. I'm aware some of the other QUELisms are pretty dodgy and have burned us in the past (like that whole function-as-record-reference thing) but pulling a record as a field in SELECT is pretty nice.
It's also widely used and quite useful in json serialization. merlin
Re: [HACKERS] Vitesse DB call for testing
On Fri, Oct 17, 2014 at 10:47 AM, CK Tan ck...@vitessedata.com wrote: Merlin, glad you tried it. We take the query plan exactly as given by the planner and decide whether to JIT or to punt depending on the cost. If we punt, it goes back to the pg executor. If we JIT and could not proceed (usually because some operators haven't been implemented yet), we again punt. Once we are able to generate the code, there is no going back; we call into LLVM to obtain the function entry point and run it to completion. The 3% improvement you see in OLTP tests is definitely noise. The bigint sum,avg,count case in the example you tried has some optimization. We use int128 to accumulate the bigint instead of numeric in pg. Hence the big speedup. Try the same query on int4 for the improvement where both pg and vitessedb are using int4 in the execution. The speedup is really noticeable when the data type is non-varlena. In the varlena cases, we still call into pg routines most of the time. Again, try the sum,avg,count query on numeric, and you will see what I mean. Also, we don't support UDFs at the moment, so all queries involving UDFs get sent to the pg executor. On your question of 32k page size, the rationale is that some of our customers could be interested in a data warehouse on pg. A 32k page size is a big win when all you do is seqscan all day long. We are looking for bug reports at this stage and some stress tests done without our own prejudices. Some tests on real data in a non-prod setting on queries that are highly CPU bound would be ideal.

One thing that I noticed is that when slamming your benchmark query via pgbench, resident memory consumption was really aggressive and would have taken down the server had I not quickly stopped the test. Memory consumption did return to baseline after that, so I figured some type of LLVM memory-management games were going on. This isn't really a problem for most OLAP workloads, but it's something to be aware of.
Via 'perf top' on stock postgres, you see the usual suspects: palloc, hash_search, etc. On your build though, HeapTupleSatisfiesMVCC zooms right to the top of the stack, which is pretty interesting... the executor you've built is very lean and mean for sure. A drop-in optimization engine with little to no schema/SQL changes is pretty neat -- your primary competitor here is going to be column-organized table solutions to OLAP-type problems. merlin
Re: [HACKERS] Support UPDATE table SET(*)=...
On Oct 17, 2014 6:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: A more realistic objection goes like this:

create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works

It's not a real good thing if a column addition or renaming can so fundamentally change the nature of a query.

I think a significant use case for this feature is when you already have a row value and want to persist it in the database, like you can do with INSERT:

insert into foo select * from populate_record_json(null::foo, '{...}');

In this case the opposite is true: requiring explicit column names would break the query if you add columns to the table. The fact that you can't reasonably use populate_record/_json with UPDATE is a significant omission. IMO this really speaks for supporting shorthand whole-row assignment, whatever the syntax. Regards, Marti
[HACKERS] Hash index creation warning
Now that we have the create hash index warning in 9.5, I realized that we don't warn about hash indexes with PITR, only crash recovery and streaming. This patch fixes that. Is the wording "cannot be used" too vague? The CREATE INDEX manual page has the words "give wrong answers to queries", which might be better, but is kind of long for an error message. Suggestions?

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
new file mode 100644 index e469b17..43df32f
*** a/doc/src/sgml/ref/create_index.sgml
--- b/doc/src/sgml/ref/create_index.sgml
*** Indexes: *** 474,480 ****
  Also, changes to hash indexes are not replicated over streaming or
  file-based replication after the initial base backup, so they give
  wrong answers to queries that subsequently use them.
! For these reasons, hash index use is presently discouraged.
  </para> </caution>
--- 474,481 ----
  Also, changes to hash indexes are not replicated over streaming or
  file-based replication after the initial base backup, so they give
  wrong answers to queries that subsequently use them.
! Hash indexes are also not properly restored during point-in-time
! recovery. For these reasons, hash index use is presently discouraged.
  </para> </caution>
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
new file mode 100644 index 8a1cb4b..03833d7
*** a/src/backend/commands/indexcmds.c
--- b/src/backend/commands/indexcmds.c
*** DefineIndex(Oid relationId, *** 491,497 ****
  if (strcmp(accessMethodName, "hash") == 0)
      ereport(WARNING,
!         (errmsg("hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers")));
  if (stmt->unique && !accessMethodForm->amcanunique)
      ereport(ERROR,
--- 491,497 ----
  if (strcmp(accessMethodName, "hash") == 0)
      ereport(WARNING,
!
        (errmsg("hash indexes are not WAL-logged and thus are not crash-safe and cannot be used for point-in-time recovery or on standby servers")));
  if (stmt->unique && !accessMethodForm->amcanunique)
      ereport(ERROR,
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644 index a2bef7a..11325e4
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*** DROP TABLE array_gin_test; *** 2238,2250 ****
  -- HASH --
  CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  CREATE INDEX hash_name_index ON hash_name_heap USING hash (random name_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  -- CREATE INDEX hash_ovfl_index ON hash_ovfl_heap USING hash (x int4_ops);
  --
  -- Test functional index
--- 2238,2250 ----
  -- HASH --
  CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used for point-in-time recovery or on standby servers
  CREATE INDEX hash_name_index ON hash_name_heap USING hash (random name_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used for point-in-time recovery or on standby servers
  CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
!
WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used for point-in-time recovery or on standby servers
  CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used for point-in-time recovery or on standby servers
  -- CREATE INDEX hash_ovfl_index ON hash_ovfl_heap USING hash (x int4_ops);
  --
  -- Test functional index
diff --git a/src/test/regress/expected/enum.out b/src/test/regress/expected/enum.out
new file mode 100644 index fa23b52..47ac5a6
*** a/src/test/regress/expected/enum.out
--- b/src/test/regress/expected/enum.out
*** DROP INDEX enumtest_btree; *** 383,389 ****
  -- Hash index / opclass with the = operator --
  CREATE INDEX enumtest_hash ON enumtest USING hash (col);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  SELECT *
Re: [HACKERS] [PATCH] add ssl_protocols configuration option
Dag-Erling Smørgrav wrote: Michael Paquier michael.paqu...@gmail.com writes: Please note that new features can only be added to the version currently in development, aka 9.5.

I understand this policy. However, this new feature a) has absolutely no impact unless the admin makes a conscious decision to use it and b) will make life much easier for everyone if a POODLE-like vulnerability is discovered in TLS.

I see this as vaguely related to http://www.postgresql.org/message-id/20131114202733.gb7...@eldon.alvh.no-ip.org where we want to have SSL behavior configurable in the back branches due to renegotiation issues: there was talk in that thread about introducing new GUC variables in back branches to control the behavior. The current patch really doesn't add much in the way of features (SSLv3 support and so on already exist in back branches) --- what it does add is a way to control whether these are used.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Hash index creation warning
Bruce Momjian wrote: Now that we have the create hash index warning in 9.5, I realized that we don't warn about hash indexes with PITR, only crash recovery and streaming. This patch fixes that. Is the wording "cannot be used" too vague? The CREATE INDEX manual page has the words "give wrong answers to queries", which might be better, but is kind of long for an error message. Suggestions?

Something like the following is more specific without being more wordy: "hash indexes are not WAL-logged: they are corrupted during recovery and changes do not replicate to standby servers". The question is whether we explain the implications of not being WAL-logged in an error message or simply state the fact and let the documentation explain the hazards - basically just output: "hash indexes are not WAL-logged and their use is discouraged". David J.
Re: [HACKERS] [Segmentation fault] pg_dump binary-upgrade fail for type without element
Rushabh Lathia rushabh.lat...@gmail.com writes: pg_dump binary-upgrade fails with a segmentation fault for a type without an element. Ooops. Looking further into the code, I found that rather than fetching typrelid, we can use the already stored typrelid from the TypeInfo structure. Agreed. Patch committed, thanks! regards, tom lane
Re: [HACKERS] Hash index creation warning
David G Johnston david.g.johns...@gmail.com writes: The question is whether we explain the implications of not being WAL-logged in an error message or simply state the fact and let the documentation explain the hazards - basically just output: hash indexes are not WAL-logged and their use is discouraged

+1. The warning message is not the place to be trying to explain all the details. regards, tom lane
Re: [HACKERS] [PATCH] add ssl_protocols configuration option
Alvaro Herrera alvhe...@2ndquadrant.com writes: Dag-Erling Smørgrav wrote: I understand this policy. However, this new feature a) has absolutely no impact unless the admin makes a conscious decision to use it and b) will make life much easier for everyone if a POODLE-like vulnerability is discovered in TLS. I see this as vaguely related to http://www.postgresql.org/message-id/20131114202733.gb7...@eldon.alvh.no-ip.org where we want to have SSL behavior configurable in the back branches due to renegotiation issues: there was talk in that thread about introducing new GUC variables in back branches to control the behavior. The current patch really doesn't add much in the way of features (SSLv3 support and so on already exist in back branches) --- what it does add is a way to control whether these are used.

This looks to me like re-fighting the last war. Such a GUC has zero value *unless* some situation exactly like the POODLE bug comes up again, and the odds of that are not high. Moreover, the GUC could easily be misused to decrease rather than increase one's security, if it's carelessly set. Remember that we only recently fixed bugs that prevented use of the latest TLS version. If we have a setting like this, I fully anticipate that people will set it to only use TLS 1.2 and then forget that they ever did that; years from now they'll still be using 1.2 when it's deprecated. So I think the argument that this is a useful security contribution is pretty weak; and on the whole we don't need another marginally-useful GUC. regards, tom lane
Re: [HACKERS] Vitesse DB call for testing
CK Tan ck...@vitessedata.com writes: The bigint sum,avg,count case in the example you tried has some optimization. We use int128 to accumulate the bigint instead of numeric in pg. Hence the big speed up. Try the same query on int4 for the improvement where both pg and vitessedb are using int4 in the execution.

Well, that's pretty much cheating: it's too hard to disentangle what's coming from JIT vs what's coming from using a different accumulator datatype. If we wanted to depend on having int128 available we could get that speedup with a couple hours' work. But what exactly are you compiling here? I trust not the actual data accesses; that seems far too complicated to try to inline. regards, tom lane
Fwd: [HACKERS] Vitesse DB call for testing
Hi, Tom, Sorry for the double post to you. Feng

-- Forwarded message -- From: Feng Tian ft...@vitessedata.com Date: Fri, Oct 17, 2014 at 10:29 AM Subject: Re: [HACKERS] Vitesse DB call for testing To: Tom Lane t...@sss.pgh.pa.us

Hi, Tom, I agree that using int128 in stock postgres will speed up things too. On the other hand, that is only one part of the equation. For example, if you look at TPCH Q1, the int128 cheating does not kick in at all, but we are 8x faster.

I am not sure what you mean by actual data access. Data is still in stock postgres format on disk. We indeed JIT-ed all data field accesses (deform tuple). To put things in perspective, I just timed select count(*) and select count(l_orderkey) from tpch1.lineitem; our code is bottlenecked by memory bandwidth and the difference is pretty much invisible. Thanks, Feng

ftian=# set vdb_jit = 0;
SET
Time: 0.155 ms
ftian=# select count(*) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 688.658 ms
ftian=# select count(*) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 690.753 ms
ftian=# select count(l_orderkey) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 819.452 ms
ftian=# set vdb_jit = 1;
SET
Time: 0.167 ms
ftian=# select count(*) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 203.543 ms
ftian=# select count(l_orderkey) from tpch1.lineitem;
  count
---------
 6001215
(1 row)

Time: 202.253 ms
ftian=#

On Fri, Oct 17, 2014 at 10:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: CK Tan ck...@vitessedata.com writes: The bigint sum,avg,count case in the example you tried has some optimization. We use int128 to accumulate the bigint instead of numeric in pg. Hence the big speed up. Try the same query on int4 for the improvement where both pg and vitessedb are using int4 in the execution. Well, that's pretty much cheating: it's too hard to disentangle what's coming from JIT vs what's coming from using a different accumulator datatype.
If we wanted to depend on having int128 available we could get that speedup with a couple hours' work. But what exactly are you compiling here? I trust not the actual data accesses; that seems far too complicated to try to inline. regards, tom lane
Re: [HACKERS] Vitesse DB call for testing
CK, Before we go any further on this, how is Vitesse currently licensed? Last time we talked it was still proprietary. If it's not being open-sourced, we likely need to take discussion off this list. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.9
On 2014-10-17 17:14:16 +0530, Amit Kapila wrote: On Tue, Oct 14, 2014 at 11:34 AM, Amit Kapila amit.kapil...@gmail.com wrote: I am not sure why we are seeing a difference even though running on the same m/c with the same configuration. I have tried many times, but I could not get the numbers you have posted above with HEAD; however, now trying with the latest version [1] posted by you, everything seems to be fine at this workload. The data at higher client count is as below:

I'll try to reproduce it next week. But I don't think it matters all that much. Personally, so far the performance numbers don't seem to indicate much reason to wait any further. We can surely improve further, but I don't see much reason to wait because of that.

HEAD – commit 494affb
Shared_buffers=8GB; Scale Factor = 3000

Client Count / No. of Runs (tps)    64        128
Run-1                               271799    24
Run-2                               274341    245207
Run-3                               275019    252258

HEAD – commit 494affb + wait free lw_shared_v2
Shared_buffers=8GB; Scale Factor = 3000

Client Count / No. of Runs (tps)    64        128
Run-1                               286209    274922
Run-2                               289101    274495
Run-3                               289639    273633

So here the results with LW_SHARED were consistently better, right? You saw performance degradations here earlier?

So I am planning to proceed further with the review/test of your latest patch. According to me, below things are left from my side: a. do some basic tpc-b tests with patch b. re-review latest version posted by you

Cool!

I know that you have posted an optimization into StrategyGetBuffer() in this thread; however, I feel we can evaluate it separately unless you are of the opinion that both the patches should go together. [1] http://www.postgresql.org/message-id/20141010111027.gc6...@alap3.anarazel.de

No, I don't think they should go together - I wrote that patch because it was the bottleneck in the possibly regressing test and I wanted to see the full effect.
Although I do think we should apply it ;) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Vitesse DB call for testing
On Fri, Oct 17, 2014 at 11:08 AM, Feng Tian ft...@vitessedata.com wrote: I agree that using int128 in stock postgres will speed up things too. On the other hand, that is only one part of the equation. For example, if you look at TPCH Q1, the int128 cheating does not kick in at all, but we are 8x faster.

I'm curious about how the numbers look when stock Postgres is built with the same page size as your fork. You didn't mention whether or not your Postgres numbers came from a standard build. -- Peter Geoghegan
Re: [HACKERS] Vitesse DB call for testing
On 2014-10-17 13:12:27 -0400, Tom Lane wrote: Well, that's pretty much cheating: it's too hard to disentangle what's coming from JIT vs what's coming from using a different accumulator datatype. If we wanted to depend on having int128 available we could get that speedup with a couple hours' work.

I think doing that when configure detects int128 would make a great deal of sense. It's not like we'd save a great deal of complicated code by removing the existing accumulator... We'd still have to return a numeric, but that's likely lost in the noise cost-wise. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Vitesse DB call for testing
On Fri, Oct 17, 2014 at 1:21 PM, Peter Geoghegan p...@heroku.com wrote: On Fri, Oct 17, 2014 at 11:08 AM, Feng Tian ft...@vitessedata.com wrote: I agree that using int128 in stock postgres will speed up things too. On the other hand, that is only one part of the equation. For example, if you look at TPCH Q1, the int128 cheating does not kick in at all, but we are 8x faster. I'm curious about how the numbers look when stock Postgres is built with the same page size as your fork. You didn't mention whether or not your Postgres numbers came from a standard build.

I downloaded the 8kb variant.

vitesse (median of 3):

postgres=# select count(*), sum(i*i), avg(i) from t;
 count │  sum   │   avg
───────┼────────┼─────────
   100 │ 338350 │ 50.5000
(1 row)

Time: 39.197 ms

stock (median of 3):

postgres=# select count(*), sum(i*i), avg(i) from t;
 count │  sum   │   avg
───────┼────────┼─────────
   100 │ 338350 │ 50.5000
(1 row)

Time: 667.362 ms

(stock int4 ops)
postgres=# select sum(1::int4) from t;
 sum
─────
 100
(1 row)

Time: 75.265 ms

What I'm wondering is how complex the hooks are that tie the technology in. Unless a BSD-licensed patch materializes, the conversation (beyond the initial wow! factor) should probably focus on possible integration points and/or implementation of the technology into core in a general way. merlin
Re: [HACKERS] Vitesse DB call for testing
Andres Freund and...@2ndquadrant.com writes: On 2014-10-17 13:12:27 -0400, Tom Lane wrote: Well, that's pretty much cheating: it's too hard to disentangle what's coming from JIT vs what's coming from using a different accumulator datatype. If we wanted to depend on having int128 available we could get that speedup with a couple hours' work. I think doing that when configure detects int128 would make a great deal of sense.

Yeah, I was wondering about that myself: use int128 if available, else fall back on existing code path. regards, tom lane
Re: [HACKERS] Hash index creation warning
On Fri, Oct 17, 2014 at 12:56:52PM -0400, Tom Lane wrote: David G Johnston david.g.johns...@gmail.com writes: The question is whether we explain the implications of not being WAL-logged in an error message or simply state the fact and let the documentation explain the hazards - basically just output: hash indexes are not WAL-logged and their use is discouraged +1. The warning message is not the place to be trying to explain all the details.

OK, updated patch attached.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
new file mode 100644 index e469b17..43df32f
*** a/doc/src/sgml/ref/create_index.sgml
--- b/doc/src/sgml/ref/create_index.sgml
*** Indexes: *** 474,480 ****
  Also, changes to hash indexes are not replicated over streaming or
  file-based replication after the initial base backup, so they give
  wrong answers to queries that subsequently use them.
! For these reasons, hash index use is presently discouraged.
  </para> </caution>
--- 474,481 ----
  Also, changes to hash indexes are not replicated over streaming or
  file-based replication after the initial base backup, so they give
  wrong answers to queries that subsequently use them.
! Hash indexes are also not properly restored during point-in-time
! recovery. For these reasons, hash index use is presently discouraged.
  </para> </caution>
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
new file mode 100644 index 8a1cb4b..3c1e90e
*** a/src/backend/commands/indexcmds.c
--- b/src/backend/commands/indexcmds.c
*** DefineIndex(Oid relationId, *** 491,497 ****
  if (strcmp(accessMethodName, "hash") == 0)
      ereport(WARNING,
!         (errmsg("hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers")));
  if (stmt->unique && !accessMethodForm->amcanunique)
      ereport(ERROR,
--- 491,497 ----
  if (strcmp(accessMethodName, "hash") == 0)
      ereport(WARNING,
!
        (errmsg("hash indexes are not WAL-logged and their use is discouraged")));
  if (stmt->unique && !accessMethodForm->amcanunique)
      ereport(ERROR,
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644 index a2bef7a..8326e94
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*** DROP TABLE array_gin_test; *** 2238,2250 ****
  -- HASH --
  CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  CREATE INDEX hash_name_index ON hash_name_heap USING hash (random name_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  -- CREATE INDEX hash_ovfl_index ON hash_ovfl_heap USING hash (x int4_ops);
  --
  -- Test functional index
--- 2238,2250 ----
  -- HASH --
  CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
! WARNING: hash indexes are not WAL-logged and their use is discouraged
  CREATE INDEX hash_name_index ON hash_name_heap USING hash (random name_ops);
! WARNING: hash indexes are not WAL-logged and their use is discouraged
  CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
! WARNING: hash indexes are not WAL-logged and their use is discouraged
  CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
!
WARNING: hash indexes are not WAL-logged and their use is discouraged
  -- CREATE INDEX hash_ovfl_index ON hash_ovfl_heap USING hash (x int4_ops);
  --
  -- Test functional index
diff --git a/src/test/regress/expected/enum.out b/src/test/regress/expected/enum.out
new file mode 100644 index fa23b52..1a61a5b
*** a/src/test/regress/expected/enum.out
--- b/src/test/regress/expected/enum.out
*** DROP INDEX enumtest_btree; *** 383,389 ****
  -- Hash index / opclass with the = operator --
  CREATE INDEX enumtest_hash ON enumtest USING hash (col);
! WARNING: hash indexes are not WAL-logged and thus are not crash-safe and cannot be used on standby servers
  SELECT * FROM enumtest WHERE col = 'orange'; col
--- 383,389 ----
  -- Hash index / opclass with the = operator --
  CREATE INDEX enumtest_hash ON enumtest USING hash (col);
! WARNING: hash indexes are
Re: [HACKERS] Vitesse DB call for testing
Happy to contribute to that decision :-) On Fri, Oct 17, 2014 at 11:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-10-17 13:12:27 -0400, Tom Lane wrote: Well, that's pretty much cheating: it's too hard to disentangle what's coming from JIT vs what's coming from using a different accumulator datatype. If we wanted to depend on having int128 available we could get that speedup with a couple hours' work. I think doing that when configure detects int128 would make a great deal of sense. Yeah, I was wondering about that myself: use int128 if available, else fall back on existing code path. regards, tom lane
Re: [HACKERS] WIP: dynahash replacement for buffer table
On 2014-10-16 20:22:24 -0400, Robert Haas wrote: On Thu, Oct 16, 2014 at 6:53 PM, Andres Freund and...@2ndquadrant.com wrote: When using shared_buffers = 96GB there's a performance benefit, but not huge:

master (f630b0dd5ea2de52972d456f5978a012436115e): 153621.8
master + LW_SHARED + lockless StrategyGetBuffer(): 477118.4
master + LW_SHARED + lockless StrategyGetBuffer() + chash: 496788.6
master + LW_SHARED + lockless StrategyGetBuffer() + chash-nomb: 499562.7

But with shared_buffers = 16GB:

master (f630b0dd5ea2de52972d456f5978a012436115e): 177302.9
master + LW_SHARED + lockless StrategyGetBuffer(): 206172.4
master + LW_SHARED + lockless StrategyGetBuffer() + chash: 413344.1
master + LW_SHARED + lockless StrategyGetBuffer() + chash-nomb: 426405.8

Very interesting. This doesn't show that chash is the right solution, but it definitely shows that doing nothing is the wrong solution.

Absolutely. The thing worrying me most (but not all that much on an absolute scale) about chash is that lots of the solutions to memory management it builds are specific to it... And generalizing afterwards will be hard, because we'll have to prove that that general solution is as performant as the special-case code...

It shows that, even with the recent bump to 128 lock manager partitions, and LW_SHARED on top of that, workloads that actually update the buffer mapping table still produce a lot of contention there.

FWIW, I couldn't see much of a benefit without LW_SHARED even though I tried a *few* things. The bottleneck simply is completely elsewhere.

This hasn't been obvious to me from profiling, but the numbers above make it pretty clear. So I'm not super surprised about that. There very well might be cases where it's a bad bottleneck before, but I've not seen them.

It also seems to suggest that trying to get rid of the memory barriers isn't a very useful optimization project.

I'm not yet convinced of that.
Yes, it's not showing up hugely in the numbers here, but that's simply because the workload is again completely dominated by the kernel copying data for the read()s, GetSnapshotData(), the buffer mapping cache misses and a few other familiar friends. We might get a couple of percent out of it, but it's pretty small potatoes, so unless it can be done more easily than I suspect, it's probably not worth bothering with. I still think that moving the barrier to the reading side would be simple (implementation-wise) and correct for x86. If you think about it, otherwise our spinlock implementation for x86 would be broken. As an unlock is just a compiler barrier, the full barrier on acquire had better be a full synchronization point. Am I missing something? An approach I think might have more promise is to have bufmgr.c call the CHash stuff directly instead of going through buf_table.c. I don't see all that much point in buf_table.c currently - on the other hand it has made replacing the buffer mapping slightly easier... Maybe it should just go to a header... Right now, for example, BufferAlloc() creates and initializes a BufferTag and passes a pointer to that buffer tag to BufTableLookup, which copies it into a BufferLookupEnt. But it would be just as easy for BufferAlloc() to put the BufferLookupEnt on its own stack, and then you wouldn't need to copy the data an extra time. Now a 20-byte copy isn't a lot, but it's completely unnecessary and looks easy to get rid of. Worthwhile to try. I had to play with setting max_connections+1 sometimes to get halfway comparable results for master - unaligned data otherwise causes weird results. Without doing that the performance gap between master 96/16G was even bigger. We really need to fix that... This is pretty awesome. Thanks. I wasn't quite sure how to test this or where the workloads that it would benefit would be found, so I appreciate you putting time into it.
And I'm really glad to hear that it delivers good results. I wasn't sure either ;). Lucky that it played out so impressively. After the first results I was nearly ready to send out a 'Meh.' type of message ;) Btw, CHashTableGetNode isn't exactly cheap. It shows up noticeably in profiles. It results in several non-pipelineable instructions in an already penalized section of the code... Replacing arena_stride by a constant provided measurable improvements (no imul is required anymore, instead you get shr;lea or so). I'm not sure how to deal with that. If it shows up even on my quite new laptop, it'll be even more noticeable on older x86 platforms. I think it would be useful to plumb the chash statistics into the stats collector or at least a debugging dump of some kind for testing. I'm not sure it's solid enough at this point to be in the stats collector. But I surely would like to access it somehow. I'm e.g. absolutely not sure that your loadfactor is good and it'd be
Re: [HACKERS] documentation update for doc/src/sgml/func.sgml
On 09/14/2014 06:32 PM, Peter Eisentraut wrote: On 9/12/14 3:13 PM, Andreas 'ads' Scherbaum wrote: Of course a general rule how to link to WP would be nice ... I think Wikipedia links should be avoided altogether. We can assume that readers are technically proficient enough to look up general technical concepts on their own using a reference system of their choice. In cases where a link is warranted, it is better to construct a proper bibliographic citation to the primary source material, such as an IEEE standard or an academic paper, in a way that will stand the test of time. That's a clear statement, and makes sense. Should be written down somewhere, so it can be found again. Independent of that, it is actually not correct that we use the IEEE's rules, because we don't use any rules; that is up to the operating system/platform. While most platforms indeed do use the IEEE floating-point standard more or less, some don't. Section 8.1.3 tries to point that out. New version attached, WP link removed, wording changed. Regards, -- Andreas 'ads' Scherbaum German PostgreSQL User Group European PostgreSQL User Group - Board of Directors Volunteer Regional Contact, Germany - PostgreSQL Project

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 13c71af..d54cf58 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -924,6 +924,25 @@
  </tgroup>
 </table>
+
+ <para>
+For functions like round(), log() and sqrt() which run against
+either fixed-precision (NUMERIC) or floating-point numbers (e.g. REAL),
+note that the results of these operations will differ according
+to the input type due to rounding. This is most observable with
+round(), which can end up rounding down as well as up for
+any #.5 value. <productname>PostgreSQL</productname>'s handling
+of floating-point values depends on the operating system, which
+may or may not follow the IEEE floating-point standard.
+ </para>
+
+ <para>
+The bitwise operators work only on integral data types, whereas
+the others are available for all numeric data types. The bitwise
+operators are also available for the bit string types
+<type>bit</type> and <type>bit varying</type>, as
+shown in <xref linkend="functions-bit-string-op-table">.
+ </para>
+
 <para>
  <xref linkend="functions-math-random-table"> shows functions for generating random numbers.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 2014-10 CommitFest
Kevin Grittner kgri...@ymail.com wrote: Unless someone else wants to pick it up, I'll manage this one. Since there was no previous warning, I'll allow a grace day for the cut-off on submissions for this CF; if it isn't registered in the web application 24 hours after this email, I will move it to the next CF. So look for those patches that fell through the cracks without getting registered, and post the ones you've got. [ rings bell ] So the 2014-10 CF is now In Progress and the 2014-12 CF is Open. There are 86 patches listed at this time: Needs Review: 69 Waiting on Author: 5 Ready for Committer: 10 Committed: 2 We've got four weeks if we're going to return or commit all of these by the scheduled end of the CF. Since many of these have already had significant review, my hope is that we can hit the ground running and wrap things up in a timely fashion so we have a break before the start of the *next* CF. Please remember, if you are submitting patches, you should be reviewing patches, too. Otherwise development will hit a logjam. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgcrypto: PGP signatures
On Mon, Sep 15, 2014 at 4:37 AM, Marko Tiikkaja ma...@joh.to wrote: I've changed the patch back to ignore signatures when not using the decrypt_verify() functions in the attached. Hi Marko, This patch needs a rebase now that the armor header patch has been committed. Thanks, Jeff
[HACKERS] json function volatility
Following up something Pavel wrote, I notice that json_agg() and json_object_agg() are both marked as immutable, even though they invoke IO functions, while json_object is marked stable, even though it does not, and can probably be marked as immutable. Mea maxima culpa. I'm not sure what we should do about these things now. Is it a tragedy if we let these escape into the 9.4 release that way? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views don't show up in information_schema
On 10/16/14 9:45 AM, Stephen Frost wrote: Alright, coming back to this, I have to ask- how are matviews different from views from the SQL standard's perspective? I tried looking through the standard to figure it out (and I admit that I probably missed something), but the only thing appears to be a statement in the standard that (paraphrased) functions are run when the view is queried, and that strikes me as a relatively minor point. To me, the main criterion is that you cannot DROP VIEW a materialized view. Generally, if the information schema claims that a view/table/function/etc. named foo exists, then I should be able to operate on foo using the basic operations for a view/table/function/etc. of that name. I think DROP VIEW is a basic operation for a view. Others might disagree. More subtly, if we claim that a materialized view is a view, then we cannot have asynchronously updated materialized views, because then we have different semantics. All of this is a judgement call in corner cases. But I don't think this is a corner case at all. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json function volatility
On Fri, Oct 17, 2014 at 3:03 PM, Andrew Dunstan and...@dunslane.net wrote: Following up something Pavel wrote, I notice that json_agg() and json_object_agg() are both marked as immutable, even though they invoke IO functions, while json_object is marked stable, even though it does not, and can probably be marked as immutable. Mea maxima culpa. I'm not sure what we should do about these things now. Is it a tragedy if we let these escape into the 9.4 release that way? Is it too late to change them? Either way, it seems fairly implausible someone would come up with a case to stick json_agg(), or any aggregate function really, inside of an index. So it's not exactly the crime of the century. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Issue with mkdtemp() in port.h
A little while back some users started complaining that the contrib module I develop (MADlib) was failing to build with the following error:

--
/usr/include/postgresql/9.2/server/port.h:480:32: error: declaration of 'char* mkdtemp(char*)' has a different exception specifier
/usr/include/stdlib.h:663:14: error: from previous declaration 'char* mkdtemp(char*) throw ()'
--

After some research I've tracked this down to the following commit from ~4 months ago:

--
https://github.com/postgres/postgres/commit/a919937f112eb2f548d5f9bd1b3a7298375e6380
--

Which added a definition of mkdtemp into port.h that conflicts with the definition in the system header files. The following is a simple program that demonstrates the issue:

--
bash$ cat /tmp/foo.cpp
#include <postgres.h>
int main() { return 0; }
bash$ gcc -o foo foo.cpp -I`pg_config --includedir-server` -pedantic
In file included from /usr/pgsql-9.2/include/server/c.h:860,
                 from /usr/pgsql-9.2/include/server/postgres.h:47,
                 from foo.cpp:1:
/usr/pgsql-9.2/include/server/port.h:479: error: declaration of ‘char* mkdtemp(char*)’ throws different exceptions
/usr/include/stdlib.h:663: error: from previous declaration ‘char* mkdtemp(char*) throw ()’
--

Reproducible on ubuntu 14.04, centos6, and likely others. Regards, Caleb
[HACKERS] Optimizer on sort aggregate
Hi, Consider the following queries.

create table t(i int, j int, k int, t text);
insert into t select i, i % 100, i % 1000, 'AABBCCDD' || i from generate_series(1, 100) i;

ftian=# explain select count(distinct j) from t group by t, i;
                              QUERY PLAN
----------------------------------------------------------------------
 GroupAggregate  (cost=158029.84..178029.84 rows=100 width=22)
   ->  Sort  (cost=158029.84..160529.84 rows=100 width=22)
         Sort Key: t, i
         ->  Seq Scan on t  (cost=0.00..17352.00 rows=100 width=22)
(4 rows)

The query, select count(distinct j) from t group by t, i; runs for 35 seconds. However, if I change the query to,

select count(distinct j) from t group by i, t;  -- note switching the ordering
select count(distinct j) from t group by decode(t, 'escape'), i;  -- convert t to bytea

run times are just about 5 and 6.5 seconds. The reason is clear: comparing a string with collation is slow, which is well understood by pg hackers. However, here, the sorting order is forced by the planner, not the user. The planner can do the following optimizations: 1. for the sort we generated for sort agg, the planner can switch column ordering, putting the int before the string; 2. for the sort we generated for sort agg, use bytea compare instead of string compare. They would bring a big improvement to this common query. Is this something reasonable? Thanks,
Re: [HACKERS] json function volatility
On Fri, Oct 17, 2014 at 1:44 PM, Merlin Moncure mmonc...@gmail.com wrote: Is it too late to change them? Either way, it seems fairly implausible someone would come up with a case to stick json_agg(), or any aggregate function really, inside of an index. So it's not exactly the crime of the century. Indexes reject aggregates outright as acceptable expressions for indexing. That's not the only issue, of course. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json function volatility
Merlin Moncure wrote: On Fri, Oct 17, 2014 at 3:03 PM, Andrew Dunstan and...@dunslane.net wrote: Following up something Pavel wrote, I notice that json_agg() and json_object_agg() are both marked as immutable, even though they invoke IO functions, while json_object is marked stable, even though it does not, and can probably be marked as immutable. Mea maxima culpa. I'm not sure what we should do about these things now. Is it a tragedy if we let these escape into the 9.4 release that way? Is it too late to change them? One thing to consider is the catversion bump, which we don't want this late in the cycle. Still, you could change the catalogs but not the version, and advise those with the older definitions to tweak the catalogs by hand if they need it. I think we did this once. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views don't show up in information_schema
On 10/17/14, 4:31 AM, David G Johnston wrote: Since the standard doesn't distinguish between read and write aspects of the object types there isn't a safe way to add matviews to the information schema that doesn't violate the intent of the provided view. If the application/users wants to support/use PostgreSQL specific features it/they have to be ready and able to use the catalog. +1. If we add matviews to information_schema while they're not part of that standard then we're going to regret it at some point. Perhaps the answer to this problem is to restart the old pg_newsysviews project. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json function volatility
Alvaro Herrera alvhe...@2ndquadrant.com writes: Merlin Moncure wrote: On Fri, Oct 17, 2014 at 3:03 PM, Andrew Dunstan and...@dunslane.net wrote: Following up something Pavel wrote, I notice that json_agg() and json_object_agg() are both marked as immutable, even though they invoke IO functions, while json_object is marked stable, even though it does not, and can probably be marked as immutable. Mea maxima culpa. I'm not sure what we should do about these things now. Is it a tragedy if we let these escape into the 9.4 release that way? Is it too late to change them? One thing to consider is the catversion bump, which we don't want this late in the cycle. Still, you could change the catalogs but not the version, and advise those with the older definitions to tweak the catalogs by hand if they need it. I think we did this once. I'm fairly sure that the system doesn't actually pay attention to the volatility marking of aggregates, so there's no huge harm done by the incorrect markings of those. The incorrect marking of json_object might be a small optimization block. +1 for changing these in 9.4 without bumping catversion. I don't think we need to give advice for manual corrections, either: the risk of doing that wrong probably outweighs the value of fixing it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
Hi, This week we had one of the most annoying problems I've ever encountered with postgres. We had a big index on multiple columns, say, foo(a, b, c). According to pg_stat_all_indexes the index was being used *all the time*. However, after looking into our queries more closely, it turns out that it was only being used to look up statistics for the foo.a column to estimate merge scan viability during planning. But this took hours for two people to track down. So what I'd like to have is a way to be able to distinguish between indexes being used to answer queries, and ones being only used for stats lookups during planning. Perhaps the easiest way would be adding a new column or two into pg_stat_all_indexes, which we would increment in get_actual_variable_range() when fetching data. Any thoughts? .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
Marko Tiikkaja ma...@joh.to writes: This week we had one of the most annoying problems I've ever encountered with postgres. We had a big index on multiple columns, say, foo(a, b, c). According to pg_stat_all_indexes the index was being used *all the time*. However, after looking into our queries more closely, it turns out that it was only being used to look up statistics for the foo.a column to estimate merge scan viability during planning. But this took hours for two people to track down. So what I'd like to have is a way to be able to distinguish between indexes being used to answer queries, and ones being only used for stats lookups during planning. Why? Used is used. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/17/14, 11:47 PM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: This week we had one of the most annoying problems I've ever encountered with postgres. We had a big index on multiple columns, say, foo(a, b, c). According to pg_stat_all_indexes the index was being used *all the time*. However, after looking into our queries more closely, it turns out that it was only being used to look up statistics for the foo.a column to estimate merge scan viability during planning. But this took hours for two people to track down. So what I'd like to have is a way to be able to distinguish between indexes being used to answer queries, and ones being only used for stats lookups during planning. Why? Used is used. Because I don't need a 30GB index on foo(a,b,c) to look up statistics. If I ever have a problem, I can replace it with a 5GB one on foo(a). .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
Marko Tiikkaja ma...@joh.to writes: On 10/17/14, 11:47 PM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: So what I'd like to have is a way to be able to distinguish between indexes being used to answer queries, and ones being only used for stats lookups during planning. Why? Used is used. Because I don't need a 30GB index on foo(a,b,c) to look up statistics. If I ever have a problem, I can replace it with a 5GB one on foo(a). Well, the index might've been getting used in queries too in a way that really only involved the first column. I think you're solving the wrong problem here. The right problem is how to identify indexes that are being used in a way that doesn't exploit all the columns. Which is not necessarily wrong in itself --- what you'd want is to figure out when the last column(s) are *never* used. The existing stats aren't terribly helpful for that, I agree. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/17/14, 4:49 PM, Marko Tiikkaja wrote: On 10/17/14, 11:47 PM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: This week we had one of the most annoying problems I've ever encountered with postgres. We had a big index on multiple columns, say, foo(a, b, c). According to pg_stat_all_indexes the index was being used *all the time*. However, after looking into our queries more closely, it turns out that it was only being used to look up statistics for the foo.a column to estimate merge scan viability during planning. But this took hours for two people to track down. So what I'd like to have is a way to be able to distinguish between indexes being used to answer queries, and ones being only used for stats lookups during planning. Why? Used is used. Because I don't need a 30GB index on foo(a,b,c) to look up statistics. If I ever have a problem, I can replace it with a 5GB one on foo(a). That problem can exist with user queries too. Perhaps it would be better to find a way to count scans that didn't use all the fields in the index. I do also see value in differentiating planning use from real query processing; not doing that can certainly cause confusion. What I don't know is if the added stats bloat is worth it. If we do go down that road, I think it'd be better to add an indicator to EState. Aside from allowing stats for all planning access, it should make it less likely that someone adds a new access path and forgets to mark it as internal (especially if the added field defaults to an invalid value). -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issue with mkdtemp() in port.h
Caleb Welton cwel...@pivotal.io writes: A little while back some users started complaining that the contrib module I develop (MADlib) was failing to build with the following error: /usr/include/postgresql/9.2/server/port.h:480:32: error: declaration of 'char* mkdtemp(char*)' has a different exception specifier /usr/include/stdlib.h:663:14: error: from previous declaration 'char* mkdtemp(char*) throw ()' After some research I've tracked this down to the following commit from ~4 months ago: https://github.com/postgres/postgres/commit/a919937f112eb2f548d5f9bd1b3a7298375e6380 Hm. Looks like the extern that added should have been protected by #ifndef HAVE_MKDTEMP similarly to the other cases where we conditionally provide a substitute function. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/17/14, 11:59 PM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: On 10/17/14, 11:47 PM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: So what I'd like to have is a way to be able to distinguish between indexes being used to answer queries, and ones being only used for stats lookups during planning. Why? Used is used. Because I don't need a 30GB index on foo(a,b,c) to look up statistics. If I ever have a problem, I can replace it with a 5GB one on foo(a). Well, the index might've been getting used in queries too in a way that really only involved the first column. I think you're solving the wrong problem here. The right problem is how to identify indexes that are being used in a way that doesn't exploit all the columns. I'm not sure I agree with that. Even if there was some information the planner could have extracted out of the index by using all columns (thus appearing fully used in these hypothetical new statistics), I still would've wanted the index gone. But in this particular case, an index on foo(a) alone was not selective enough and it would have been a bad choice for practically every query, so I'm not sure what good those statistics were in the first place. I think there's a big difference between this index was used to look up stuff for planning and this index was used to answer queries quickly. In my mind the first one belongs to the category this index was considered, and the latter is this index was actually useful. But maybe I'm not seeing the big picture here. .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Trailing comma support in SELECT statements
On 10/16/14, 11:48 PM, David Johnston wrote: We might as well allow a final trailing (or initial leading) comma on a values list at the same time: <snip> do you know, so this feature is proprietary and it is not based on ANSI/SQL? Any user that uses this feature and wants to port to another database will hate it. I've got no complaint if at the same time means that neither behavior is ever implemented... As I originally posted, if we're going to do this I think we should do it *EVERYWHERE* commas are used as delimiters, save COPY input and output. Or we should at least get close to doing it everywhere. I think the only way things could get more annoying is if we accepted extra commas in SELECT but not in CREATE TABLE (as one example). To me completeness is more important than whether we do it or not; that said, I like the idea (as well as supporting leading extra commas). -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
Marko Tiikkaja ma...@joh.to writes: On 10/17/14, 11:59 PM, Tom Lane wrote: Well, the index might've been getting used in queries too in a way that really only involved the first column. I think you're solving the wrong problem here. The right problem is how to identify indexes that are being used in a way that doesn't exploit all the columns. I'm not sure I agree with that. Even if there was some information the planner could have extracted out of the index by using all columns (thus appearing fully used in these hypothetical new statistics), I still would've wanted the index gone. But in this particular case, an index on foo(a) alone was not selective enough and it would have been a bad choice for practically every query, so I'm not sure what good those statistics were in the first place. Those stats were perfectly valid: what the planner is looking for is accurate minimum and maximum values for the index's leading column, and that's what it got. You're correct that a narrower index could have given the same results with a smaller disk footprint, but the planner got the results it needed from the index you provided for it to work with. I think there's a big difference between this index was used to look up stuff for planning and this index was used to answer queries quickly. I think that's utter nonsense. Even if there were any validity to the position, it wouldn't be enough to justify doubling the stats footprint in order to track system-driven accesses separately from query-driven ones. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Trailing comma support in SELECT statements
Jim Nasby jim.na...@bluetreble.com writes: As I originally posted, if we're going to do this I think we should do it *EVERYWHERE* commas are used as delimiters, save COPY input and output. Or we should at least get close to doing it everywhere. I think the only way things could get more annoying is if we accepted extra commas in SELECT but not in CREATE TABLE (as one example). To me completeness is more important than whether we do it or not; that said, I like the idea (as well as supporting leading extra commas). Yeah, exactly. Personally I'm *not* for this, but if we do it we should do it consistently: every comma-separated list in the SQL syntax should work the same. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Superuser connect during smart shutdown
On 10/16/14, 11:46 PM, David G Johnston wrote: Tom Lane-2 wrote Something else mentioned was that once you start a smart shutdown you have no good way (other than limited ps output) to see what the shutdown is waiting on. I'd like to have some way to get back into the database to see what's going on. Perhaps we could allow superusers to connect while waiting for shutdown. I think this idea is going to founder on the fact that the postmaster has no way to tell whether an incoming connection is for a superuser. You don't find that out until you've connected to a database and run a transaction (so you can read pg_authid). And by that point, you've already had a catastrophic impact on any attempt to shut things down. This quote from the documentation seems suspect in light of your comment... While backup mode is active, new connections will still be allowed, but only to superusers (this exception allows a superuser to connect to terminate online backup mode). http://www.postgresql.org/docs/9.3/interactive/server-shutdown.html check_hba() does

    if (!check_role(port->user_name, roleid, hba->roles))
        continue;

And check_role(char **newval, void **extra, GucSource source) does

    is_superuser = ((Form_pg_authid) GETSTRUCT(roleTup))->rolsuper;
    ...
    myextra->roleid = roleid;
    myextra->is_superuser = is_superuser;
    *extra = (void *) myextra;

So presumably with some changes to how we're calling check_role() we could determine if port->user_name is a superuser. I also like the idea of specifying that a connection should be terminated by a smart shutdown; I agree that'd be useful for monitoring tools and what-not. If folks agree with that I can take a stab at implementing it. Since I tend to be paranoid, I like smart being the default, but it seems I'm in the minority there. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! 
http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow format 0000-0000-0000 in postgresql MAC parser
It has been registered now (https://commitfest.postgresql.org/action/patch_view?id=1585). I've got an updated version of the patch with the documentation fix. Looks like the patch is all good. I'm marking it as ready for committer. On a side note, I'm noticing from http://en.wikipedia.org/wiki/MAC_address that there are three numbering namespaces for MAC: MAC-48, EUI-48 and EUI-64. The last one is 64 bits long (8 bytes). Currently PostgreSQL's macaddr is only 6 bytes long. Should we change it to 8 bytes (not in this patch, of course)? Regards, -- Ali Akbar
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/18/14, 12:15 AM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: I think there's a big difference between "this index was used to look up stuff for planning" and "this index was used to answer queries quickly". I think that's utter nonsense. Well, you probably know a bit more about the optimizer than I do. But I can't see a case where the stats provided by the index would be useful for choosing between two (or more) plans that don't use the index in the actual query. If you're saying that there are such cases, then clearly I don't know something, and my thinking is wrong here. .marko
Re: [HACKERS] Optimizer on sort aggregate
The query, select count(distinct j) from t group by t, i; runs for 35 seconds. However, if I change the query to, select count(distinct j) from t group by i, t; -- note switching the ordering select count(distinct j) from t group by decode(t, 'escape'), i; -- convert t to bytea Run times are just about 5 and 6.5 seconds. The reason is clear, compare a string with collation is slow, which is well understood by pg hackers. However, here, the sorting order is forced by the planner, not user. Planner can do the following optimizations, Interesting. I got following result: test=# explain analyze select count(distinct j) from t group by t, i; QUERY PLAN -- GroupAggregate (cost=137519.84..157519.84 rows=100 width=22) (actual time=1332.937..2431.238 rows=100 loops=1) Group Key: t, i -> Sort (cost=137519.84..140019.84 rows=100 width=22) (actual time=1332.922..1507.413 rows=100 loops=1) Sort Key: t, i Sort Method: external merge Disk: 33232kB -> Seq Scan on t (cost=0.00..17352.00 rows=100 width=22) (actual time=0.006..131.406 rows=100 loops=1) Planning time: 0.031 ms Execution time: 2484.271 ms (8 rows) Time: 2484.520 ms test=# explain analyze select count(distinct j) from t group by i, t; QUERY PLAN -- GroupAggregate (cost=137519.84..157519.84 rows=100 width=22) (actual time=602.510..1632.087 rows=100 loops=1) Group Key: i, t -> Sort (cost=137519.84..140019.84 rows=100 width=22) (actual time=602.493..703.274 rows=100 loops=1) Sort Key: i, t Sort Method: external sort Disk: 33240kB -> Seq Scan on t (cost=0.00..17352.00 rows=100 width=22) (actual time=0.014..129.213 rows=100 loops=1) Planning time: 0.176 ms Execution time: 1685.575 ms (8 rows) Time: 1687.641 ms Not so big difference here (maybe because I use SSD) but there is still about 50% difference in execution time. Note that I disable locale support. 1. for the sort we generated for sort agg, planner can switch column ordering, put int before string, 2. 
for the sort we generated for sort agg, use bytea compare instead of string compare. They will bring big improvement to this common query. Is this something reasonable? Not having looked at the sort and planner code closely, I guess the planner does not recognize the cost difference between external merge and external sort, because the cost estimates for the sort step in each plan are exactly the same (137519.84..140019.84). However, I am not sure whether we should fix the planner or our sort machinery, since arguably the sort machinery should not expose such a big difference between the two sort methods. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
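The collation-cost point in the thread can be made concrete without PostgreSQL at all. This is a toy Python sketch, not planner or tuplesort code: it counts how often the "expensive" text comparison runs when a distinct int column is placed before versus after the text column in the sort key, mimicking GROUP BY i, t versus GROUP BY t, i. The CountedStr class is invented purely for the illustration.

```python
import random

class CountedStr(str):
    """A str that counts comparisons, standing in for a collation-aware
    (and therefore expensive) text comparison."""
    count = 0
    def __eq__(self, other):
        CountedStr.count += 1
        return str.__eq__(self, other)
    def __lt__(self, other):
        CountedStr.count += 1
        return str.__lt__(self, other)
    __hash__ = str.__hash__

random.seed(42)
rows = [(i, CountedStr('AABBCCDD%d' % i)) for i in range(1000)]
random.shuffle(rows)

# Sort with the text column first, as in GROUP BY t, i:
CountedStr.count = 0
sorted(rows, key=lambda r: (r[1], r[0]))
text_first = CountedStr.count

# Sort with the int column first, as in GROUP BY i, t:
CountedStr.count = 0
sorted(rows, key=lambda r: (r[0], r[1]))
int_first = CountedStr.count

# Because the int column here is fully distinct, it settles every tuple
# comparison and the strings are never compared at all.
print(text_first, int_first)
```

With a less selective leading int column the savings shrink, which is exactly the caveat raised later in the thread about distinct-value counts.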
Re: [HACKERS] Optimizer on sort aggregate
On Sat, Oct 18, 2014 at 5:10 AM, Feng Tian feng...@gmail.com wrote: Hi, Consider the following queries. create table t(i int, j int, k int, t text); insert into t select i, i % 100, i % 1000, 'AABBCCDD' || i from generate_series(1, 100) i; ftian=# explain select count(distinct j) from t group by t, i; QUERY PLAN GroupAggregate (cost=158029.84..178029.84 rows=100 width=22) -> Sort (cost=158029.84..160529.84 rows=100 width=22) Sort Key: t, i -> Seq Scan on t (cost=0.00..17352.00 rows=100 width=22) (4 rows) The query, select count(distinct j) from t group by t, i; runs for 35 seconds. However, if I change the query to, select count(distinct j) from t group by i, t; -- note switching the ordering select count(distinct j) from t group by decode(t, 'escape'), i; -- convert t to bytea Run times are just about 5 and 6.5 seconds. The reason is clear, compare a string with collation is slow, which is well understood by pg hackers. However, here, the sorting order is forced by the planner, not user. Planner can do the following optimizations, 1. for the sort we generated for sort agg, planner can switch column ordering, put int before string, 2. for the sort we generated for sort agg, use bytea compare instead of string compare. They will bring big improvement to this common query. Is this something reasonable? It seems like it might be worth looking into, but I think it's likely more complex than sorting the group by order into narrowest column first. If the query was: select count(distinct j) from t group by t, i order by t, i; Then if that was rewritten to make the group by i,t then the planner would then need to perform an additional sort on t,i to get the correct order by for the final result, which may or may not be faster, it would depend on how many groups there were to sort. Though I guess you could possibly just skip the optimisation if the next node up didn't need any specific ordering. I also wonder if taking into account the column's width is not quite enough. 
For example if the 'i' column only had 1 distinct value, then performing the group by on 'i' first wouldn't help at all. Keep in mind that the columns could be much closer in width than in your example, e.g int and bigint. Perhaps you'd need to include some sort of heuristics to look at the number of distinct values and the width, and form some sort of weighted estimates about which column to put first. Regards David Rowley
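David's suggestion of weighting width against the number of distinct values could look roughly like this. A hypothetical Python sketch, not planner code: the ColumnStats fields mirror pg_stats.avg_width and n_distinct, and the width-per-distinct-value score is an invented heuristic, not anything PostgreSQL implements.

```python
from dataclasses import dataclass

@dataclass
class ColumnStats:
    name: str
    avg_width: int     # like pg_stats.avg_width, in bytes
    n_distinct: float  # like pg_stats.n_distinct (estimated)

def order_group_by_columns(cols):
    # Rank columns by an invented score: cheap (narrow) and selective
    # (many distinct values) columns sort to the front of the key.
    return sorted(cols, key=lambda c: c.avg_width / max(c.n_distinct, 1.0))

# The thread's example: an int and a text column, both highly distinct,
# so the narrow int column wins the first slot.
cols = [ColumnStats('t', avg_width=18, n_distinct=1_000_000),
        ColumnStats('i', avg_width=4, n_distinct=1_000_000)]
print([c.name for c in order_group_by_columns(cols)])  # ['i', 't']

# David's counterexample: if 'i' has a single distinct value, putting it
# first buys nothing, so it ranks last despite being narrower.
cols2 = [ColumnStats('t', avg_width=18, n_distinct=1_000_000),
         ColumnStats('i', avg_width=4, n_distinct=1.0)]
print([c.name for c in order_group_by_columns(cols2)])  # ['t', 'i']
```

A real implementation would also have to fold in per-type comparison cost (int vs. collated text), which this score ignores.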
Re: [HACKERS] Optimizer on sort aggregate
On Sat, Oct 18, 2014 at 12:35 PM, Tatsuo Ishii is...@postgresql.org wrote: The query, select count(distinct j) from t group by t, i; runs for 35 seconds. However, if I change the query to, select count(distinct j) from t group by i, t; -- note switching the ordering select count(distinct j) from t group by decode(t, 'escape'), i; -- convert t to bytea Run times are just about 5 and 6.5 seconds. The reason is clear, compare a string with collation is slow, which is well understood by pg hackers. However, here, the sorting order is forced by the planner, not user. Planner can do the following optimizations, Interesting. I got following result: test=# explain analyze select count(distinct j) from t group by t, i; QUERY PLAN -- GroupAggregate (cost=137519.84..157519.84 rows=100 width=22) (actual time=1332.937..2431.238 rows=100 loops=1) Group Key: t, i -> Sort (cost=137519.84..140019.84 rows=100 width=22) (actual time=1332.922..1507.413 rows=100 loops=1) Sort Key: t, i Sort Method: external merge Disk: 33232kB -> Seq Scan on t (cost=0.00..17352.00 rows=100 width=22) (actual time=0.006..131.406 rows=100 loops=1) Planning time: 0.031 ms Execution time: 2484.271 ms (8 rows) Time: 2484.520 ms test=# explain analyze select count(distinct j) from t group by i, t; QUERY PLAN -- GroupAggregate (cost=137519.84..157519.84 rows=100 width=22) (actual time=602.510..1632.087 rows=100 loops=1) Group Key: i, t -> Sort (cost=137519.84..140019.84 rows=100 width=22) (actual time=602.493..703.274 rows=100 loops=1) Sort Key: i, t Sort Method: external sort Disk: 33240kB -> Seq Scan on t (cost=0.00..17352.00 rows=100 width=22) (actual time=0.014..129.213 rows=100 loops=1) Planning time: 0.176 ms Execution time: 1685.575 ms (8 rows) Time: 1687.641 ms Not so big difference here (maybe because I use SSD) but there is still about 50% difference in execution time. Note that I disable locale support. 
I think this is more likely your locale settings, as if I do: create table t(i int, j int, k int, t text collate C); The GROUP BY t,i runs about 25% faster. I've not looked at it yet, but Peter G's patch here https://commitfest.postgresql.org/action/patch_view?id=1462 will quite likely narrow the performance gap between the 2 queries. Regards David Rowley
Re: [HACKERS] Materialized views don't show up in information_schema
* Nicolas Barbier (nicolas.barb...@gmail.com) wrote: 2014-10-16 Stephen Frost sfr...@snowman.net: Alright, coming back to this, I have to ask- how are matviews different from views from the SQL standard's perspective? Matviews that are always up to date when you access them are semantically exactly the same as normal views. Matviews that can get out of date, however, are not. And when we have matviews which can be kept up to date..? Thanks, Stephen
Re: [HACKERS] Materialized views don't show up in information_schema
* Peter Eisentraut (pete...@gmx.net) wrote: On 10/16/14 9:45 AM, Stephen Frost wrote: Alright, coming back to this, I have to ask- how are matviews different from views from the SQL standard's perspective? I tried looking through the standard to figure it out (and I admit that I probably missed something), but the only thing appears to be a statement in the standard that (paraphrased) functions are run when the view is queried, and that strikes me as a relatively minor point... To me, the main criterion is that you cannot DROP VIEW a materialized view. That is an entirely correctable situation. We don't require 'DROP UNLOGGED TABLE'. Generally, if the information schema claims that a view/table/function/etc. named foo exists, then I should be able to operate on foo using the basic operations for a view/table/function/etc. of that name. I do think DROP VIEW is a basic operation for a view. Others might disagree. This strikes me as a reason to allow DROP VIEW and perhaps other operations against a matview, not as a reason why matviews aren't views. More subtly, if we claim that a materialized view is a view, then we cannot have asynchronously updated materialized views, because then we have different semantics. This is, at least, a reason I can understand, though I'm not sure I see it as sufficient to say matviews are so different from views that they shouldn't be listed as such. Thanks, Stephen
Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)
On Fri, Oct 17, 2014 at 3:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-10-16 10:06:59 -0400, Tom Lane wrote: No, it wasn't. I'm not convinced either that that patch will get in at all, or that it has to have regression tests of that particular form, or that such a switch would be sufficient to make such tests platform independent. Why should the EXPLAIN ANALYZE output without timing information be less consistent for sensibly selected cases than EXPLAIN itself? To take just one example, the performance numbers that get printed for a sort, such as memory consumption, are undoubtedly platform-dependent. Maybe your idea of sensibly selected cases excludes sorting, but I don't find such an argument terribly convincing. I think if we go down this road, we are going to end up with an EXPLAIN that has one hundred parameters turning on and off tiny pieces of the output, none of which are of any great use for anything except the regression tests. I don't want to go there. It would be a lot better to expend the effort on a better regression testing infrastructure that wouldn't *need* bitwise-identical output across platforms. (mysql is ahead of us in that department: they have some hacks for selective matching of the output.) I saw this, and I was about to ask the same question as Andres, but I think I now see what you're worried about. Next we'd need a flag to disable external disk sort sizes too... Perhaps we could introduce some sort of wildcard matching in the regression tests. So that we could stick something like: Execution time: * ms Into the expected results, though, probably we'd need to come up with some wildcard character which is a bit less common than *. It might be hard to generate a useful diff with this for when a test fails, but maybe it'd be good enough to just run the two files through this wildcard matching programme and then just do a diff if it fails. Regards David Rowley
Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)
David Rowley dgrowle...@gmail.com writes: On Fri, Oct 17, 2014 at 3:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: I don't want to go there. It would be a lot better to expend the effort on a better regression testing infrastructure that wouldn't *need* bitwise-identical output across platforms. (mysql is ahead of us in that department: they have some hacks for selective matching of the output.) Perhaps we could introduce some sort of wildcard matching in the regression tests. So that we could stick something like: Execution time: * ms Into the expected results, though, probably we'd need to come up with some wildcard character which is a bit less common than * I was imagining that we might as well allow regexp matching, so you could be as specific as Execution time: \d+\.\d+ ms if you had a mind to. But with or without that, it would be difficult to pick a meta-character that never appears in expected-output files today. What we'd probably want to do (again, I'm stealing ideas from what I remember of the mysql regression tests) is add metasyntax to switch between literal and wildcard/regexp matching. So perhaps an expected file could contain something like -- !!match regexp ... expected output including regexps ... -- !!match literal ... normal expected output here Not sure how we get there without writing our own diff engine though :-(. regards, tom lane
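The proposed '-- !!match' metasyntax could be prototyped without a custom diff engine, at least for whole-file pass/fail. A speculative Python sketch, assuming the hypothetical syntax from the message above: lines following '-- !!match regexp' are matched with re.fullmatch, lines following '-- !!match literal' are compared verbatim.

```python
import re

def matches(expected_lines, actual_lines):
    """Compare actual output to an expected file that may switch between
    'literal' and 'regexp' matching via '-- !!match MODE' directives."""
    mode = 'literal'          # literal matching until told otherwise
    exp = []
    for line in expected_lines:
        if line.startswith('-- !!match '):
            mode = line.split()[-1]
            continue          # directives are not part of the output
        exp.append((mode, line))
    if len(exp) != len(actual_lines):
        return False
    for (m, e), a in zip(exp, actual_lines):
        if m == 'regexp':
            if not re.fullmatch(e, a):
                return False
        elif e != a:
            return False
    return True

expected = ['-- !!match regexp',
            r'Execution time: \d+\.\d+ ms',
            '-- !!match literal',
            '(8 rows)']
print(matches(expected, ['Execution time: 2484.271 ms', '(8 rows)']))  # True
```

Producing a useful line-level diff on failure would need more work, as the thread notes, but pass/fail filtering like this covers the platform-dependent-numbers case.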
Re: [HACKERS] Optimizer on sort aggregate
Hi, David, Yes, switching the sort order would lose an interesting order, so if the user dictates ORDER BY t, i, the planner needs to fall back on its cost model. Estimating the cardinality of a GROUP BY is a much bigger topic than this thread. I feel sorting a string as if it were bytea is particularly interesting. I am aware of Peter G's patch and I think it is great, but for this sort agg case, first, I believe it is still slower than sorting bytea, and second, Peter G's patch depends on the data: a common long prefix, which is probably not so uncommon (for example, URLs sharing a domain prefix), will make the patch less effective. I don't see any downside of sorting as bytea, other than losing the interesting ordering. Thanks, Feng On Fri, Oct 17, 2014 at 4:36 PM, David Rowley dgrowle...@gmail.com wrote: On Sat, Oct 18, 2014 at 5:10 AM, Feng Tian feng...@gmail.com wrote: Hi, Consider the following queries. create table t(i int, j int, k int, t text); insert into t select i, i % 100, i % 1000, 'AABBCCDD' || i from generate_series(1, 100) i; ftian=# explain select count(distinct j) from t group by t, i; QUERY PLAN GroupAggregate (cost=158029.84..178029.84 rows=100 width=22) -> Sort (cost=158029.84..160529.84 rows=100 width=22) Sort Key: t, i -> Seq Scan on t (cost=0.00..17352.00 rows=100 width=22) (4 rows) The query, select count(distinct j) from t group by t, i; runs for 35 seconds. However, if I change the query to, select count(distinct j) from t group by i, t; -- note switching the ordering select count(distinct j) from t group by decode(t, 'escape'), i; -- convert t to bytea Run times are just about 5 and 6.5 seconds. The reason is clear, compare a string with collation is slow, which is well understood by pg hackers. However, here, the sorting order is forced by the planner, not user. Planner can do the following optimizations, 1. for the sort we generated for sort agg, planner can switch column ordering, put int before string, 2. 
for the sort we generated for sort agg, use bytea compare instead of string compare. They will bring big improvement to this common query. Is this something reasonable? It seems like it might be worth looking into, but I think it's likely more complex than sorting the group by order into narrowest column first. If the query was: select count(distinct j) from t group by t, i order by t, i; Then if that was rewritten to make the group by i,t then the planner would then need to perform an additional sort on t,i to get the correct order by for the final result, which may or may not be faster, it would depend on how many groups there were to sort. Though I guess you could possibly just skip the optimisation if the next node up didn't need any specific ordering. I also wonder if taking into account the column's width is not quite enough. For example if the 'i' column only had 1 distinct value, then performing the group by on 'i' first wouldn't help at all. Keep in mind that the columns could be much closer in width than in your example, e.g int and bigint. Perhaps you'd need to include some sort of heuristics to look at the number of distinct values and the width, and form some sort of weighted estimates about which column to put first. Regards David Rowley
Re: [HACKERS] Directory/File Access Permissions for COPY and Generic File Access Functions
On Thu, Oct 16, 2014 at 12:01:28PM -0400, Stephen Frost wrote: This started out as a request for a non-superuser to be able to review the log files without needing access to the server. Now, things can certainly be set up on the server to import *all* logs and then grant access to a non-superuser, but generally it's "I need to review the log from X to Y" and not *all* logs need to be stored or kept in PG. Why is this patch showing up before being discussed? You are having to back into the discussion because of this. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +