Re: [HACKERS] Proposal - Support for National Characters functionality
On Mon, Jul 15, 2013 at 4:37 AM, Robert Haas robertmh...@gmail.com wrote:
> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>> Yes, as far as I know almost everyone uses UTF-8 without problems. I haven't seen a request for multi-encoding support in a long time.
>
> Well, not *everything* can be represented as UTF-8; I think this is particularly an issue with Asian languages.

What cannot be represented as UTF-8? UTF-8 can represent every character in the Unicode character set, whereas UTF-16 can encode code points 0 through 0x10FFFF. Does support for alternative multi-byte encodings have something to do with the Han unification controversy? I don't know terribly much about this, so apologies if that's just wrong.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Regex pattern with shorter back reference does NOT work as expected
Hi Tom,

On Sat, Jul 13, 2013 at 10:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> I wrote:
>> Jeevan Chalke jeevan.cha...@enterprisedb.com writes:
>>> Following example does not work as expected:
>>> -- Should return TRUE but returns FALSE
>>> SELECT 'Programmer' ~ '(\w).*?\1' as t;
>>
>> This is clearly broken, but I'm uncomfortable with the proposed patch. As written, it changes behavior for both the shortest-match-preferred and longest-match-preferred cases; but you've offered no evidence that the longest-match case is broken. Maybe it is --- it's sure not obvious why it's okay to abandon the search early in this case. But I think we'd have been likely to hear about it before now if there were a matching failure in that path, since longest-preferred is so much more widely used.
>
> After reflecting on this for a while, I think that the longest-preferred case is indeed also wrong in theory, but it's unreachable, which explains the lack of bug reports. To get to the "no point in trying again" code, we have to have a success of the DFA match followed by a failure of the cdissect match, which essentially means that the string would match if we didn't have to constrain some backref to exactly match the string matched by its referent. Now, in the longest-preferred case, the supposed early-exit test is "end == begin", i.e. the tentative match was of zero length overall. I can't envision any situation in which a backref constraint could fail given that, because both the referent pattern piece and the backref piece would have to be matching zero-length substrings as well. There could be anchor constraints, lookahead constraints, and so forth in play, but those would all have been checked by the DFA, so they're not going to result in cdissect failures. Hence, the "end == begin" test will simply never succeed.

Thanks for the explanation. For the last couple of days I was trying hard to find a test case which triggers "end == begin", but didn't find one. This explanation proves that it will never reach that point.
So I give up now.

> On the other hand, the check made in the shortest-preferred case is going to succeed if the tentative match was of maximal, not minimal, length, and that's certainly a possible case. So I think your patch is right, although I'd be inclined to refactor the code to have just one test on "shorter", like this:
>
>     /* go around and try again, if possible */
>     if (shorter)
>     {
>         if (end == estop)
>             break;
>         estart = end + 1;
>     }
>     else
>     {
>         if (end == begin)
>             break;
>         estop = end - 1;
>     }
>
> so as to make it clearer that we're just defending against selecting an impossible new estart or estop location. (Although I argued above that the "end == begin" test can't succeed, I wouldn't care to remove it entirely here.)

OK. Looks good to me. Thanks.

> regards, tom lane

--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
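For readers following along without a PostgreSQL build at hand, the behaviour the original complaint expects can be sketched with an analogous non-greedy backreference in Python. (This is only an illustration of the semantics under discussion: Python's `re` is a different engine from the Spencer engine PostgreSQL uses, so it is not affected by the bug itself.)

```python
import re

# '(\w).*?\1' looks for a word character that occurs again later,
# with the shortest possible gap (non-greedy '.*?').
# In 'Programmer', the 'r' at index 1 recurs at index 4, so a match
# must be found -- the TRUE result the bug report expected.
match = re.search(r'(\w).*?\1', 'Programmer')
print(match is not None)   # a match exists
print(match.group(0))      # the matched substring
```

The shortest-preferred semantics show up in which substring is matched: the lazy `.*?` stops at the nearest repeat of the captured character.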
Re: [HACKERS] Proposal - Support for National Characters functionality
> On Mon, Jul 15, 2013 at 4:37 AM, Robert Haas robertmh...@gmail.com wrote:
>> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>>> Yes, as far as I know almost everyone uses UTF-8 without problems. I haven't seen a request for multi-encoding support in a long time.
>>
>> Well, not *everything* can be represented as UTF-8; I think this is particularly an issue with Asian languages.
>
> What cannot be represented as UTF-8? UTF-8 can represent every character in the Unicode character set, whereas UTF-16 can encode code points 0 through 0x10FFFF. Does support for alternative multi-byte encodings have something to do with the Han unification controversy? I don't know terribly much about this, so apologies if that's just wrong.

There's a famous problem regarding conversion between Unicode and other encodings, such as Shift_JIS. There has been a lot of discussion on this. Here is one article from Microsoft: http://support.microsoft.com/kb/170559/EN-US

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Proposal - Support for National Characters functionality
>> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>>> Yes, as far as I know almost everyone uses UTF-8 without problems. I haven't seen a request for multi-encoding support in a long time.
>>
>> Well, not *everything* can be represented as UTF-8; I think this is particularly an issue with Asian languages. If we chose to do it, I think that per-column encoding support would end up looking a lot like per-column collation support: it would be yet another per-column property along with typoid, typmod, and typcollation. I'm not entirely sure it's worth it, although FWIW I do believe Oracle has something like this.
>
> Yes, the idea is that users will be able to declare columns of type NCHAR or NVARCHAR which will use the pre-determined encoding type. If we say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding irrespective of the database encoding. It will be up to us to restrict which Unicode encodings we want to support for NCHAR/NVARCHAR columns. This is based on my interpretation of the SQL standard. As you allude to above, Oracle has similar behaviour (they support UTF-16 as well). Support for UTF-16 will be difficult without linking with some external libraries such as ICU.

Can you please elaborate more on this? Why exactly do you need ICU? Also, I don't understand why you need UTF-16 support as a database encoding, because UTF-8 and UTF-16 are logically equivalent; they are just different representations (encodings) of Unicode. That means that if we already support UTF-8 (I'm sure we already do), there's no particular reason we need to add UTF-16 support. Maybe you just want to support UTF-16 as a client encoding?

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
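Tatsuo's point that UTF-8 and UTF-16 encode the same code-point repertoire can be checked in a few lines of Python (illustrative only; nothing here is PostgreSQL-specific):

```python
# Any Unicode string round-trips losslessly through either encoding,
# so the set of representable texts is identical for UTF-8 and UTF-16.
s = 'Grüße, 世界! \U0001F600'  # Latin, CJK, and an astral-plane emoji
assert s.encode('utf-8').decode('utf-8') == s
assert s.encode('utf-16').decode('utf-16') == s
print('both encodings represent the same text')
```

The two differ only in byte representation (and hence size and endianness concerns), never in which characters they can hold.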
Re: [HACKERS] Proposal - Support for National Characters functionality
On Mon, Jul 15, 2013 at 8:58 AM, Tatsuo Ishii is...@postgresql.org wrote:
> Also I don't understand why you need UTF-16 support as a database encoding, because UTF-8 and UTF-16 are logically equivalent; they are just different representations (encodings) of Unicode. That means that if we already support UTF-8 (I'm sure we already do), there's no particular reason we need to add UTF-16 support.

To be fair, there is a small reason to support UTF-16 even with UTF-8 available. I personally do not find it compelling, but perhaps I am not best placed to judge such things. As Wikipedia says in the English UTF-8 article:

"Characters U+0800 through U+FFFF use three bytes in UTF-8, but only two in UTF-16. As a result, text in (for example) Chinese, Japanese or Hindi could take more space in UTF-8 if there are more of these characters than there are ASCII characters. This happens for pure text but rarely for HTML documents. For example, both the Japanese UTF-8 and the Hindi Unicode articles on Wikipedia take more space in UTF-16 than in UTF-8."

This is the only advantage of UTF-16 over UTF-8 as a server encoding. I'm inclined to take the fact that there have been so few (no?) complaints from PostgreSQL's large Japanese user base about the lack of UTF-16 support as suggesting that it isn't considered a compelling feature in the CJK realm.

--
Peter Geoghegan
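Wikipedia's size claim is easy to verify; Python is used here only as a convenient calculator:

```python
# Code points in U+0800..U+FFFF take 3 bytes in UTF-8 but 2 in UTF-16.
for ch in ['あ', '漢', 'ह']:          # kana, Han character, Devanagari
    u8 = len(ch.encode('utf-8'))
    u16 = len(ch.encode('utf-16-le'))  # -le variant avoids counting a BOM
    print(ch, u8, u16)                 # each line: <ch> 3 2

# ASCII goes the other way: 1 byte in UTF-8, 2 in UTF-16.
print(len('a'.encode('utf-8')), len('a'.encode('utf-16-le')))  # 1 2
```

So whether UTF-16 saves space depends entirely on the ratio of BMP non-ASCII characters to ASCII characters in the text, which is the point of the quoted paragraph.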
Re: [HACKERS] Proposal: template-ify (binary) extensions
On 2013-07-14 23:49:47 -0400, Robert Haas wrote:
> On Fri, Jul 5, 2013 at 4:27 PM, Markus Wanner mar...@bluegap.ch wrote:
>> One way to resolve that - and that seems to be the direction Dimitri's patch is going - is to make the extension depend on its template. (And thereby breaking the mental model of a template, IMO. In the spirit of that mental model, one could argue that code for stored procedures shouldn't be duplicated, but instead should just reference its ancestor.)
>
> The security problems with this model have been discussed in every email thread about the extension template feature. There is a lot of (well-founded, IMHO) resistance to the idea of letting users install C code via an SQL connection.

I still fail to see how that argument has much merit:

CREATE EXTENSION adminpack;

CREATE OR REPLACE FUNCTION pg_catalog.pg_binary_file_write(text, bytea, boolean)
 RETURNS bigint
 LANGUAGE c
 STRICT
AS '$libdir/adminpack', $function$pg_file_write$function$;

SELECT pg_binary_file_write('some-path.so', 'some-so-as-bytea'::bytea, 1);

LOAD '/var/lib/postgresql/9.2/main/some-path.so';

All done with the default contribs, without even a PL. It's obviously even more trivial if you have access to plpython or plperlu. You can do similar nastiness without even contrib installed, by misusing binary COPY or possibly even tuplesort tapes.

A superuser can execute native code as the postgres user. That's it.

Now, I don't think Markus's proposal is a good idea on *other* grounds though... but that's for another email.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Proposal: template-ify (binary) extensions
Robert,

thanks for answering. I think you responded to the (or some) idea in general, as I didn't see any relation to the quoted part. (Otherwise this is a hint that I've not been clear enough.)

On 07/15/2013 05:49 AM, Robert Haas wrote (in a slightly different order):
> There is a lot of (well-founded, IMHO) resistance to the idea of letting users install C code via an SQL connection.

Nowhere did I propose anything close to that. I'm in full support of the resistance. Pitchforks and torches ready to rumble. :-)

> The security problems with this model have been discussed in every email thread about the extension template feature.

I read through a lot of these discussions recently, but mostly found misunderstanding and a failure to distinguish between the available extension (the template) and the created extension (the instantiation). I think this is a new proposal, which I didn't ever see discussed. It does not have much, if anything, to do with Dimitri's patch, except that the patch made the point obvious to me, as it continues to mix the two mental models.

I don't see much of a difference, security-wise, between loading the DSO at extension creation time vs. loading it at every backend start. More details below.

My major gripe with the current way extensions are handled is that SQL scripts are treated as a template, whereas the DSO is only pointed to. Changing the SQL scripts in your extension directory has no effect on the installed extension. Modifying the DSO file certainly has. That's the inconsistency that I'd like to get rid of. A security analysis of the proposal follows.

Granted, the internalization of the DSO is somewhat critical to implement. Running as a non-root user, Postgres has less power than root and can certainly not protect the internalized DSO from modification by root. It can, however, store the DSO well protected from other users (e.g. in a directory within $PGDATA). Relying on the external DSO only exactly once can provide additional safety.
Consider an extensions directory that's accidentally world-writable. As it stands, an attacker can modify the DSO and gain control as soon as a new connection loads it. With my proposal, the attacker would have to wait until CREATE EXTENSION - a point in time where the newly created extension is more likely to be tested and cross-checked.

I'm arguing both of these issues are very minor and negligible, security-wise. Barring other issues, I conclude there's no security impact from this proposal.

Arguably, internalizing the DSO (together with the SQL) protects far better against accidental modifications (e.g. removing the DSO by uninstalling the extension template via the distribution's packaging system while some database still has the extension instantiated). That clearly is a usability issue and not a security issue, though.

If nothing else (and beneficial in either case): Postgres should probably check the permissions on the extension directory and at least emit a warning if it's world-writable. Or refuse to create (or even load) an extension, right away.

Regards

Markus Wanner
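Markus's suggested permission check is a simple stat-and-test. Here's a standalone sketch in Python (hypothetical, not PostgreSQL code; in the backend this would be C and would use ereport(WARNING, ...) rather than printing):

```python
import os
import stat
import sys
import tempfile

def check_extension_dir(path):
    """Return True (and warn) if the directory is world-writable."""
    mode = os.stat(path).st_mode
    world_writable = bool(mode & stat.S_IWOTH)
    if world_writable:
        print(f'WARNING: extension directory "{path}" is world-writable',
              file=sys.stderr)
    return world_writable

# Demonstrate on a directory we deliberately make world-writable.
d = tempfile.mkdtemp()
os.chmod(d, 0o777)
print(check_extension_dir(d))  # True: triggers the warning
os.chmod(d, 0o755)
print(check_extension_dir(d))  # False: silent
os.rmdir(d)
```

Whether to merely warn or to refuse CREATE EXTENSION outright is the policy question raised in the paragraph above.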
Re: [HACKERS] Proposal: template-ify (binary) extensions
On 2013-07-15 12:12:36 +0200, Markus Wanner wrote:
> Granted, the internalization of the DSO is somewhat critical to implement. Running as a non-root user, Postgres has less power than that and can certainly not protect the internalized DSO from modification by root. It can, however, store the DSO well protected from other users (e.g. in a directory within $PGDATA).

There are heaps of problems with that though. Think of noexec mounts or SELinux.

> Relying on the external DSO only exactly once can provide additional safety.

Not necessarily. Think of a distribution-provided upgrade with a security fix in an extension, on a machine with dozens of clusters. Now you need to make sure each and every one of them has the new .so.

> Consider an extensions directory that's accidentally world-writable. As it stands, an attacker can modify the DSO and gain control as soon as a new connection loads it. With my proposal, the attacker would have to wait until CREATE EXTENSION. A point in time where the newly created extension is more likely to be tested and cross-checked.

I think protecting against the case where such directories are writable is a rather bad argument. If anything screwed those up, the postgres binaries directory itself has quite likely also been changed.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Proposal: template-ify (binary) extensions
On 07/15/2013 12:05 PM, Andres Freund wrote:
> A superuser can execute native code as the postgres user. That's it.

Oh, I thought Robert meant postgres users, i.e. non-superusers. I hereby withdraw my pitchforks: I'm certainly not opposed to simplifying the life of superusers, who already have the power.

> Now, I don't think Markus's proposal is a good idea on *other* grounds though... but that's for another email.

Separation of concerns, huh? Good thing, thanks :-)

Regards

Markus Wanner
Re: [HACKERS] Proposal: template-ify (binary) extensions
On 2013-07-15 12:20:15 +0200, Markus Wanner wrote:
> On 07/15/2013 12:05 PM, Andres Freund wrote:
>> A superuser can execute native code as the postgres user. That's it.
>
> Oh, I thought Robert meant postgres users, i.e. non-superusers.

Oh, I am talking about *postgres* superusers ;). The example provided upthread doesn't require 'root' permissions, only database-level superuser permissions.

> I hereby withdraw my pitchforks: I'm certainly not opposed to simplifying the life of superusers, who already have the power.

I think what Dimitri has in mind could easily be delegating the dangerous work to a suid binary which does policy checking and the real install...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Proposal: template-ify (binary) extensions
On 07/15/2013 12:19 PM, Andres Freund wrote:
> On 2013-07-15 12:12:36 +0200, Markus Wanner wrote:
>> Granted, the internalization of the DSO is somewhat critical to implement. Running as a non-root user, Postgres has less power than that and can certainly not protect the internalized DSO from modification by root. It can, however, store the DSO well protected from other users (e.g. in a directory within $PGDATA).
>
> There are heaps of problems with that though. Think of noexec mounts or SELinux.

Good point. Note, however, that "internalize" doesn't necessarily mean exactly that approach. It could be yet another directory, even system-wide, which any distribution should well be able to prepare. I intentionally left the internalizing somewhat undefined. It could even be more equivalent to what is done with the SQL stuff, i.e. some system catalog. But that just poses other implementation issues. (Loading a DSO from memory doesn't sound very portable to me.)

>> Relying on the external DSO only exactly once can provide additional safety.
>
> Not necessarily. Think of a distribution-provided upgrade with a security fix in an extension.

Ugh.. only to figure out the patched DSO is incompatible with the old version of the SQL scripts? And therefore having to re-create the extension anyway? That only highlights why this difference in treatment of SQL and DSO is troublesome.

> On a machine with dozens of clusters. Now you need to make sure each and every one of them has the new .so.

Agreed. So this sounds like the other approach to unification may be more useful: linking the SQL scripts better and making them (re-)load from the extensions directory, just like the DSOs.

> I think protecting against the case where such directories are writable is a rather bad argument.

I agree. That's why I'm neglecting the security implications and stated there are none.
Regards

Markus Wanner
Re: [HACKERS] proposal: enable new error fields in plpgsql (9.4)
On Wed, Jul 3, 2013 at 5:16 PM, Noah Misch n...@leadboat.com wrote:
> On Wed, Jul 03, 2013 at 06:17:18AM +0200, Pavel Stehule wrote:
>> 2013/7/2 Noah Misch n...@leadboat.com:
>>> Here's a revision bearing the discussed name changes and protocol documentation tweaks, along with some cosmetic adjustments. If this seems good to you, I will commit it.
>>
>> +1
>
> Done. Rushabh, I neglected to credit you as a reviewer and realized it too late. Sorry about that.
>
> --
> Noah Misch
> EnterpriseDB http://www.enterprisedb.com

Sorry, somehow I missed this thread. Thanks Noah.

--
Rushabh Lathia
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 2013-07-15 16:34:07 +0400, ivan babrou wrote:
> Is there any hope to see it in libpq? If so, can anyone review the latest version of my patch?

A little bit of patience, please. There are heaps of other patches still being reviewed that were posted earlier than yours ;). I know it can be frustrating, but that doesn't make reviewer time grow on trees...

http://wiki.postgresql.org/wiki/CommitFest

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Patch to add regression tests for SCHEMA
On 9 July 2013 08:57, Fabien COELHO coe...@cri.ensmp.fr wrote:
> I cannot apply the patch, it seems to be truncated:

Hi Fabien,

Please find attached the updated patch. It seems the only difference between v5 and v6 is probably a newline at the end (line 291 was the last line); in fact meld doesn't even show anything. I'll try to be more careful though.

git reset --hard HEAD
git pull
patch -p1 < ../regress_schema_v6.patch
patch -p1 -R < ../regress_schema_v6.patch
patch -p1 < ../regress_schema_v6.patch
make clean
./configure --enable-depend --enable-coverage --enable-cassert --enable-debug
make -j3 check

Do let me know if something is still amiss.

--
Robins Tharakan
Re: [HACKERS] Add some regression tests for SEQUENCE
On 9 July 2013 08:41, Fabien COELHO coe...@cri.ensmp.fr wrote:
> I do not see any difference between both regress_sequence_v[45].patch. I guess you sent the earlier version.

Thanks Fabien. This was a wrong attachment to the email. Please find attached the updated patch (I've renamed v5 as v6 for clarity).

git reset --hard HEAD
git pull
patch -p1 < ../regress_sequence_v6.patch
patch -p1 -R < ../regress_sequence_v6.patch
patch -p1 < ../regress_sequence_v6.patch
make clean
./configure --enable-depend --enable-coverage --enable-cassert --enable-debug
make -j3 check

--
Robins Tharakan

regress_sequence_v6.patch
Description: Binary data
Re: [HACKERS] Proposal: template-ify (binary) extensions
Robert,

On 07/15/2013 12:12 PM, Markus Wanner wrote:
> On 07/15/2013 05:49 AM, Robert Haas wrote (in a slightly different order):
>> There is a lot of (well-founded, IMHO) resistance to the idea of letting users install C code via an SQL connection.
>
> Nowhere did I propose anything close to that. I'm in full support of the resistance. Pitchforks and torches ready to rumble. :-)

To clarify: in the above agreement, I had ordinary Postgres-level users in mind, who certainly shouldn't ever be able to upload and run arbitrary native code. I'm not quite as enthusiastic if you meant the system-level user that happens to have superuser rights on Postgres. As Andres pointed out, there are some more holes to plug if you want to lock down a superuser account. [1]

I'm not quite clear which kind of user you meant in your statement above.

Regards

Markus Wanner

[1]: I see the theoretical benefit of providing yet another layer of security. Closing the existing holes would require restricting LOAD, but that seems to suffice (given the sysadmin makes sure he doesn't accidentally provide any of the harmful extensions). How about the (optional?) ability to restrict LOAD to directories and files that are only root-writable? Is that enough for root to disallow the untrusted Postgres superuser to run arbitrary native code? Or does that still have other holes? How many real use cases does it break? How many does it fix?
Re: [HACKERS] Patch for fail-back without fresh backup
ToDo:
1. Currently this patch supports synchronous transfer only, so we can't set a different synchronous transfer mode for each server. We need to improve the patch to support the following cases:
   - SYNC standby plus a separate ASYNC failback-safe standby
   - ASYNC standby plus a separate ASYNC failback-safe standby
2. We have not measured performance yet.

Here are the test results, showing the performance overhead of the patch (failback_safe_standby_v4.patch). Tests were carried out in two different scenarios:
1. Tests with fast transaction workloads.
2. Tests with large loads.

Test Type-1: Tests with pgbench (fast transaction workloads)

Notes:
1. These tests measure the performance overhead caused by the patch for fast transaction workloads.
2. Tests were performed with the pgbench benchmark; the performance measurement factor is the TPS value.
3. Values represent the TPS for 4 runs; the last value is the average of all the runs.

Settings for tests:
  transaction type: TPC-B (sort of)
  scaling factor: 300
  query mode: simple
  number of clients: 150
  number of threads: 1
  duration: 1800 s

Results:
1) Synchronous Replication: (753.06, 748.81, 748.38, 747.21, Avg 747.2)
2) Synchronous Replication + Failsafe standby (commit): (729.13, 724.33, 713.59, 710.79, Avg 719.46)
3) Synchronous Replication + Failsafe standby (all): (692.08, 688.08, 711.23, 711.62, Avg 700.75)
4) Asynchronous Replication: (1008.42, 993.39, 986.80, 1028.46, Avg 1004.26)
5) Asynchronous Replication + Failsafe standby (commit): (974.49, 978.60, 969.11, 957.18, Avg 969.84)
6) Asynchronous Replication + Failsafe standby (data_flush): (1011.79, 992.05, 1030.20, 940.50, Avg 993.63)

In the above test results the performance numbers are very close to each other, and they also show variation because of noise; hence the following is an approximate conclusion about the overhead of the patch:

1. Streaming replication + synchronous_transfer (all, data_flush):
   a) On average, synchronous replication combined with synchronous_transfer (all) causes 6.21% performance overhead.
   b) On average, asynchronous streaming replication combined with synchronous_transfer (data_flush) causes 1.05% performance overhead.
2. Streaming replication + synchronous_transfer (commit):
   a) On average, synchronous replication combined with synchronous_transfer (commit) causes 3.71% performance overhead.
   b) On average, asynchronous streaming replication combined with synchronous_transfer (commit) causes 3.42% performance overhead.

Test Type-2: Tests with pgbench -i (large loads)

Notes:
1. These tests measure the performance overhead caused by the patch for large loads and index builds.
2. Tests were performed with pgbench -i (initialization of test data, i.e. the time taken for creating the pgbench tables, inserting tuples and building primary keys).
3. The performance measurement factor is the wall clock time for pgbench -i (measured with the time command).
4. Values represent the wall clock time for 4 runs; the last value is the average of all the runs.

pgbench settings:
  Scale factor: 300 (database size: 4.3873 GB)

Test results:
1) Synchronous Replication: (126.98, 133.83, 127.77, 129.70, Avg 129.57) seconds
2) Synchronous Replication + synchronous_transfer (commit): (132.87, 125.85, 133.91, 134.61, Avg 131.81) seconds
3) Synchronous Replication + synchronous_transfer (all): (133.59, 132.82, 134.20, 135.22, Avg 133.95) seconds
4) Asynchronous Replication: (126.75, 136.95, 130.42, 127.77, Avg 130.47) seconds
5) Asynchronous Replication + synchronous_transfer (commit): (128.13, 133.06, 127.62, 130.70, Avg 129.87) seconds
6) Asynchronous Replication + synchronous_transfer (data_flush): (134.55, 139.90, 144.47, 143.85, Avg 140.69) seconds

Again, the numbers are close to each other and noisy; the approximate conclusion about the overhead of the patch is:

1. Streaming replication + synchronous_transfer (all, data_flush):
   a) On average, synchronous replication combined with synchronous_transfer (all) causes 3.38% performance overhead.
   b) On average, asynchronous streaming replication combined with synchronous_transfer (data_flush) causes 7.83% performance overhead.
2. Streaming replication + synchronous_transfer (commit):
   a) On average, synchronous replication combined with synchronous_transfer (commit) causes 1.72% performance overhead.
   b) On average, asynchronous streaming replication combined with synchronous_transfer (commit) causes (-0.45)% performance overhead.

The test results for both cases (large loads and fast transactions) show variation because of noise, but we can observe that the patch causes approximately 3-4% performance overhead.

Regards,
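The overhead percentages quoted above are just the relative drop in the averaged figures. Recomputing a few of the Test Type-1 numbers (Python used only as a calculator):

```python
def overhead_pct(baseline_avg, patched_avg):
    """Relative performance drop, in percent, of patched vs. baseline."""
    return (baseline_avg - patched_avg) / baseline_avg * 100

# Averages reported in Test Type-1 (TPS; higher is better):
print(overhead_pct(747.2, 700.75))    # sync repl + transfer(all),    ~6.2%
print(overhead_pct(747.2, 719.46))    # sync repl + transfer(commit), ~3.7%
print(overhead_pct(1004.26, 969.84))  # async repl + transfer(commit), ~3.4%
```

For the Test Type-2 wall-clock times the same formula applies with the sign flipped in interpretation (longer time = overhead), which is how the (-0.45)% figure arises: the patched average there happened to be slightly faster than the baseline, i.e. within the noise.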
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Sun, Jul 7, 2013 at 9:24 AM, Simon Riggs si...@2ndquadrant.com wrote:
> On 3 January 2012 18:42, Tom Lane t...@sss.pgh.pa.us wrote:
>> I wrote:
>>> Another point that requires some thought is that switching SnapshotNow to be MVCC-based will presumably result in a noticeable increase in each backend's rate of wanting to acquire snapshots.
>>
>> BTW, I wonder if this couldn't be ameliorated by establishing some ground rules about how up-to-date a snapshot really needs to be. Arguably, it should be okay for successive SnapshotNow scans to use the same snapshot as long as we have not acquired a new lock in between. If not, reusing an old snap doesn't introduce any race condition that wasn't there already.
>
> Now that this has been implemented using the above design, we can resubmit the lock level reduction patch, with thanks to Robert. The submitted patch passes the original complaint/benchmark.
>
> Changes:
> * various forms of ALTER TABLE, notably ADD constraint and VALIDATE
> * CREATE TRIGGER
>
> One minor correction to earlier thinking with respect to toast tables. That might be later relaxed. Full tests including proof of lock level reductions, plus docs.

I'm quite glad to see this patch back again for another round. I haven't reviewed it in detail yet, so the purpose of this email is just to lay out some general areas of concern and food for thought rather than to critique anything in the patch too specifically.

Generally, the question on the table is: to what extent do MVCC catalog scans make the world safe for concurrent DDL, or, put negatively, what hazards remain? Noah discovered an interesting one recently: apparently, the relcache machinery has some logic in it that depends on the use of AccessExclusiveLock in subtle ways. I'm going to attempt to explain it, and maybe he can jump in and explain it better.
Essentially, the problem is that when a relcache reload occurs, certain data structures (like the tuple descriptor, but there are others) are compared against the old version of the same data structure. If there are no changes, we do nothing; else, we free the old one and install the new one. The reason why we don't free the old one and install the new one unconditionally is because other parts of the backend might have pointers to the old data structure, so just replacing it all the time would result in crashes. Since DDL requires AccessExclusiveLock + CheckTableNotInUse(), any actual change to the data structure will happen when we haven't got any pointers anyway. A second thing I'm concerned about is that, although MVCC catalog scans guarantee that we won't miss a concurrently-updated row entirely, or see a duplicate, they don't guarantee that the rows we see during a scan of catalog A will be consistent with the rows we see during a scan of catalog B moments later. For example, hilarity will ensue if relnatts doesn't match what we see in pg_attribute. Now I don't think this particular example matters very much because I think there are probably lots of other things that would also break if we try to add a column without a full table lock, so we're probably doomed there anyway. But there might be other instances of this problem that are more subtle. I'll try to find some time to look at this in more detail. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add regression tests for ROLE (USER)
On 6 July 2013 20:25, Robins Tharakan thara...@gmail.com wrote: Do let me know your view on this second point, so that I can remove these tests if so required. Hi, Please find attached the updated patch. It addresses the first issue regarding reducing the repeated CREATE / DROP ROLEs. It still doesn't address the excessive (syntactical) checks though. I am still unclear as to how to identify which checks to skip. (As in, although I have a personal preference of checking everything, my question probably wasn't clear in my previous email. I was just asking 'how' to know which checks to not perform.) Should a 'syntax error' in the message be considered as an unnecessary check? If so, it's easier for me to identify which checks to skip, while creating future tests. -- Robins Tharakan regress_role_v4.patch Description: Binary data
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 2013-07-15 10:06:31 -0400, Robert Haas wrote: Noah discovered an interesting one recently: apparently, the relcache machinery has some logic in it that depends on the use of AccessExclusiveLock in subtle ways. I'm going to attempt to explain it, and maybe he can jump in and explain it better. Essentially, the problem is that when a relcache reload occurs, certain data structures (like the tuple descriptor, but there are others) are compared against the old version of the same data structure. If there are no changes, we do nothing; else, we free the old one and install the new one. The reason why we don't free the old one and install the new one unconditionally is because other parts of the backend might have pointers to the old data structure, so just replacing it all the time would result in crashes. Since DDL requires AccessExclusiveLock + CheckTableNotInUse(), any actual change to the data structure will happen when we haven't got any pointers anyway. Aren't we swapping in the new data on a data level for that reason? See RelationClearRelation(). A second thing I'm concerned about is that, although MVCC catalog scans guarantee that we won't miss a concurrently-updated row entirely, or see a duplicate, they don't guarantee that the rows we see during a scan of catalog A will be consistent with the rows we see during a scan of catalog B moments later. For example, hilarity will ensue if relnatts doesn't match what we see in pg_attribute. Now I don't think this particular example matters very much because I think there are probably lots of other things that would also break if we try to add a column without a full table lock, so we're probably doomed there anyway. But there might be other instances of this problem that are more subtle. Hm. Other transactions basically should be protected against this because they can't see uncommitted data anyway, right? ISTM that our own session basically already has to be safe against hilarity of this kind, right? 
I am concerned about stuff like violating constraints because we haven't yet seen the new constraint definition and the like... Or generating wrong plans because we haven't seen that somebody has dropped a constraint and inserted data violating the old one. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] vacuumlo - use a cursor
On Sun, Jul 7, 2013 at 9:30 AM, Josh Kupershmidt schmi...@gmail.com wrote: On Mon, Nov 12, 2012 at 5:14 PM, Andrew Dunstan and...@dunslane.net wrote: vacuumlo is rather simpleminded about dealing with the list of LOs to be removed - it just fetches them as a straight resultset. For one of our clients this resulted in an out-of-memory condition. Wow, they must have had a ton of LOs. With about 2M entries to pull from vacuum_l, I observed unpatched vacuumlo using only about 45MB RES. Still, the idea of using a cursor for the main loop seems like a reasonable idea. The attached patch tries to remedy that by using a cursor instead. If this is wanted I will add it to the next commitfest. The actual changes are very small - most of the patch is indentation changes due to the introduction of an extra loop. I had some time to review this; some comments about the patch:
1. I see this new compiler warning: vacuumlo.c: In function ‘vacuumlo’: vacuumlo.c:306:5: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 4 has type ‘long int’ [-Wformat]
2. It looks like the patch causes all the intermediate result-sets fetched from the cursor to be leaked, rather negating its stated purpose ;-) The PQclear() call should be moved up into the main loop. With this fixed, I confirmed that vacuumlo now consumes a negligible amount of memory when chewing through millions of entries.
3. A few extra trailing whitespaces were added.
4. The FETCH FORWARD count comes from transaction_limit. That seems like a good-enough guesstimate, but maybe a comment could be added to rationalize?
Some suggested changes attached with v2 patch (all except #4). I've committed this version of the patch, with a slight change to one of the error messages. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Proposal - Support for National Characters functionality
On 7/15/13 1:26 AM, Arulappan, Arul Shaji wrote: Yes, the idea is that users will be able to declare columns of type NCHAR or NVARCHAR which will use the pre-determined encoding type. If we say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding irrespective of the database encoding. It will be up to us to restrict what Unicode encodings we want to support for NCHAR/NVARCHAR columns. I would try implementing this as an extension at first, with a new data type that is internally encoded differently. We have citext as precedent for successfully implementing text-like data types in user space.
Re: [HACKERS] Add more regression tests for dbcommands
On Sun, Jul 7, 2013 at 10:35 AM, Robins Tharakan thara...@gmail.com wrote: On 26 June 2013 01:55, Robins Tharakan thara...@gmail.com wrote: Code coverage improved from 36% to 68%. Updated patch: - Renamed ROLEs as per Robert's feedback (prepend regress_xxx) - Added test to serial_schedule (missed out earlier). Databases - like roles - are global objects, so those should be similarly prefixed with regress. This is very dangerous:
+DROP DATABASE db_db2; -- doesn't exist
+DROP DATABASE IF EXISTS db_db2; -- doesn't exist with IF EXISTS;
+DROP DATABASE template1; -- can't drop a template database
+DROP DATABASE regression; -- can't drop a database in use
I think we should drop the first three of these. The first one risks dropping actual user data in the make installcheck case, as does the second one. We can reduce the risk of death and destruction by choosing a better database name, but I don't think it's really worth it. If DROP DATABASE gets broken, we'll notice that soon enough. I don't think the third one is a good test, either. There's no requirement that the user keep template1 around, although nearly everyone probably does. Please see if you can't also do something to minimize the number of create/drop role cycles this patch performs - like maybe to just one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Mon, Jul 15, 2013 at 10:32 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-07-15 10:06:31 -0400, Robert Haas wrote: Noah discovered an interesting one recently: apparently, the relcache machinery has some logic in it that depends on the use of AccessExclusiveLock in subtle ways. I'm going to attempt to explain it, and maybe he can jump in and explain it better. Essentially, the problem is that when a relcache reload occurs, certain data structures (like the tuple descriptor, but there are others) are compared against the old version of the same data structure. If there are no changes, we do nothing; else, we free the old one and install the new one. The reason why we don't free the old one and install the new one unconditionally is because other parts of the backend might have pointers to the old data structure, so just replacing it all the time would result in crashes. Since DDL requires AccessExclusiveLock + CheckTableNotInUse(), any actual change to the data structure will happen when we haven't got any pointers anyway. Aren't we swapping in the new data on a data level for that reason? See RelationClearRelation(). Look at the keep_tupdesc and keep_rules variables. A second thing I'm concerned about is that, although MVCC catalog scans guarantee that we won't miss a concurrently-updated row entirely, or see a duplicate, they don't guarantee that the rows we see during a scan of catalog A will be consistent with the rows we see during a scan of catalog B moments later. For example, hilarity will ensue if relnatts doesn't match what we see in pg_attribute. Now I don't think this particular example matters very much because I think there are probably lots of other things that would also break if we try to add a column without a full table lock, so we're probably doomed there anyway. But there might be other instances of this problem that are more subtle. Hm. 
Other transactions basically should be protected against this because they can't see uncommitted data anyway, right? ISTM that our own session basically already has to be safe against hilarity of this kind, right? Other transactions are protected only within a single scan. If they do two or more separate scans one after another, some other transaction can commit in the middle of things. Any commits that happen after starting the first scan and before starting the second scan will be reflected in the second scan, but not in the first. That's what I'm concerned about. Our own session can't have a problem of this kind, because we'll never commit a subtransaction (or, heaven forbid, a top-level transaction) while in the middle of a system catalog scan. I am concerned about stuff like violating constraints because we haven't yet seen the new constraint definition and the like... Or generating wrong plans because we haven't seen that somebody has dropped a constraint and inserted data violating the old one. Yes, we need to carefully audit the commands for dependencies of that type. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Add regression tests for ROLE (USER)
On Mon, Jul 15, 2013 at 10:23 AM, Robins Tharakan thara...@gmail.com wrote: It still doesn't address the excessive (syntactical) checks though. I am still unclear as to how to identify which checks to skip. (As in, although I have a personal preference of checking everything, my question probably wasn't clear in my previous email. I was just asking 'how' to know which checks to not perform.) Should a 'syntax error' in the message be considered as an unnecessary check? If so, it's easier for me to identify which checks to skip, while creating future tests. Yes, I think you can just leave out the syntax errors. I simply don't understand how we can be getting any meaningful test coverage out of those cases. I mean, if we want to check every bit of syntax that could lead to a syntax error, we could probably come up with a near-infinite number of test cases:
CREAT TABLE foo (x int);
CREATE TABL foo (x int);
CREATER TABLE foo (x int);
CREATE TABLES foo (x int);
CREATE CREATE TABLE foo (x int);
CREATE TABLE foo [x int);
CREATE TABLE foo (x int];
CREATE TABLE foo [x int];
CREATE TABLE (x int);
CREATE foo (x int);
...and on and on it goes. Once we start speculating that bison doesn't actually produce a grammar that matches the productions written in gram.y, there's no limit to what can go wrong, and no amount of test coverage can be too large. In practice, the chances of catching anything that way seem minute. If bison breaks in such a way that all currently accepted grammatical constructs are still accepted and work, but something that shouldn't have been accepted is, we'll just have to find that out in some way other than our regression tests. I think it's very unlikely that such a thing will happen, and even if it does, I don't really see any reason to suppose that the particular tests you've included here will be the ones that catch the problem. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Add regression tests for ROLE (USER)
I simply don't understand how we can be getting any meaningful test coverage out of those cases. I mean, if we want to check every bit of syntax that could lead to a syntax error, we could probably come up with a near-infinite number of test cases: I think that it would be enough to check for expected keywords/identifier/stuff whether the syntax error reported makes sense. Basically the parser reports the first found inconsistency.
1. CREAT TABLE foo (x int);
2. CREATE TABL foo (x int);
3. CREATER TABLE foo (x int); -- same as 1
4. CREATE TABLES foo (x int); -- same as 2
5. CREATE CREATE TABLE foo (x int); -- hmmm.
6. CREATE TABLE foo [x int);
7. CREATE TABLE foo (x int];
8. CREATE TABLE foo [x int]; -- same as 6 & 7
9. CREATE TABLE (x int);
A. CREATE foo (x int); -- same as 2
This level of testing can be more or less linear in the number of tokens. -- Fabien.
Re: [HACKERS] Listen/notify across clusters
On 10 Jul 2013, at 19:26, Josh Berkus j...@agliodbs.com wrote: Huh? LISTEN/NOTIFY across replication has been a desired feature since we introduced streaming replication. We want it, there's just no obvious way to do it. Your email kinda implies that it's not desirable. Thanks Josh. I was under the impression that it is desirable. It certainly makes sense to be used with the existing database infrastructure. In terms of features, apart from separating LISTEN so that it can actually be used on standbys, wouldn't it be a matter of including the notifications in the WAL stream, as simple packets? This would guarantee the same behaviour as on the master. -- GJ
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
Is there any hope to see it in libpq? If so, can anyone review latest version of my patch? On 10 July 2013 11:49, ivan babrou ibob...@gmail.com wrote: On 9 July 2013 18:43, Andres Freund and...@2ndquadrant.com wrote: On 2013-07-05 21:28:59 +0400, ivan babrou wrote: Hi, guys! I made a quick patch to support floating number in connect_timeout param for libpq. This will treat floating number as seconds so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback? -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index 18fcb0c..58c1a35 100644 --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -1452,7 +1452,7 @@ static int connectDBComplete(PGconn *conn) { PostgresPollingStatusType flag = PGRES_POLLING_WRITING; - time_t finish_time = ((time_t) -1); + struct timeval finish_time; if (conn == NULL || conn-status == CONNECTION_BAD) return 0; @@ -1462,17 +1462,14 @@ connectDBComplete(PGconn *conn) */ if (conn-connect_timeout != NULL) { - int timeout = atoi(conn-connect_timeout); + int timeout_usec = (int) (atof(conn-connect_timeout) * 100); I'd rather not use a plain int for storing usecs. An overflow is rather unlikely, but still. Also, I'd rather use something like USECS_PER_SEC instead of a plain 100 in multiple places. - if (timeout 0) + if (timeout_usec 0) { - /* - * Rounding could cause connection to fail; need at least 2 secs - */ - if (timeout 2) - timeout = 2; - /* calculate the finish time based on start + timeout */ - finish_time = time(NULL) + timeout; + gettimeofday(finish_time, NULL); + finish_time.tv_usec += (int) timeout_usec; Accordingly adjust this. Looks like a sensible thing to me. *Independent* from this patch, you might want look into server-side connection pooling using transaction mode. 
If that's applicable for your application it might reduce latency noticeably. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services I tried to make it safer. I'm still not sure about the constants; I haven't found any good examples in libpq. -- Regards, Ian Babrou http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
Re: [HACKERS] Add tests for LOCK TABLE
On Sun, Jul 7, 2013 at 12:17 PM, Robins Tharakan thara...@gmail.com wrote: Updated the patch: - Updated ROLEs as per Robert's feedback You managed to use a naming convention different from the one that you used before. Before, you had regress_rol_op1; here, you've got regress_lock_rol1. Consistency may be the hobgoblin of little minds, but grep's mind is very little. - Added test to serial_schedule. When you add the test to serial_schedule, you're supposed to add it to the same place that it occupies in the parallel schedule, more or less, not just add it to the bottom of the file. The idea is that the two files should roughly correspond. We should probably automate that, but for now, this is how it is. I have committed this patch with substantial simplifications and a few other tweaks that I thought would improve test coverage. I feel that this patch wasn't prepared as carefully as it could have been. For example, consider this expected output:
+-- Should work. Ensure that LOCK works for inherited tables;
+CREATE ROLE regress_lock_rol3;
+CREATE TABLE lock_tbl3 (a BIGINT);
+CREATE TABLE lock_tbl4 (b BIGINT) INHERITS (lock_tbl3);
+BEGIN TRANSACTION;
+LOCK TABLE lock_tbl4 * IN access EXCLUSIVE MODE;
+SET ROLE regress_lock_rol3;
+SET search_path = lock_schema1;
+LOCK TABLE lock_tbl3 NOWAIT;
+ERROR: relation lock_tbl3 does not exist
+ROLLBACK;
+RESET ROLE;
The comment asserts that this should work, but only part of it works. Also, the failure is evidently intending to test whether regress_lock_rol3 can lock lock_tbl3 despite lack of privileges on that object, but the error message is complaining about something different - namely, that regress_lock_rol3 doesn't even have permissions on the schema. So the thing you meant to test isn't really being tested. The comment block at the top of the file was obviously cut and pasted from somewhere else without being modified to reflect reality. 
I altogether removed the last block - with lock_tbl{7,8,9,10} - because the comment asserts that it should fail yet no statement in that chunk actually fails, and locking on tables with inheritance is tested higher up. I did however modify the inheritance test to use a multi-level inheritance hierarchy, since that does seem worth verifying. I also removed the test that simply checked whether a user without permissions could lock the table; there's already a similar check in privileges.sql. All in all, I'm starting to get a bit skeptical of your test coverage numbers. How are you deriving those, exactly? I agree that the tests in that patch as committed are worthwhile, but they don't seem sufficient to raise the coverage from 57% to 84% of... whatever those numbers are percentages of. Now maybe in simplifying this down I simplified away something essential, but I can't see what. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Add regression tests for ROLE (USER)
On Mon, Jul 15, 2013 at 11:48 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: I simply don't understand how we can be getting any meaningful test coverage out of those cases. I mean, if we want to check every bit of syntax that could lead to a syntax error, we could probably come up with a near-infinite number of test cases: I think that it would be enough to check for expected keywords/identifier/stuff whether the syntax error reported makes sense. Basically the parser reports the first found inconsistency.
1. CREAT TABLE foo (x int);
2. CREATE TABL foo (x int);
3. CREATER TABLE foo (x int); -- same as 1
4. CREATE TABLES foo (x int); -- same as 2
5. CREATE CREATE TABLE foo (x int); -- hmmm.
6. CREATE TABLE foo [x int);
7. CREATE TABLE foo (x int];
8. CREATE TABLE foo [x int]; -- same as 6 & 7
9. CREATE TABLE (x int);
A. CREATE foo (x int); -- same as 2
This level of testing can be more or less linear in the number of tokens. Maybe so, but that's still a huge number of regression tests that in practice won't find anything. But they will take work to maintain. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 15 July 2013 15:06, Robert Haas robertmh...@gmail.com wrote: Generally, the question on the table is: to what extent do MVCC catalog scans make the world safe for concurrent DDL, or put negatively, what hazards remain? On Tom's test I've been unable to find a single problem. Noah discovered an interesting one recently: apparently, the relcache machinery has some logic in it that depends on the use of AccessExclusiveLock in subtle ways. I'm going to attempt to explain it, and maybe he can jump in and explain it better. Essentially, the problem is that when a relcache reload occurs, certain data structures (like the tuple descriptor, but there are others) are compared against the old version of the same data structure. If there are no changes, we do nothing; else, we free the old one and install the new one. The reason why we don't free the old one and install the new one unconditionally is because other parts of the backend might have pointers to the old data structure, so just replacing it all the time would result in crashes. Since DDL requires AccessExclusiveLock + CheckTableNotInUse(), any actual change to the data structure will happen when we haven't got any pointers anyway. A second thing I'm concerned about is that, although MVCC catalog scans guarantee that we won't miss a concurrently-updated row entirely, or see a duplicate, they don't guarantee that the rows we see during a scan of catalog A will be consistent with the rows we see during a scan of catalog B moments later. For example, hilarity will ensue if relnatts doesn't match what we see in pg_attribute. Now I don't think this particular example matters very much because I think there are probably lots of other things that would also break if we try to add a column without a full table lock, so we're probably doomed there anyway. But there might be other instances of this problem that are more subtle. 
If you look at this as a generalised problem you probably can find some issues, but that doesn't mean they apply in the specific cases we're addressing. The lock reductions we are discussing in all cases have nothing at all to do with structure and only relate to various options. Except in the case of constraints, though even there I see no issues as yet. I've no problem waiting awhile to apply while others try to find loopholes. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)
To clarify what state this is all in: Fabien's latest pgbench-throttle-v15.patch is the ready for a committer version. The last two revisions are just tweaking the comments at this point, and his version is more correct than my last one. My little pgbench-delay-finish-v1.patch is a brand new bug fix of sorts that, while trivial, isn't ready for commit yet. I'll start a separate e-mail thread and CF entry for that later. Fabien has jumped into initial review comments of that already below, but the throttle feature isn't dependent on that. The finish delay just proves that the latency spikes I was getting here aren't directly tied to the throttle feature. On 7/14/13 5:42 PM, Fabien COELHO wrote: * ISTM that the impact of the chosen 1000 should appear somewhere. I don't have a problem with that, but I didn't see that the little table you included was enough to do that. I think if someone knows how this type of random generation works, they don't need the comment to analyze the impact. And if they don't know, that comment alone wasn't enough to help them figure it out. That's why I added some terms that might help point the right way for someone who wanted to search for more information instead. The text of pgbench is not really the right place to put a lecture about how to generate numbers with a target probability distribution function. Normally the code comments tries to recommend references for this sort of thing instead. I didn't find a really good one in a quick search though. About 123456 12345 vs 123456.012345: My data parser is usually gnuplot or my eyes, and in both cases the later option is better:-) pgbench-tools uses gnuplot too. If I were doing this again today from scratch, I would recommend using the epoch time format compatible with it you suggested. I need to look into this whole topic a little more before we get into that though. This patch just wasn't the right place to get into that change. 
About the little patch: Parameter ok should be renamed to something meaningful (say do_finish?). It's saying if the connection finished ok or not. I think exactly what is done with that information is an implementation detail that doesn't need to be exposed to the function interface. We might change how this is tied to PQfinish later. Also, it seems that when timer is exceeded in doCustom it is called with true, but maybe you intended that it should be called with false?? The way timeouts are handled right now is a known messy thing. Exactly what you should do with a client that has hit one isn't obvious. Try again? Close the connection and abort? The code has a way it handles that now, and I didn't want to change it any. it is important to remove the connection because it serves as a marker to know whether a client must run or not. This little hack moved around how clients finished enough to get rid of the weird issue with your patch I was bothered by. You may be right that the change isn't really correct because of how the connection is compared to null as a way to see if it's active. I initially added a more complicated finished state to the whole mess that tracked this more carefully. I may need to return to that idea now. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
[HACKERS] Re: Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Thu, 2013-07-11 at 10:51 -0400, Nicholas White wrote: I've attached a revised version that fixes the issues above: I'll get to this soon, sorry for the delay. Regards, Jeff Davis
Re: [HACKERS] Add regression tests for SET xxx
On Sun, Jul 7, 2013 at 1:15 PM, Robins Tharakan thara...@gmail.com wrote: Updated the patch: - Updated ROLE names as per Robert's feedback (regress_xxx) - Added test to serial_schedule Please see my comments on the LOCK tests related to these points and update accordingly.
+BEGIN TRANSACTION;
+INVALID_COMMAND;
+ERROR: syntax error at or near INVALID_COMMAND
+LINE 1: INVALID_COMMAND;
+^
+SET SESSION AUTHORIZATION regress_role_var1;
+ERROR: current transaction is aborted, commands ignored until end of transaction block
We have a test for this error message in the transactions test, so I don't think we need another one here. This error is thrown at a very high level in e.g. exec_simple_query. The only dependence on the particular command is in IsTransactionExitStmt() - IOW, there's nothing special about SET, and if we're going to test this for SET ROLE and SET SESSION AUTHORIZATION, then why not for every single command we have?
+SET DATESTYLE = ISO, SQL;
+ERROR: invalid value for parameter DateStyle: iso, sql
+DETAIL: Conflicting datestyle specifications.
+SET DATESTYLE = SQL, ISO;
+ERROR: invalid value for parameter DateStyle: sql, iso
+DETAIL: Conflicting datestyle specifications.
+SET DATESTYLE = ISO, POSTGRES;
+ERROR: invalid value for parameter DateStyle: iso, postgres
+DETAIL: Conflicting datestyle specifications.
+SET DATESTYLE = POSTGRES, ISO;
...et cetera, ad nauseam. While there's something to be said for completeness, the chances that future maintainers of this code will update these regression tests with all the new permutations every time new options are added seem low. I recommend testing a representative sample of three of these, or so, and skipping the rest.
+SET TIME ZONE INTERVAL '1 month +05:30' HOUR TO MINUTE;
+ERROR: invalid value for parameter TimeZone: INTERVAL '@ 1 mon 5 hours 30 mins'
Wow, what a terrible error message. This is not your patch's fault, but maybe we should look at improving that? 
I am inclined to think that these tests should be split up and moved into the files where similar things are being tested already. The SET TIME ZONE tests probably belong in horology.sql, the SET DATESTYLE tests with the related tests in interval.sql, and much of the other stuff in transactions.sql. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XLogInsert scaling, revisited
On Mon, Jul 8, 2013 at 6:16 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Ok, I've committed this patch now. Finally, phew! I found that this patch causes an assertion failure. When I set up a simple replication environment and promoted the standby before executing any transaction on the master, I got the following assertion failure.

2013-07-16 02:22:06 JST sby1 LOG: received promote request
2013-07-16 02:22:06 JST sby1 FATAL: terminating walreceiver process due to administrator command
2013-07-16 02:22:06 JST sby1 LOG: redo done at 0/2F0
2013-07-16 02:22:06 JST sby1 LOG: selected new timeline ID: 2
hrk:head-pgsql postgres$ 2013-07-16 02:22:06 JST sby1 LOG: archive recovery complete
TRAP: FailedAssertion(!(readOff == (XLogCtl->xlblocks[firstIdx] - 8192) % ((uint32) (16 * 1024 * 1024))), File: "xlog.c", Line: 7048)
2013-07-16 02:22:12 JST sby1 LOG: startup process (PID 37115) was terminated by signal 6: Abort trap
2013-07-16 02:22:12 JST sby1 LOG: terminating any other active server processes

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext
On Fri, Jul 12, 2013 at 5:42 AM, Andres Freund and...@2ndquadrant.com wrote: I'd like to add an Assert like in the attached patch making sure we're in a transaction. Otherwise it's far too easy not to hit an error during development because everything is cached - and syscache usage isn't always obvious from the outside to the naive or the tired. Seems like a good idea to me. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Eliminating PD_ALL_VISIBLE, take 2
On Sun, 2013-07-14 at 23:06 -0400, Robert Haas wrote: Of course, there's a reason that PD_ALL_VISIBLE is not like a normal hint: we need to make sure that inserts/updates/deletes clear the VM bit. But my patch already addresses that by keeping the VM page pinned. I'm of the opinion that we ought to extract the parts of the patch that hold the VM pin for longer, review those separately, and if they're good and desirable, apply them. I'm confused. My patch holds a VM page pinned for those cases where PD_ALL_VISIBLE is currently used -- scans or insert/update/delete. If we have PD_ALL_VISIBLE, there's no point in the cache, right? I am not convinced. I thought about the problem of repeatedly switching pinned VM pages during the index-only scans work, and decided that we could live with it because, if the table was large enough that we were pinning VM pages frequently, we were also avoiding I/O. Of course, this is a logical fallacy, since the table could easily be large enough to have quite a few VM pages and yet small enough to fit in RAM. And, indeed, at least in the early days, an index scan could beat out an index-only scan by a significant margin on a memory-resident table, precisely because of the added cost of the VM lookups. I haven't benchmarked lately so I don't know for sure whether that's still the case, but I bet it is. To check visibility from an index scan, you either need to pin a heap page or a VM page. Why would checking the heap page be cheaper? Is it just other code in the VM-testing path that's slower? Or is there concurrency involved in your measurements which may indicate contention over VM pages? I think this idea is worth exploring, although I fear the overhead is likely to be rather large. We could find out, though. Suppose we simply change XLOG_HEAP2_VISIBLE to emit FPIs for the heap pages; how much does that slow down vacuuming a large table into which many pages have been bulk loaded? 
Sadly, I bet it's rather a lot, but I'd like to be wrong. My point was that, if freezing needs to emit an FPI anyway, and we combine freezing and PD_ALL_VISIBLE, then using WAL properly wouldn't cost us anything. Whether that makes sense depends on what other combination of proposals makes it in, of course. I agree that we don't want to start adding FPIs unnecessarily. Anyway, thanks for the feedback. Moved out of this 'fest. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Mon, Jul 15, 2013 at 10:43 PM, Amit Kapila amit.kap...@huawei.com wrote: On Friday, July 12, 2013 6:46 PM Amit kapila wrote: On Friday, July 12, 2013 12:02 AM Fujii Masao wrote: On Fri, Jul 5, 2013 at 2:19 PM, Amit Kapila amit.kap...@huawei.com wrote: On Wednesday, July 03, 2013 2:58 AM Alvaro Herrera wrote: Amit Kapila escribió: I got the following compile warnings.

guc.c:5187: warning: no previous prototype for 'validate_conf_option'
preproc.y:7746.2-31: warning: type clash on default action: <str> != <>

In gram.y a semicolon (';') was missing, due to which proper preproc.y was not being generated. I generally use Windows as my dev environment, and it hadn't shown these warnings. I shall check on Linux and correct the same. Fixed. The make installcheck failed when I ran it against the server with wal_level = hot_standby. The regression.diff is

*** 27,35 ****
  (1 row)
  show wal_level;
! wal_level
! -----------
! minimal
  (1 row)
  show authentication_timeout;
--- 27,35 ----
  (1 row)
  show wal_level;
! wal_level
! -------------
! hot_standby
  (1 row)
  show authentication_timeout;

The tests in alter_system.sql are dependent on default values of postgresql.conf, which is okay for make check but not for installcheck. I shall remove that dependency from the test cases. Removed all tests for which values were dependent on postgresql.conf:
a. removed check of values before reload
b. removed parameters which can only be set after server restart
c. removed check for values after setting to default
The regression test of ALTER SYSTEM calls pg_sleep(1) six times. Those who dislike the regression test patches which were proposed in this CF might dislike this repeat of pg_sleep(1) because it would increase the time of the regression test. The sleep is used to ensure the effects of pg_reload_conf() are visible. To avoid increasing the time of the regression tests, we can do one of the below: 1) decrease the time for sleep, but I am not sure how to safely decide how much to sleep. 2) combine the tests such that only 1 or 2 sleep calls are used. 
what's your opinion? Modified to use 2 sleep calls.

+ /* skip auto conf temporary file */
+ if (strncmp(de->d_name,
+             PG_AUTOCONF_FILENAME,
+             sizeof(PG_AUTOCONF_FILENAME)) == 0)
+     continue;

Why do we need to exclude the auto conf file from the backup? I think that it should be included in the backup as well as normal postgresql.conf. The original intention was to skip the autoconf temporary file which is created during AlterSystemSetConfigFile() for crash safety. I will change to exclude the temp autoconf file. I had modified the check to exclude the postgresql.auto.conf.temp file. I have used a hardcoded .temp instead of a macro, as I don't find any other use of it in the code.

+ autofile = fopen(path, PG_BINARY_W);
+ if (autofile == NULL)
+ {
+     fprintf(stderr, _("%s: could not open file \"%s\" for writing: %s\n"),
+             progname, path, strerror(errno));
+     exit_nicely();
+ }
+
+ if (fputs("# Do not edit this file manually! It is overwritten by the ALTER SYSTEM command \n",
+           autofile) < 0)
+ {
+     fprintf(stderr, _("%s: could not write file \"%s\": %s\n"),
+             progname, path, strerror(errno));
+     exit_nicely();
+ }
+
+ if (fclose(autofile))
+ {
+     fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
+             progname, path, strerror(errno));
+     exit_nicely();
+ }

You can simplify the code by using writefile(). Yes, it is better to use writefile(). I shall update the patch for the same. Modified to use writefile(). Thanks for updating the patch! In the patch, ALTER SYSTEM SET validates the postgresql.conf. I think this is overkill. While ALTER SYSTEM SET is being executed, a user might be modifying the postgresql.conf. That validation breaks this use case.

+# This includes the default configuration directory included to support
+# ALTER SYSTEM statement.
+#
+# WARNING: User should not remove below include_dir or directory config,
+# as both are required to make ALTER SYSTEM command work.
+# Any configuration parameter values specified after this line
+# will override the values set by ALTER SYSTEM statement. 
+#include_dir = 'config'

Why do we need to expose this setting to a user? As the warning says, if a user *should not* remove this setting, I think we should not expose it from the beginning and should always treat the postgresql.auto.conf as included
Re: [HACKERS] tab-completion for \lo_import
On Wed, Jul 10, 2013 at 11:28 AM, Josh Kupershmidt schmi...@gmail.com wrote: Is there any reason not to tab-complete the local filename used by the \lo_import command? Small patch to do so attached. Looks good to me. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote: On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote: On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote: Overall I think this patch offers useful additional functionality, in compliance with the SQL spec, which should be handy to simplify complex grouping queries. As I understand this feature, it is syntactic sugar for the typical case of an aggregate with a strict transition function. For example, min(x) FILTER (WHERE y > 0) is rigorously equivalent to min(CASE WHEN y > 0 THEN x END). Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard queries it is *only* syntactic sugar. In PostgreSQL, it lets you do novel things with, for example, array_agg(). Is that accurate? I think this is ready for committer. The patch was thorough. I updated applicable comments, revisited some cosmetic choices, and made these functional changes: 1. The pg_stat_statements query jumble should incorporate the filter. 2. The patch did not update costing. I made it add the cost of the filter expression the same way we add the cost of the argument expressions. This makes min(x) FILTER (WHERE y > 0) match min(CASE WHEN y > 0 THEN x END) in terms of cost, which is a fair start. At some point, we could do better by reducing the argument cost by the filter selectivity. A few choices/standard interpretations may deserve discussion. The patch makes filter clauses contribute to the subquery level chosen to be the aggregation query. This is observable through the behavior of these two standard-conforming queries:

select (select count(outer_c) from (values (1)) t0(inner_c)) from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
select (select count(outer_c) filter (where inner_c > 0) from (values (1)) t0(inner_c)) from (values (2),(3)) t1(outer_c); -- inner query is aggregation query

I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this. 
Since that still isn't crystal-clear to me, I mention it in case anyone has a different reading. Distinct from that, albeit in a similar vein, SQL does not permit outer references in a filter clause. This patch permits them; I think that qualifies as a reasonable PostgreSQL extension.

--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -3200,7 +3200,7 @@
 </row>
 <row>
  <entry><token>OVER</token></entry>
- <entry>reserved (can be function or type)</entry>
+ <entry>non-reserved</entry>

I committed this one-line correction separately.

--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context)
 ListCell *l;
 Assert(aggref->agglevelsup == 0);
- if (list_length(aggref->args) != 1 || aggref->aggorder != NIL)
+ if (list_length(aggref->args) != 1 || aggref->aggorder != NIL || aggref->agg_filter != NULL)
  return true; /* it couldn't be MIN/MAX */
 /* note: we do not care if DISTINCT is mentioned ... */

I twitched upon reading this, because neither ORDER BY nor FILTER preclude the aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder there back in 2009. All I can figure is that writing max(c ORDER BY x) is so unlikely that we'd too often waste the next syscache lookup. But the same argument would apply to DISTINCT. With FILTER, the rationale is entirely different. The aggregate could well be MIN/MAX; we just haven't implemented the necessary support elsewhere in this file. See attached patch revisions. The first patch edits find_minmax_aggs_walker() per my comments just now. The second is an update of your FILTER patch with the changes to which I alluded above; it applies atop the first patch. Would you verify that I didn't ruin anything? Barring problems, will commit. Are you the sole named author of this patch? That's what the CF page says, but that wasn't fully clear to me from the list discussions. Thanks, nm Tested your changes. They pass regression, etc. 
:) Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
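Noah's equivalence claim above - that FILTER is pure sugar when the transition function is strict - can be sketched in C. This is a hypothetical model (the `Datum` struct, `min_trans`, and both driver functions are illustrative assumptions, not PostgreSQL source): a strict transition function skips NULL inputs, so feeding it `CASE WHEN y > 0 THEN x END` (NULL whenever the filter would reject the row) produces the same state as applying the filter before the aggregate.

```c
#include <assert.h>
#include <stddef.h>

/* toy nullable value; not PostgreSQL's Datum */
typedef struct { int value; int isnull; } Datum;

/* strict MIN transition function: a NULL input is simply skipped */
static Datum min_trans(Datum state, Datum input)
{
    if (input.isnull)
        return state;           /* strictness: ignore NULL inputs */
    if (state.isnull || input.value < state.value)
        return input;
    return state;
}

/* models min(x) FILTER (WHERE y > 0): filter rows before aggregating */
static Datum min_filtered(const int *x, const int *y, size_t n)
{
    Datum state = {0, 1};       /* initial state is NULL */
    for (size_t i = 0; i < n; i++)
        if (y[i] > 0)
            state = min_trans(state, (Datum){x[i], 0});
    return state;
}

/* models min(CASE WHEN y > 0 THEN x END): rejected rows become NULL */
static Datum min_case(const int *x, const int *y, size_t n)
{
    Datum state = {0, 1};
    for (size_t i = 0; i < n; i++)
    {
        Datum input = (y[i] > 0) ? (Datum){x[i], 0} : (Datum){0, 1};
        state = min_trans(state, input);
    }
    return state;
}
```

The two drivers always agree because `min_trans` is strict; a non-strict aggregate like array_agg(), which does accumulate NULL inputs, would distinguish them - which is exactly the "novel things" Noah mentions.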
[HACKERS] Re: Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
np, optimising for quality not speed :) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Noah Misch said: I twitched upon reading this, because neither ORDER BY nor FILTER preclude the aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder there back in 2009. The bottom line is that I intentionally avoided assuming that an agg with an aggsortop could only be min() or max() and that having an order by clause would always be harmless in such cases. I can't think of an actual use case where it would matter, but I've seen people define some pretty strange aggs recently. So I mildly object to simply throwing away the ORDER BY clause in such cases. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)
* ISTM that the impact of the chosen 1000 should appear somewhere. I don't have a problem with that, but I didn't see that the little table you included was enough to do that. I think if someone knows how this type of random generation works, they don't need the comment to analyze the impact. And if they don't know, that comment alone wasn't enough to help them figure it out. That's why I added some terms that might help point the right way for someone who wanted to search for more information instead. Sure. I agree that comments are not the right place for a lecture about Poisson stochastic processes. Only the 1000 parameter has an impact on the maximum delay that can be incurred with respect to the target average delay, and I think that this information is relevant for a comment. to generate numbers with a target probability distribution function. Normally the code comments try to recommend references for this sort of thing instead. I didn't find a really good one in a quick search though. Yep. Maybe http://en.wikipedia.org/wiki/Exponential_distribution. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] checking variadic any argument in parser - should be array
On 07/14/2013 12:28 AM, Pavel Stehule wrote: Hello 2013/7/14 Andrew Dunstan and...@dunslane.net: On 06/29/2013 03:29 PM, Pavel Stehule wrote: 5. This patch has user visibility, i.e. now we are throwing an error when the user only says VARIADIC NULL like: select concat(variadic NULL) is NULL; Previously it was working but now we are throwing an error. Well, we are now stricter than before with using VARIADIC + ANY, so I have no issue as such. But I guess we need to document this user-visible change. I don't know exactly where though. I searched for VARIADIC and all related documentation says it needs an array, so nothing harmful as such, so you can ignore this review comment but I thought it worth mentioning. yes, it is a point for possible issues in RELEASE NOTES, I am thinking Well, the writer of the release notes should be aware of this. And I hope he will be. So no issue. Is the behaviour change really unavoidable? Is it really what we want? Nobody seems to have picked up on this except the author and the reviewer. I'd hate us to do this and then surprise people. I'm not sure how many people are using VARIADIC any, but I have started doing so and expect to do so more, and I suspect I'm not alone. It doesn't disallow NULL - it disallows non-array types in this position, because only array-type values may appear there. Any other possible usage would have undefined behaviour. so SELECT varfx(VARIADIC NULL) -- is disallowed but SELECT varfx(VARIADIC NULL::text[]) -- is allowed Quite so, I understand exactly what the defined behaviour will be. for example, I can write SELECT varfx(10,20,30), but I cannot write SELECT varfx(VARIADIC 10,20,30) - because this behaviour would be undefined. Can you send me your use case where this check is unwanted, please? The only question I raised was for the NULL case. If you're not saying VARIADIC NULL then I have no issue. 
Anyway, nobody else seems to care much (and I suspect very few people are writing VARIADIC any functions anyway, apart from you and me). So I'll see about getting this committed shortly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] overflow checks optimized away
Xi Wang escribió: Intel's icc and PathScale's pathcc compilers optimize away several overflow checks, since they consider signed integer overflow as undefined behavior. This leads to a vulnerable binary. This thread died without reaching a conclusion. Noah Misch, Robert Haas and Greg Stark each gave a +1 to the patches, but Tom Lane gave them a -inf; so they weren't applied. However, I think everyone walked away with the feeling that Tom is wrong on this. Meanwhile Xi Wang and team published a paper: http://pdos.csail.mit.edu/~xi/papers/stack-sosp13.pdf Postgres is mentioned a number of times in this paper -- mainly to talk about the bugs we leave unfixed. It might prove useful to have these guys' STACK checker output available continuously, so that if we happen to introduce more bugs in the future, it alerts us about that. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
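The class of bug being discussed can be shown with a generic illustration (these are hypothetical helper functions, not a specific Postgres hunk): a post-hoc check like `sum < a` relies on signed wraparound, which is undefined behavior, so a compiler assuming overflow never happens may delete the check; the portable form decides from the operands alone, before any addition occurs.

```c
#include <limits.h>

/* BROKEN: computing a + b may already be UB, so the compiler is
 * entitled to assume it did not wrap and remove the test below. */
static int add_check_broken(int a, int b, int *result)
{
    int sum = a + b;
    if ((b > 0 && sum < a) || (b < 0 && sum > a))
        return 0;               /* "overflow" -- check may be optimized away */
    *result = sum;
    return 1;
}

/* SAFE: detect overflow from the bounds before adding, so no UB occurs */
static int add_check_safe(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return 0;               /* addition would overflow */
    *result = a + b;
    return 1;
}
```

At -O0 both forms appear to work, which is what makes the broken pattern so easy to ship; only a UB-aware compiler (or a checker like STACK) exposes the difference.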
[HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
On Mon, Jul 15, 2013 at 06:56:17PM +, Andrew Gierth wrote: Noah Misch said: I twitched upon reading this, because neither ORDER BY nor FILTER preclude the aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder there back in 2009. The bottom line is that I intentionally avoided assuming that an agg with an aggsortop could only be min() or max() and that having an order by clause would always be harmless in such cases. I can't think of an actual use case where it would matter, but I've seen people define some pretty strange aggs recently. So I mildly object to simply throwing away the ORDER BY clause in such cases. I can't think of another use for aggsortop as defined today. However, on further reflection, min(x ORDER BY y) is not identical to min(x) when the B-tree operator class of the aggsortop can find non-identical datums to be equal. This affects at least min(numeric) and min(interval). min(x) chooses an unspecified x among those equal to the smallest x, while min(x ORDER BY y) can be used to narrow the choice. I will update the comments along those lines and not change semantics after all. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
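The distinction Noah draws - that min(x ORDER BY y) can narrow the choice among values the comparator considers equal but which are not identical, like numeric 1.0 vs 1.00 - can be sketched in C. This is a hypothetical model (numeric values as strings, `numeric_cmp` via strtod; none of this is Postgres code): plain min keeps an unspecified member of the equal class (here, whichever it saw first), while the ORDER BY variant breaks ties with the secondary key.

```c
#include <stdlib.h>
#include <stddef.h>

/* numeric-style comparison: "1.0" and "1.00" compare equal */
static int numeric_cmp(const char *a, const char *b)
{
    double da = strtod(a, NULL), db = strtod(b, NULL);
    return (da > db) - (da < db);
}

/* min(x): among equal values, returns whichever the scan happens to
 * keep -- here the first one encountered */
static const char *min_plain(const char **x, size_t n)
{
    const char *best = x[0];
    for (size_t i = 1; i < n; i++)
        if (numeric_cmp(x[i], best) < 0)
            best = x[i];
    return best;
}

/* min(x ORDER BY y): ties on x are resolved by the smallest y */
static const char *min_order_by(const char **x, const int *y, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
    {
        int c = numeric_cmp(x[i], x[best]);
        if (c < 0 || (c == 0 && y[i] < y[best]))
            best = i;
    }
    return x[best];
}
```

With input {"1.00", "1.0", "2.5"}, both functions agree the minimum is numerically 1, but only the ORDER BY form lets the caller control which representation of it comes back.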
Re: [HACKERS] [RFC] overflow checks optimized away
On Mon, Jul 15, 2013 at 5:59 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Xi Wang escribió: Intel's icc and PathScale's pathcc compilers optimize away several overflow checks, since they consider signed integer overflow as undefined behavior. This leads to a vulnerable binary. This thread died without reaching a conclusion. Noah Misch, Robert Haas and Greg Stark each gave a +1 to the patches, but Tom Lane gave them a -inf; so they weren't applied. However, I think everyone walked away with the feeling that Tom is wrong on this. Meanwhile Xi Wang and team published a paper: http://pdos.csail.mit.edu/~xi/papers/stack-sosp13.pdf Postgres is mentioned a number of times in this paper -- mainly to talk about the bugs we leave unfixed. It might prove useful to have usable these guys' STACK checker output available continuously, so that if we happen to introduce more bugs in the future, it alerts us about that. Yeah, I think we ought to apply those patches. I don't suppose you have links handy? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] overflow checks optimized away
Robert Haas escribió: On Mon, Jul 15, 2013 at 5:59 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Xi Wang escribió: Intel's icc and PathScale's pathcc compilers optimize away several overflow checks, since they consider signed integer overflow as undefined behavior. This leads to a vulnerable binary. This thread died without reaching a conclusion. Noah Misch, Robert Haas and Greg Stark each gave a +1 to the patches, but Tom Lane gave them a -inf; so they weren't applied. However, I think everyone walked away with the feeling that Tom is wrong on this. Meanwhile Xi Wang and team published a paper: http://pdos.csail.mit.edu/~xi/papers/stack-sosp13.pdf Postgres is mentioned a number of times in this paper -- mainly to talk about the bugs we leave unfixed. It might prove useful to have usable these guys' STACK checker output available continuously, so that if we happen to introduce more bugs in the future, it alerts us about that. Yeah, I think we ought to apply those patches. I don't suppose you have links handy? Sure, see this thread in the archives: first one is at http://www.postgresql.org/message-id/510100aa.4050...@gmail.com and you can see the others in the dropdown (though since the subjects are not shown, only date and author, it's a bit hard to follow. The flat URL is useful.) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add regression tests for DISCARD
On Sat, Jul 6, 2013 at 11:49 PM, Robins Tharakan thara...@gmail.com wrote: Thanks Fabrizio. Although parallel_schedule was a miss for this specific patch, however, I guess I seem to have missed out serial_schedule completely (in all patches) and then thanks for pointing this out. Subsequently Robert too noticed the miss at the serial_schedule end. Please find attached a patch, updated towards serial_schedule / parallel_schedule as well as the role name change as per Robert's feedback on CREATE OPERATOR thread. Aside from some reorganization, this patch just checks four new things: - That DISCARD TEMPORARY works like DISCARD TEMP. - That DISCARD PLANS does not throw an error. - That DISCARD ALL fails from within a transaction. - That DISCARD invalid_option fails. The last of these fails with a parse error and therefore, per discussion on the other thread, is not wanted. I'd be more inclined to include the remaining three tests in the existing test file rather than reorganize things into a new file. Reorganizing code makes back-patching harder, and we do back-patch changes that update the regression tests, and I don't think three new tests are are enough justification to add a whole new file. Possibly the test that DISCARD ALL fails from within a transaction ought to be part of a more general category of tests for PreventTransactionChain(). I notice that we currently have NO regression tests that trip that function; we could consider testing some of the others as well. But it's a bit tricky because the CREATE/DROP DATABASE/TABLESPACE commands manipulate global objects - which is a bit hairy - and COMMIT/ROLLBACK PREPARED require special test setup. However, I think we could test CREATE INDEX CONCURRENTLY, DROP INDEX CONCURRENTLY and CLUSTER in addition to DISCARD ALL. Or at least some of those. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: template-ify (binary) extensions
On Mon, Jul 15, 2013 at 9:27 AM, Markus Wanner mar...@bluegap.ch wrote: To clarify: In the above agreement, I had the Postgres level ordinary users in mind, who certainly shouldn't ever be able to upload and run arbitrary native code. I'm not quite as enthusiastic if you meant the system level user that happens to have superuser rights on Postgres. As Andres pointed out, there are some more holes to plug, if you want to lock down a superuser account. [1] I'm not quite clear what kind of user you meant in your statement above. I'm talking about people who are accessing the database via SQL (with or without superuser privileges) rather than people who are accessing the database via the file system (regardless of which OS user they authenticate as). As things stand today, the only way to get machine code to run inside the Postgres binary is to have local filesystem access. Andres points out that you can install adminpack to obtain local filesystem access, and that is true. But the system administrator can also refuse to allow adminpack, and/or use selinux or other mechanisms to prevent the postgres binary from writing a file with execute permissions. If, however, we change things so that Postgres can grab bits out of a system catalog and map them, with execute permissions, into its own memory space, and then jump to that address, we've effectively installed a backdoor around those types of OS-level security measures. And we've made it quite a lot easier for a bad guy to install a rootkit within postgres. Things aren't quite so bad if we write the bits to a file first and then dynamically load the file. That way at least noexec or similar can provide protection. But it still seems like a pretty dangerous direction. I'm inclined to think any such thing would have to be an extension that administrators could enable at their own risk, rather than something that we enabled by default. 
Insufficient privilege separation is one of the things that has made security on Microsoft Windows such a huge problem over the last several decades; I don't want us to emulate that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] new row-level lock error messages
In 9af4159f in combination with cb9b66d3 a bunch of error messages were changed from something like SELECT FOR UPDATE/SHARE is not allowed with UNION/INTERSECT/EXCEPT to row-level locks are not allowed with UNION/INTERSECT/EXCEPT because the intermediate state of SELECT FOR UPDATE/SHARE/KEY UPDATE/KEY SHARE is not allowed with UNION/INTERSECT/EXCEPT was presumably considered too bulky. I think that went too far in some cases. For example, the new message row-level locks must specify unqualified relation names has little to do with its original meaning. In general, I find these new wordings to be a loss of clarity. There is no indication on the SELECT man page or in the documentation index what a row-level lock is at all. I would suggest that these changes be undone, except that the old SELECT FOR ... be replaced by a dynamic string that reverse-parses the LockingClause to provide the actual clause that was used. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Monday, July 15, 2013 11:51 PM Fujii Masao wrote: On Mon, Jul 15, 2013 at 10:43 PM, Amit Kapila amit.kap...@huawei.com wrote: On Friday, July 12, 2013 6:46 PM Amit kapila wrote: On Friday, July 12, 2013 12:02 AM Fujii Masao wrote: On Fri, Jul 5, 2013 at 2:19 PM, Amit Kapila amit.kap...@huawei.com wrote: On Wednesday, July 03, 2013 2:58 AM Alvaro Herrera wrote: Amit Kapila escribió: Thanks for updating the patch! In the patch, ALTER SYSTEM SET validates the postgresql.conf. I think this is overkill. I assume you are talking about the below code; if I am wrong, please point me to the exact code you are referring to:

/*
 * Verify if include_dir for postgresql.auto.conf file exists in
 * postgresql.conf. If it doesn't exist then warn the user that parameters
 * changed with this command will not be effective.
 */
autoconf_errcause = AUTOCONF_DIR_NOT_PARSED;
if (!ParseConfigFile(ConfigFileName, NULL, false, 0, LOG, &head, &tail))
    ereport(ERROR,
            (errmsg("configuration file \"%s\" contains errors; parameters changed by this command will not be effective",
                    ConfigFileName)));

/*
 * ignore the list of options as we are interested to just ensure the
 * existence of include directive for config folder.
 */
head = tail = NULL;
if (autoconf_errcause == AUTOCONF_DIR_NOT_PARSED)
    ereport(WARNING,
            (errmsg("configuration parameters changed by this command will not be effective"),
             errdetail("default include directive for \"%s\" folder is not present in postgresql.conf",
                       PG_AUTOCONF_DIR)));

This code is to parse postgresql.conf to check if the include directive for the config directory is present. While ALTER SYSTEM SET is being executed, a user might be modifying the postgresql.conf. That validation breaks this use case.

+# This includes the default configuration directory included to support
+# ALTER SYSTEM statement.
+#
+# WARNING: User should not remove below include_dir or directory config,
+# as both are required to make ALTER SYSTEM command work. 
    +# Any configuration parameter values specified after this line
    +# will override the values set by the ALTER SYSTEM statement.
    +#include_dir = 'config'

Why do we need to expose this setting to a user? a) This can be a knob to turn this feature off. A few people have asked for this; one of the mail links is mentioned below (refer towards the end of the mail in the link): http://www.postgresql.org/message-id/515b04f9.30...@gmx.net b) In case a user wants to change the priority of parameters set by ALTER SYSTEM, he can move the include_dir up or down in postgresql.conf. As the warning says, if a user *should not* remove this setting, I think we should not expose it in the first place and should always treat postgresql.auto.conf as an included configuration file even if that setting is not in postgresql.conf. I think it will be useful to keep include_dir for some users. If you think that the above use for keeping include_dir is not worthwhile, or that it can be provided in some other way, then we can change this behavior and remove the parsing of postgresql.conf from the AlterSystemSetConfigFile() function. Thank you very much for reviewing this patch so carefully and giving valuable feedback. With Regards, Amit Kapila.
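To make the ordering argument in (b) concrete, here is a minimal sketch of how the position of the include_dir line would determine precedence. The file layout and parameter values are hypothetical, not taken from the actual patch:

```
# postgresql.conf (sketch; later settings win over earlier ones)
shared_buffers = '128MB'        # plain hand-edited setting

#include_dir = 'config'         # would pull in config/postgresql.auto.conf,
                                # where ALTER SYSTEM writes its values

work_mem = '8MB'                # placed after include_dir, so this overrides
                                # any work_mem written by ALTER SYSTEM
```

Moving the include_dir line below or above other settings is exactly what would raise or lower the priority of ALTER SYSTEM-written parameters, which is the flexibility the exposed setting is meant to provide.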
Re: [HACKERS] Fix pgstattuple/pgstatindex to use regclass-type as the argument
Hello Satoshi, I assigned myself as the reviewer of this patch. The issue status is waiting on author. Looking at the discussion in the thread, it seems we are waiting for a suggestion for the new function name, right? I am wondering why we actually need a new name. Can't we just overload the same function and provide two versions of it? In the last thread Fujii did just the same for pg_relpages, and it seems like a workable approach, doesn't it? Am I missing anything here? Regards, On Thu, Jul 4, 2013 at 12:28 AM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Jun 26, 2013 at 12:39 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jun 20, 2013 at 2:32 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Jun 20, 2013 at 11:43 AM, Satoshi Nagayasu sn...@uptime.jp wrote: (2013/06/17 4:02), Fujii Masao wrote: On Sat, Mar 9, 2013 at 3:23 PM, Satoshi Nagayasu sn...@uptime.jp wrote: It is obviously easy to keep two types of function interfaces, one with regclass type and another with text type, in the next release for backward compatibility, like below: pgstattuple(regclass) -- safer interface; pgstattuple(text) -- will be deprecated in a future release. So you're thinking to remove pgstattuple(oid) soon? AFAIK, a regclass-type argument accepts an OID value, which means the regclass type is upward compatible with the oid type. So, even if the declaration is changed, compatibility can actually be kept. This test case (in sql/pgstattuple.sql) confirms that.

    select * from pgstatindex('myschema.test_pkey'::regclass::oid);
     version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
    ---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
           2 |          0 |       8192 |             1 |              0 |          1 |           0 |             0 |             0.79 |                  0
    (1 row)

Having both interfaces for a while would allow users enough time to rewrite their applications.
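The upward compatibility being described follows from regclass accepting several input forms. A quick sketch, reusing the index name from the test case above:

```sql
-- All of these resolve to the same relation:
SELECT 'myschema.test_pkey'::regclass;              -- from a (qualified) name
SELECT 'myschema.test_pkey'::regclass::oid;         -- regclass casts to oid and back
SELECT pgstatindex('myschema.test_pkey'::regclass); -- new-style call
```

Because a plain OID value can be passed wherever regclass is expected, existing callers that pass OIDs keep working after the signature change; only callers relying on implicit text behavior need the deprecated text variant.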
Then we will be able to obsolete (or just drop) the old interfaces in a future release, maybe 9.4 or 9.5. I think this approach would minimize the impact of such an interface change. So, I think we can clean up the function arguments in the pgstattuple module, and also have two interfaces, both regclass and text, for the next release. Any comments? In the document, you should mark the old functions as deprecated. I'm still considering changing the function name, as Tom pointed out. How about pgstatbtindex? Or just adding pgstatindex(regclass)? In fact, pgstatindex supports only BTree indexes, so pgstatbtindex seems more appropriate for this function. Can most ordinary users realize that bt means btree? We can keep both the (old) pgstatindex and the (new) pgstatbtindex during the next 2-3 major releases, and the old one can be deprecated after that. Since pg_relpages(oid) doesn't exist, pg_relpages() is in the same situation as pgstatindex(), i.e., we cannot just replace pg_relpages(text) with pg_relpages(regclass) for backward compatibility. How do you think we should solve the pg_relpages() problem? Rename? Just add pg_relpages(regclass)? Adding a function with a new name seems likely to be smoother, since that way you don't have to worry about function calls being thought ambiguous. Could you give me an example where this problem happens? As a test, I just implemented the regclass version of pg_relpages() (patch attached) and tested some cases, but I could not reproduce that problem.

    SELECT pg_relpages('hoge');                                        -- OK
    SELECT pg_relpages(oid) FROM pg_class WHERE relname = 'hoge';      -- OK
    SELECT pg_relpages(relname) FROM pg_class WHERE relname = 'hoge';  -- OK

Regards, -- Fujii Masao -- Rushabh Lathia
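One way the proposed overloading could look, sketched as the SQL-level declarations an extension script would carry (the C symbol names here are assumptions for illustration, not taken from the attached patch):

```sql
-- Old interface, kept for backward compatibility:
CREATE FUNCTION pg_relpages(text)
RETURNS bigint
AS 'MODULE_PATHNAME', 'pg_relpages'
LANGUAGE C STRICT;

-- New, safer interface; a regclass parameter also accepts OIDs directly:
CREATE FUNCTION pg_relpages(regclass)
RETURNS bigint
AS 'MODULE_PATHNAME', 'pg_relpagesbyid'
LANGUAGE C STRICT;
```

With both declarations present, a call like pg_relpages('hoge') with an untyped string literal still resolves without ambiguity, because the parser prefers the candidate taking the preferred string type (text); this is consistent with the three test queries above succeeding.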