Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)
On 03/19/2017 07:35 PM, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Stephen Frost writes:
>> (Or in other words, we've been getting along fine with these script
>> names for circa twenty years, so what's the rush to change them RIGHT
>> NOW?)
>
> To be clear, I'm not in any particular rush to change them 'RIGHT NOW'.
> I tend to agree with Magnus that we're doing a lot of other things in
> PG10 and that makes it a bit of a natural point, but I don't hold that
> position terribly strongly. On the other hand, I do not relish the idea
> of providing backwards-compatibility for every user-facing change we do
> for 5 years, and that's where I feel this approach is encouraging us to
> go.

I think that argument is only applicable where the changes are closely related, e.g. renaming pg_clog, pg_xlog and pg_log in the same release. I do not see any strong connection between createuser and pg_xlog.

As for whether we should have backwards compatibility for the old names, I am leaning weakly toward providing it in the case of createuser. I can see end users being pissed off that the createuser command is suddenly gone without any warning when they upgrade. On the flip side, I have no idea how much work it would be to maintain those legacy names.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ICU integration
On 03/15/2017 05:33 PM, Peter Eisentraut wrote:
> Updated patch attached.

Ok, I applied the patch again and ran the tests. I get a test failure on make check-world in the pg_dump tests, but it can be fixed with the below.

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 3cac4a9ae0..7d9c90363b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -2422,7 +2422,7 @@ qr/^\QINSERT INTO test_fifth_table (col1, col2, col3, col4, col5) VALUES (NULL,
 		'CREATE COLLATION test0 FROM "C";',
 		regexp => qr/^
-			\QCREATE COLLATION test0 (lc_collate = 'C', lc_ctype = 'C');\E/xm,
+			\QCREATE COLLATION test0 (provider = 'libc', locale = 'C');\E/xm,
 		like => {
 			binary_upgrade => 1,
 			clean => 1,

>> - I do not like how it is not obvious which is the default version of
>> every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and
>> not "sv_standard%icu" as one might expect. Is this inherent to ICU or
>> something we can work around?
>
> We get these keywords from ucol_getKeywordValuesForLocale(), which says
> "Given a key and a locale, returns an array of string values in a
> preferred order that would make a difference." So all those are
> supposedly different from each other.

I believe you are mistaken. The locale "sv" is just an alias for "sv-u-standard" as far as I can tell. See the definition of the Swedish locale (http://source.icu-project.org/repos/icu/trunk/icu4c/source/data/coll/sv.txt); there are just three collations: reformed (default), standard, and search (plus eor and emoji, which are inherited).

I am also quite annoyed at col-emoji and col-eor (European Ordering Rules). They are defined at the root and inherited by all languages, but no language modifies col-emoji for its needs, which makes it a foot gun. See the Danish sorting example below where, at least, I expected the same order. For col-eor it makes a bit more sense, since I would expect the locales en_GB-u-col-eor and sv_SE-u-col-eor to collate exactly the same.
# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE "da-x-icu";
 x
----
 a
 k
 aa
(3 rows)

# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE "da-u-co-emoji-x-icu";
 x
----
 a
 aa
 k
(3 rows)

It seems ICU has made quite the mess here, and I am not sure to what degree we need to handle it to avoid our users getting confused. I need to think some more about it, and would love input from others. Maybe the right thing to do is to ignore the issue with col-emoji, but I would want to do something about the default collations.

>> - ICU talks about a new syntax for locale extensions (old:
>> de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page
>> http://userguide.icu-project.org/collation/api. Is this something we
>> need to care about? It looks like we currently use the old format, and
>> while I personally prefer it I am not sure we should rely on an old
>> syntax.
>
> Interesting. I hadn't seen this before, and the documentation is
> sparse. But it seems that this refers to BCP 47 language tags, which
> seem like a good idea. So what I have done is change all the predefined
> ICU collations to be named after the BCP 47 scheme, with a "private
> use" -x-icu suffix (instead of %icu). This preserves the original idea
> but uses a standard naming scheme. I'm not terribly worried that they
> are going to remove the "old" locale naming, but just to be forward
> looking, I have changed it so that the collcollate entries are made
> using the "new" naming for ICU >= 54.

Sounds good.

>> - I get an error when creating a new locale.
>>
>> # CREATE COLLATION sv2 (LOCALE = 'sv');
>> ERROR:  could not create locale "sv": Success
>>
>> # CREATE COLLATION sv2 (LOCALE = 'sv');
>> ERROR:  could not create locale "sv": Resource temporarily unavailable
>> Time: 1.109 ms
>
> Hmm, that's pretty straightforward code. What is your operating system?
> What are the build options? Does it work without this patch? This issue
> is unrelated to ICU.
I had forgotten to specify the provider, so the errors are correct (even though the value of errno is weird).

>> - For the collprovider, is it really correct to say that 'd' is the
>> default value, as it does in catalogs.sgml?
>
> It doesn't say it's the default value, it says it uses the database
> default. This is all a bit confusing. We have a collation object named
> "default", which uses the locale set for the database. That's been that
> way for a while. Now when introducing the collation providers, that
> "default" collation gets its own collprovider category 'd'. That is not
> really used anywhere, but we have to put something there.

Ah, I see now. Hm, that is a bit awkward, but I cannot think of a cleaner alternative.

>> - I do not know what the policy for formatting the documentation is,
>> but some of the paragraphs are in need of re-wrapping.
>
> Hmm, I don't see anything terribly bad.

Maybe it is just me.
Re: [HACKERS] WIP: Faster Expression Processing v4
Hi,

I got a test failure with this version of the patch in postgres_fdw. It looks to me like it was caused by a typo in the source code, which is fixed in the attached patch. After applying this patch check-world passes.

Andreas

diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 1763be5cf0..2ba5a2ea69 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -42,7 +42,6 @@ static void TidListCreate(TidScanState *tidstate);
 static int	itemptr_comparator(const void *a, const void *b);
 static TupleTableSlot *TidNext(TidScanState *node);
 
-
 /*
  * Compute the list of TIDs to be visited, by evaluating the expressions
  * for them.
@@ -101,6 +100,7 @@ TidListCreate(TidScanState *tidstate)
 			else if (IsCTIDVar(arg2))
 				exprstate = ExecInitExpr((Expr *) linitial(fex->args),
 										 &tidstate->ss.ps);
+			else
 				elog(ERROR, "could not identify CTID variable");
 			itemptr = (ItemPointer)
Re: [HACKERS] \h tab-completion
On 03/01/2017 02:47 PM, Peter Eisentraut wrote:
> Instead of creating another copy of list_ALTER, let's use the
> words_after_create list and write a version of
> create_command_generator/drop_command_generator.

Good idea. Here is a patch with that.

Andreas

commit 7d691929f5814da87bb8a532e7dcfa2bd1dc9f15
Author: Andreas Karlsson <andr...@proxel.se>
Date:   Fri Feb 3 13:05:48 2017 +0100

    Add completion for \help DROP|ALTER

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e8458e939e..3df7636c5b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -982,10 +982,11 @@ typedef struct
 #define THING_NO_CREATE		(1 << 0)	/* should not show up after CREATE */
 #define THING_NO_DROP		(1 << 1)	/* should not show up after DROP */
-#define THING_NO_SHOW		(THING_NO_CREATE | THING_NO_DROP)
+#define THING_NO_ALTER		(1 << 2)	/* should not show up after ALTER */
+#define THING_NO_SHOW		(THING_NO_CREATE | THING_NO_DROP | THING_NO_ALTER)
 
 static const pgsql_thing_t words_after_create[] = {
-	{"ACCESS METHOD", NULL, NULL},
+	{"ACCESS METHOD", NULL, NULL, THING_NO_ALTER},
 	{"AGGREGATE", NULL, &Query_for_list_of_aggregates},
 	{"CAST", NULL, NULL},		/* Casts have complex structures for names, so
 								 * skip it */
@@ -999,6 +1000,7 @@ static const pgsql_thing_t words_after_create[] = {
 	{"CONVERSION", "SELECT pg_catalog.quote_ident(conname) FROM pg_catalog.pg_conversion WHERE substring(pg_catalog.quote_ident(conname),1,%d)='%s'"},
 	{"DATABASE", Query_for_list_of_databases},
 	{"DICTIONARY", Query_for_list_of_ts_dictionaries, NULL, THING_NO_SHOW},
+	{"DEFAULT PRIVILEGES", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
 	{"DOMAIN", NULL, &Query_for_list_of_domains},
 	{"EVENT TRIGGER", NULL, NULL},
 	{"EXTENSION", Query_for_list_of_extensions},
@@ -1006,12 +1008,13 @@ static const pgsql_thing_t words_after_create[] = {
 	{"FOREIGN TABLE", NULL, NULL},
 	{"FUNCTION", NULL, &Query_for_list_of_functions},
 	{"GROUP", Query_for_list_of_roles},
-	{"LANGUAGE", Query_for_list_of_languages},
 	{"INDEX", NULL, &Query_for_list_of_indexes},
+	{"LANGUAGE", Query_for_list_of_languages, NULL, THING_NO_ALTER},
+	{"LARGE OBJECT", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
 	{"MATERIALIZED VIEW", NULL, &Query_for_list_of_matviews},
 	{"OPERATOR", NULL, NULL},	/* Querying for this is probably not such a
 								 * good idea. */
-	{"OWNED", NULL, NULL, THING_NO_CREATE},	/* for DROP OWNED BY ... */
+	{"OWNED", NULL, NULL, THING_NO_CREATE | THING_NO_ALTER},	/* for DROP OWNED BY ... */
 	{"PARSER", Query_for_list_of_ts_parsers, NULL, THING_NO_SHOW},
 	{"POLICY", NULL, NULL},
 	{"PUBLICATION", Query_for_list_of_publications},
@@ -1021,15 +1024,16 @@ static const pgsql_thing_t words_after_create[] = {
 	{"SEQUENCE", NULL, &Query_for_list_of_sequences},
 	{"SERVER", Query_for_list_of_servers},
 	{"SUBSCRIPTION", Query_for_list_of_subscriptions},
+	{"SYSTEM", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
 	{"TABLE", NULL, &Query_for_list_of_tables},
 	{"TABLESPACE", Query_for_list_of_tablespaces},
-	{"TEMP", NULL, NULL, THING_NO_DROP},	/* for CREATE TEMP TABLE ... */
+	{"TEMP", NULL, NULL, THING_NO_DROP | THING_NO_ALTER},	/* for CREATE TEMP TABLE ... */
 	{"TEMPLATE", Query_for_list_of_ts_templates, NULL, THING_NO_SHOW},
 	{"TEXT SEARCH", NULL, NULL},
 	{"TRIGGER", "SELECT pg_catalog.quote_ident(tgname) FROM pg_catalog.pg_trigger WHERE substring(pg_catalog.quote_ident(tgname),1,%d)='%s' AND NOT tgisinternal"},
 	{"TYPE", NULL, &Query_for_list_of_datatypes},
-	{"UNIQUE", NULL, NULL, THING_NO_DROP},	/* for CREATE UNIQUE INDEX ... */
-	{"UNLOGGED", NULL, NULL, THING_NO_DROP},	/* for CREATE UNLOGGED TABLE
+	{"UNIQUE", NULL, NULL, THING_NO_DROP | THING_NO_ALTER},	/* for CREATE UNIQUE INDEX ... */
+	{"UNLOGGED", NULL, NULL, THING_NO_DROP | THING_NO_ALTER},	/* for CREATE UNLOGGED TABLE
 								 * ... */
 	{"USER", Query_for_list_of_roles},
 	{"USER MAPPING FOR", NULL, NULL},
@@ -1042,6 +1046,7 @@ static const pgsql_thing_t words_after_create[] = {
 static char **psql_completion(const char *text, int start, int end);
 static char *create_command_generator(const char *text, int state);
 static char *drop_command_generator(const char *text, int state);
+static char *alter_command_generator(const char *text, int state);
 static char *complete_from_query(const char *text, int state);
 static char *complete_from_schema_query(const char *text, int state);
 static char *_complete_from_query(int is_schema_query,
@@ -1316,6 +1321,17 @@ psql_completion(const char *text, i
Re: [HACKERS] ICU integration
On 03/09/2017 10:13 PM, Peter Eisentraut wrote:
> - Naming of collations: Are we happy with the "de%icu" naming? I might
> have come up with that while reviewing the IPv6 zone index patch. ;-)
> An alternative might be "de$icu" for more Oracle vibe and avoiding the
> need for double quotes in some cases. (But we have mixed-case names
> like "de_AT%icu", so further changes would be necessary to fully get
> rid of the need for quoting.) A more radical alternative would be to
> install ICU locales in a different schema and use the search_path, or
> even have a separate search path setting for collations only. Which
> leads to ...

I do not like the schema based solution, since search_path already gives us enough headaches. As for the naming, I am fine with the current scheme. It would be nice to have something we did not have to quote, but I do not like using dollar signs since they are already used for other things.

> - Selecting default collation provider: Maybe we want a setting, say in
> initdb, to determine which provider's collations get the "good" names?
> Maybe not necessary for this release, but something to think about.

This does not seem necessary for the initial release.

> - Currently (in this patch), we check a collation's version when it is
> first used. But, say, after pg_upgrade, you might want to check all of
> them right away. What might be a good interface for that? (Possibly, we
> only have to check the ones actually in use, and we have dependency
> information for that.)

How about adding a SQL level function for checking this which can be called by pg_upgrade?

= Review

Here is an initial review. I will try to find some time to do more testing later this week.

This is a really useful feature given the poor quality of collation support in libc. Just the fact that ICU versions its collations is huge, as is the larger range of available collations and their high configurability. Additionally, being able to use abbreviated keys again would be huge.
ICU also supports writing your own collations and modifying existing ones, something which might be possible to expose in the future. In general ICU offers a lot for users who care about the details of text collation.

== Functional

- Generally things seem to work fine and as expected.

- I get a test failure in the default test suite due to not having the tr_TR locale installed. I would assume that this would be pretty common for hackers.

*** 428,443 ****
  -- to_char
  SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
    to_char
  -------------
! 01 NIS 2010
  (1 row)
  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr%icu");
    to_char
  -------------
! 01 NİS 2010
  (1 row)
  -- backwards parsing
--- 428,444 ----
  -- to_char
  SET lc_time TO 'tr_TR';
+ ERROR:  invalid value for parameter "lc_time": "tr_TR"
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
    to_char
  -------------
! 01 APR 2010
  (1 row)
  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr%icu");
    to_char
  -------------
! 01 APR 2010
  (1 row)
  -- backwards parsing
==

- The code no longer compiles without HAVE_LOCALE_T.

- I do not like how it is not obvious which is the default version of every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not "sv_standard%icu" as one might expect. Is this inherent to ICU or something we can work around?

- ICU talks about a new syntax for locale extensions (old: de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page http://userguide.icu-project.org/collation/api. Is this something we need to care about? It looks like we currently use the old format, and while I personally prefer it I am not sure we should rely on an old syntax.

- I get an error when creating a new locale.
# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms

== Code review

- For the collprovider, is it really correct to say that 'd' is the default value, as catalogs.sgml does?

- I do not know what the policy for formatting the documentation is, but some of the paragraphs are in need of re-wrapping.

- Add a hint to "ERROR: conflicting or redundant options". The error message is pretty unclear.

- I am not a fan of this patch comparing the encoding with a -1 literal. How about adding -1 as a value to the enum? See the example below for a place where the patch compares with -1.

 ereport(NOTICE,
 		(errcode(ERRCODE_DUPLICATE_OBJECT),
-		 errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-				collname, pg_encoding_to_char(collencoding))));
+		 collencoding == -1
+		 ? errmsg("collation \"%s\" already exists, skipping",
+
Re: [HACKERS] \h tab-completion
On 03/17/2017 12:01 AM, Peter Eisentraut wrote:
> Committed with some tweaking.

Thanks!

Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 04/03/2017 07:57 AM, Michael Paquier wrote:
> On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andr...@proxel.se> wrote:
>> On 03/31/2017 08:27 AM, Michael Paquier wrote:
>>> - Do a per-index rebuild and not a per-relation rebuild for
>>> concurrent indexing. Doing a per-relation reindex has the
>>> disadvantage that many objects need to be created at the same time,
>>> and in the case of REINDEX CONCURRENTLY time of the operation is not
>>> what matters, it is how intrusive the operation is. Relations with
>>> many indexes would also result in many object locks taken at each
>>> step.
>>
>> I am personally worried about the amount of time spent waiting for
>> long running transactions if you reindex per index rather than per
>> relation. Because while you wait on long running transactions for one
>> index, nothing prevents new long transactions from starting, which we
>> will have to wait for while reindexing the next index. If your
>> database has many long running transactions, more time will be spent
>> waiting than working.
>
> Yup, I am not saying that one approach or the other are bad, both are
> worth considering. That's a deal between waiting and manual potential
> cleanup in the event of a failure.

Agreed, and which is worse probably depends heavily on your schema and workload.

> I am marking this patch as returned with feedback, this won't get in
> PG10. If I am freed from the SCRAM-related open items I'll try to give
> another shot at implementing this feature before the first CF of PG11.

Thanks! I also think I will have time to work on this before the first CF.

Andreas
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/12/2017 04:12 PM, Tom Lane wrote:
> 1. The best thing would still be to make genbki.pl do the conversion,
> and write numeric OIDs into postgres.bki. The core stumbling block here
> seems to be that for most catalogs, Catalog.pm and genbki.pl never
> really break down a DATA line into fields --- and we certainly have got
> to do that, if we're going to replace the values of regproc fields.
> The places that do need to do that approximate it like this:
>
>     # To construct fmgroids.h and fmgrtab.c, we need to inspect some
>     # of the individual data fields.  Just splitting on whitespace
>     # won't work, because some quoted fields might contain internal
>     # whitespace.  We handle this by folding them all to a simple
>     # "xxx".  Fortunately, this script doesn't need to look at any
>     # fields that might need quoting, so this simple hack is
>     # sufficient.
>     $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
>     @{$row}{@attnames} = split /\s+/, $row->{bki_values};
>
> We would need a bullet-proof, non-hack, preferably not too slow way to
> split DATA lines into fields properly. I'm one of the world's worst
> Perl programmers, but surely there's a way?
>
> 2. Alternatively, we could teach bootstrap mode to build a hashtable
> mapping proname to OID while it reads pg_proc data from postgres.bki,
> and then make regprocin's bootstrap path consult the hashtable instead
> of looking directly at the pg_proc file. That I'm quite sure is
> do-able, but it seems like it's leaving money on the table compared to
> doing the transformation earlier.
>
> Thoughts?

I looked at this, and option 1 seems simple enough if I am not missing something. I might hack something up later tonight. Either way, I think this improvement can be done separately from the proposed replacement of the catalog header files. Trying to fix everything at once often leads to nothing being fixed at all.

Andreas
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/12/2017 05:00 PM, Andreas Karlsson wrote:
> I looked at this, and option 1 seems simple enough if I am not missing
> something. I might hack something up later tonight. Either way, I think
> this improvement can be done separately from the proposed replacement
> of the catalog header files. Trying to fix everything at once often
> leads to nothing being fixed at all.

Here is my proof of concept patch. It does basically the same thing as Andres's patch, except that it handles quoted values a bit better and does not try to support anything other than the regproc type. The patch speeds up initdb without fsync from 0.80 seconds to 0.55 seconds, which is a nice speedup, while adding a negligible amount of extra work at compile time.

Two things could be improved in this patch if people deem it important:

- Refactor the code to be more generic, like Andres's patch, so it is easier to add lookups for other data types.

- Write something closer to a real .bki parser which actually understands quoted values and _null_ when parsing the data. Right now this patch only splits each row into its fields (while being aware of quotes) but does not attempt to remove quotes, so the PoC patch treats "foo" and foo as different.
Andreas

commit bc8ede661b12407c0b6a9e4c0932d21028f87fc1
Author: Andreas Karlsson <andr...@proxel.se>
Date:   Wed Apr 12 21:00:49 2017 +0200

    WIP: Resolve regproc entries when generating .bki

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 6e9d57aa8d..f918c9ef8a 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -102,6 +102,7 @@ print $bki "# PostgreSQL $major_version\n";
 # vars to hold data needed for schemapg.h
 my %schemapg_entries;
 my @tables_needing_macros;
+my %procs;
 our @types;
 
 # produce output, one catalog at a time
@@ -164,20 +165,41 @@ foreach my $catname (@{ $catalogs->{names} })
 			$row->{bki_values} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g;
 			$row->{bki_values} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g;
 
+			# Split values into tokens without interpreting their meaning.
+			my %bki_values;
+			@bki_values{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g;
+
+			# Substitute regproc entries with oids
+			foreach my $att (keys %bki_values)
+			{
+				next if $bki_attr{$att}->{type} ne 'regproc';
+				next if $bki_values{$att} =~ /\A(\d+|-|_null_)\z/;
+
+				$bki_values{$att} = $procs{$bki_values{$att}};
+			}
+
+			# Save pg_proc oids for use by later catalogs. This relies on the
+			# order we process the files, but the bootstrap code also relies on
+			# pg_proc being loaded first.
+			if ($catname eq 'pg_proc')
+			{
+				$procs{$bki_values{proname}} = $row->{oid};
+			}
+
 			# Save pg_type info for pg_attribute processing below
 			if ($catname eq 'pg_type')
 			{
-				my %type;
+				my %type = %bki_values;
 				$type{oid} = $row->{oid};
-				@type{@attnames} = split /\s+/, $row->{bki_values};
 				push @types, \%type;
 			}
 
 			# Write to postgres.bki
 			my $oid = $row->{oid} ? "OID = $row->{oid} " : '';
-			printf $bki "insert %s( %s)\n", $oid, $row->{bki_values};
+			printf $bki "insert %s( %s)\n", $oid,
+				join(' ', @bki_values{@attnames});
 
-			# Write comments to postgres.description and postgres.shdescription
+			# Write comments to postgres.description and postgres.shdescription
 			if (defined $row->{descr})
 			{
 				printf $descr "%s\t%s\t0\t%s\n", $row->{oid}, $catname,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 2af9b355e7..10d9c6abc7 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -58,16 +58,9 @@ foreach my $column (@{ $catalogs->{pg_proc}->{columns} })
 my $data = $catalogs->{pg_proc}->{data};
 foreach my $row (@$data)
 {
-
-	# To construct fmgroids.h and fmgrtab.c, we need to inspect some
-	# of the individual data fields.  Just splitting on whitespace
-	# won't work, because some quoted fields might contain internal
-	# whitespace.  We handle this by folding them all to a simple
-	# "xxx".  Fortunately, this script doesn't need to look at any
-	# fields that might need quoting, so this simple hack is
-	# sufficient.
-	$row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
-	@{$row}{@attnames} = split /\s+/, $row->{bki_values};
+	# Split values into tokens without interpreting their meaning.
+	# This is safe because we are going to use the values as-is.
+	@{$row}{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g;
 
 	# Select out just the rows for internal-language procedures.
 	# Note assumption here that INTERNALlanguageId is 12.
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index 702924a958..a4237b0661 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/14/2017 11:54 PM, Tom Lane wrote:
> I failed to resist the temptation to poke at this, and found that
> indeed nothing seems to break if we just use one transaction for the
> whole processing of postgres.bki. So I've pushed a patch that does
> that. We're definitely down to the point where worrying about the
> speed of bootstrap mode, per se, is useless; the other steps in initdb
> visibly take a lot more time.

I looked at this some, and what takes time now for me seems to mainly be these four things (out of a total runtime of 560 ms):

1. setup_conversion: 140 ms
2. select_default_timezone: 90 ms
3. bootstrap_template1: 80 ms
4. setup_schema: 65 ms

These four take up about two thirds of the total runtime, so it seems likely that we still have relatively low hanging fruit (but not worth committing for PostgreSQL 10). I have not profiled these functions yet, so I am not sure how they would best be fixed, but maybe setup_conversion could be converted into .bki entries to speed it up.

Andreas
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 04/16/2017 03:14 AM, Tom Lane wrote:
> 1. Back-patch that patch, probably also including the followup
> adjustments in 86029b31e and 36a3be654.
>
> 2. Add #if's to use 31cf1a1a4's coding with OpenSSL >= 1.1, while
> keeping the older code for use when built against older OpenSSLs.
>
> 3. Conditionally disable renegotiation altogether with OpenSSL >= 1.1,
> thus adopting 9.5 not 9.4 behavior when using newer OpenSSL.
> [...]
> Thoughts?

Given that I cannot recall seeing any complaints about the behavior of 9.4 compared to 9.3, I am leaning towards #1. That way there are fewer different versions of our OpenSSL code.

Andreas
Re: [HACKERS] rename pg_log directory?
On 03/09/2017 11:25 PM, Bruce Momjian wrote:
> "data" and "base" were chosen because it is a "data-base", but with
> the pg_ prefixes it would be a pg_data_pg_base. ;-)

Haha, I had not spotted that one despite always naming my data directory "data" while developing. Fun little tidbit there.

Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/08/2017 03:48 AM, Robert Haas wrote:
> On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andr...@proxel.se> wrote:
>> And I would argue that this feature is useful for quite many, based on
>> my experience running a semi-large database. Index bloat happens, and
>> without REINDEX CONCURRENTLY it can be really annoying to solve,
>> especially for primary keys. Certainly more people have problems with
>> index bloat than the number of people who store index oids in their
>> database.
>
> Yeah, but that's not the only wart, I think.

The only two potential issues I see with the patch are:

1) That the index oid changes visibly to external users.

2) That the code for moving the dependencies will need to be updated when adding new things which refer to an index oid.

Given how useful I find REINDEX CONCURRENTLY, I think these warts are worth it given that the impact is quite limited. I am of course biased, since if I did not believe this I would not pursue this solution in the first place.

> For example, I believe (haven't looked at this patch series in a while)
> that the patch takes a lock and later escalates the lock level. If so,
> that could lead to doing a lot of work to build the index and then
> getting killed by the deadlock detector.

This version of the patch no longer does that. For my use case, escalating the lock would make this patch much less interesting. The highest lock level taken is the same as the initial one (SHARE UPDATE EXCLUSIVE).

The current patch does, on a high level (very simplified), this:

1. CREATE INDEX CONCURRENTLY ind_new;

2. Atomically move all dependencies from ind to ind_new, rename ind to ind_old, and rename ind_new to ind.

3. DROP INDEX CONCURRENTLY ind_old;

The actual implementation is a bit more complicated in reality, but no part escalates the lock level over what would be required by the steps for creating and dropping indexes concurrently.

> Also, if by any chance you think (or use any software that thinks) that
> OIDs for system objects are a stable identifier, this will be the first
> case where that ceases to be true. If the system is shut down or
> crashes or the session is killed, you'll be left with stray objects
> with names that you've never typed into the system. I'm sure you're
> going to say "don't worry, none of that is any big deal" and maybe
> you're right.

Hm, I cannot think of any real life scenario where this will be an issue based on my personal experience with PostgreSQL, but if you can think of one please provide it. I will try to ponder some more on this myself.

Andreas
Re: [HACKERS] adding an immutable variant of to_date
On 03/07/2017 09:56 PM, Sven R. Kunze wrote:
> On 07.03.2017 03:21, Andreas Karlsson wrote:
>> 1) I do not think we currently allow setting the locale like this
>> anywhere, so this will introduce a new concept to PostgreSQL. And you
>> will probably need to add support for caching per locale.
>
> Good to know. Could you explain what you mean by "caching per locale"?

The current code for to_char will, on the first call to to_char, build arrays with the localized names of the week days and the months. I suspect that you may need to build something similar, but with a set of arrays per locale. See the DCH_to_char function and its call to cache_locale_time.

Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/13/2017 03:11 AM, Andreas Karlsson wrote:
> I also fixed the code to properly support triggers.

And by "support triggers" I actually meant fixing the support for moving the foreign keys to the new index.

Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/02/2017 03:10 AM, Michael Paquier wrote:
> On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andr...@proxel.se> wrote:
>> +	/*
>> +	 * Copy constraint flags from the old index. This is safe because
>> +	 * the old index guaranteed uniqueness.
>> +	 */
>> +	newIndexForm->indisprimary = oldIndexForm->indisprimary;
>> +	oldIndexForm->indisprimary = false;
>> +	newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
>> +	oldIndexForm->indisexclusion = false;
>> [...]
>> +	deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
>> +									RelationRelationId, DEPENDENCY_AUTO);
>> +	deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
>> +									ConstraintRelationId, DEPENDENCY_INTERNAL);
>> +
>> +	// TODO: pg_depend for old index?

Spotted one of my TODO comments there, so I have attached a patch where I have cleaned up that function. I also fixed the code to properly support triggers.

> There is a lot of mumbo-jumbo in the patch to create the exact same
> index definition as the original one being reindexed, and that's a huge
> maintenance burden for the future. You can blame me for that in the
> current patch. I am wondering if it would not just be better to
> generate a CREATE INDEX query string and then use the SPI to create the
> index, and also do the following extensions at SQL level:
> - Add a sort of WITH NO DATA clause where the index is created, so the
> index is created empty, and is marked invalid and not ready.
> - Extend pg_get_indexdef_string() with an optional parameter to enforce
> the index name to something else; most likely it should be extended
> with the WITH NO DATA/INVALID clause, which should just be a storage
> parameter by the way.
> By doing something like that, what the REINDEX CONCURRENTLY code path
> should just be careful about is that it chooses an index name that
> avoids any conflicts.

Hm, I am not sure how much that would help, since a lot of the mumbo-jumbo is by necessity in the step where we move the constraints over from the old index to the new.
Andreas diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index 306def4a15..ca1aeca65f 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -923,7 +923,8 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), - ANALYZE, CREATE INDEX CONCURRENTLY, and + ANALYZE, CREATE INDEX CONCURRENTLY, + REINDEX CONCURRENTLY, ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see ). diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 3908ade37b..3449c0af73 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build. To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + REINDEX will perform a concurrent build if + CONCURRENTLY is specified. To build the index without interfering + with production you should drop the index and reissue either the + CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY + command. Indexes of toast relations can be rebuilt with REINDEX + CONCURRENTLY. @@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. 
+ There are several caveats to be aware of when using this option + see . + + + + + VERBOSE @@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + + + +Rebuilding an index can interfere with regular operation of a database. +Normally PostgreSQL locks the table whose index is rebuilt +against writes and performs the entire index build with a single scan of the +table. Other transactions can still read the table, but if they try to +insert, update, or delete rows in the table they will block unti
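The documentation change above contrasts the new command with the pre-existing manual workaround. For readers unfamiliar with that workaround, it can be sketched roughly as follows (index and table names are hypothetical placeholders):

```sql
-- Manual concurrent rebuild, the only option before this patch:
CREATE INDEX CONCURRENTLY my_idx_new ON my_table (my_column);
DROP INDEX CONCURRENTLY my_idx;
ALTER INDEX my_idx_new RENAME TO my_idx;

-- With the proposed patch this collapses to a single command:
-- REINDEX INDEX CONCURRENTLY my_idx;
```

Note that the manual approach does not work for indexes backing primary keys or other constraints, which is one of the main motivations for the feature.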
Re: [HACKERS] \h tab-completion
On 03/13/2017 03:56 PM, David Steele wrote: Do you know when you will have a new patch available for review that incorporates Peter's request? I believe I will find the time to finish it some time in a couple of days. Andreas
Re: [HACKERS] rename pg_log directory?
On 03/01/2017 05:49 AM, Peter Eisentraut wrote: On 2/27/17 09:51, Tom Lane wrote: No objection to the basic point, but "log" seems perhaps a little too generic to me. Would something like "server_log" be better? Well, "log" is pretty well established. There is /var/log, and if you unpack a, say, Kafka or Cassandra distribution, they also come with a log or logs directory. +1, though I am also fine with server_log. Andreas
Re: [HACKERS] rename pg_log directory?
On 02/27/2017 03:05 PM, Peter Eisentraut wrote: How about changing the default for log_directory from 'pg_log' to, say, 'log'? I have attached a patch which does this. I do not care much about which name is picked as long as we get rid of the "pg_" prefix. Btw, is there a reason for why global and base do not have the "pg_" prefix? Andreas commit 0b71fcdb328f05349775675e0491ba1b82127d4e Author: Andreas Karlsson <andr...@proxel.se> Date: Mon Mar 6 23:52:49 2017 +0100 Rename default log directory from pg_log to log diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index cd82c04b05..4ee1f605bc 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4324,8 +4324,8 @@ SELECT * FROM parent WHERE key = 2400; find the logs currently in use by the instance. Here is an example of this file's content: -stderr pg_log/postgresql.log -csvlog pg_log/postgresql.csv +stderr log/postgresql.log +csvlog log/postgresql.csv current_logfiles is recreated when a new log file @@ -4427,7 +4427,7 @@ local0.*/var/log/postgresql cluster data directory. This parameter can only be set in the postgresql.conf file or on the server command line. -The default is pg_log. +The default is log. 
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml index 309a303e03..359f222352 100644 --- a/doc/src/sgml/file-fdw.sgml +++ b/doc/src/sgml/file-fdw.sgml @@ -262,7 +262,7 @@ CREATE FOREIGN TABLE pglog ( location text, application_name text ) SERVER pglog -OPTIONS ( filename '/home/josh/9.1/data/pg_log/pglog.csv', format 'csv' ); +OPTIONS ( filename '/home/josh/9.1/data/log/pglog.csv', format 'csv' ); diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 0707f66631..69b9cdacff 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -3298,7 +3298,7 @@ static struct config_string ConfigureNamesString[] = GUC_SUPERUSER_ONLY }, _directory, - "pg_log", + "log", check_canonical_path, NULL, NULL }, { diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 157d775853..e6dbc31591 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -343,7 +343,7 @@ # (change requires restart) # These are only used if logging_collector is on: -#log_directory = 'pg_log' # directory where log files are written, +#log_directory = 'log' # directory where log files are written, # can be absolute or relative to PGDATA #log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern, # can include strftime() escapes diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm index e5cb348f4c..1c87e39e0d 100644 --- a/src/test/perl/PostgresNode.pm +++ b/src/test/perl/PostgresNode.pm @@ -560,7 +560,7 @@ sub _backup_fs $backup_path, filterfn => sub { my $src = shift; - return ($src ne 'pg_log' and $src ne 'postmaster.pid'); + return ($src ne 'log' and $src ne 'postmaster.pid'); }); if ($hot) diff --git a/src/test/perl/RecursiveCopy.pm b/src/test/perl/RecursiveCopy.pm index 3e98813286..457488ba5b 100644 --- a/src/test/perl/RecursiveCopy.pm +++ b/src/test/perl/RecursiveCopy.pm @@ -48,9 +48,9 @@ 
attempted. RecursiveCopy::copypath('/some/path', '/empty/dir', filterfn => sub { - # omit pg_log and contents + # omit log and contents my $src = shift; - return $src ne 'pg_log'; + return $src ne 'log'; } );
Re: [HACKERS] adding an immutable variant of to_date
On 03/03/2017 10:41 PM, Sven R. Kunze wrote: What do you think? I have some thoughts: 1) I do not think we currently allow setting the locale like this anywhere, so this will introduce a new concept to PostgreSQL. And you will probably need to add support for caching per locale. 2) As far as I can tell from reading the code, to_date currently ignores the M suffix which indicates that you want localized month/day names, so I guess that to_date is currently immutable. Maybe it is stable due to the idea that we may want to support the M suffix in the future. 3) I do not like the to_date function. It is much too forgiving with invalid input. For example, 2017-02-30 becomes 2017-03-02. Also, just ignoring the M suffix in the format string seems pretty bad. Personally I would rather see a new date parsing function which is easier to work with, or somehow fix to_date without pissing too many users off, but I have no idea if this is a view shared with the rest of the community. Andreas
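The leniency complained about in point 3 is easy to reproduce. A minimal example (behavior as of the PostgreSQL versions current when this was written; later releases tightened to_date's input validation):

```sql
-- An out-of-range day is silently rolled over into the next month
-- instead of raising an error:
SELECT to_date('2017-02-30', 'YYYY-MM-DD');
--   to_date
-- ------------
--  2017-03-02
```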
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/05/2017 07:56 PM, Robert Haas wrote: On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund wrote: I agree that it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart. I, again, give that a firm "maybe". If the warts end up annoying 1% of the users who try to use this feature, then you're right. If they end up making a substantial percentage of people who try to use this feature give up on it, then we've added a bunch of complexity and future code maintenance for little real gain. I'm not ruling out the possibility that you're 100% correct, but I'm not nearly as convinced of that as you are. I agree that these warts are annoying, but I also think the limitations can be explained pretty easily to users (e.g. by explaining in the manual how REINDEX CONCURRENTLY creates a new index to replace the old one), and I think that is the important question about whether a useful feature with warts should be merged or not: does it make things substantially harder for the average user to grok? And I would argue that this feature is useful for quite many, based on my experience running a semi-large database. Index bloat happens, and without REINDEX CONCURRENTLY it can be really annoying to solve, especially for primary keys. Certainly more people have problems with index bloat than the number of people who store index oids in their database. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/02/2017 02:25 AM, Jim Nasby wrote: On 2/28/17 11:21 AM, Andreas Karlsson wrote: The only downside I can see to this approach is that we will no longer be able to reindex catalog tables concurrently, but in return it should be easier to confirm that this approach can be made to work. Another downside is any stored regclass fields will become invalid. Admittedly that's a pretty unusual use case, but it'd be nice if there was at least a way to let users fix things during the rename phase (perhaps via an event trigger). Good point, but I agree with Andres here. Having REINDEX CONCURRENTLY issue event triggers seems strange to me. While it does create and drop indexes as part of its implementation, it is actually just an index maintenance job. Andreas
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/13/2017 06:13 PM, Tom Lane wrote: I've pushed this with some mostly-cosmetic adjustments: Thanks! I like your adjustments. There's certainly lots more that could be done in the genbki code, but I think all we can justify at this stage of the development cycle is to get the low-hanging fruit for testing speedups. Yeah, I also noticed that the genbki code seems to have gotten little love and that much more can be done here. Andreas
Re: [HACKERS] Self-signed certificate instructions
On 04/15/2017 03:58 PM, Andrew Dunstan wrote: The instructions on how to create a self-signed certificate in section 18.9.3 of the docs seem unduly cumbersome. +1, I see no reason for us to spread unnecessarily complicated instructions. Andreas
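For context, the whole multi-step procedure in the docs can be collapsed into a single openssl invocation along these lines (dbhost.example.com is a placeholder host name, and the 365-day validity is just an example choice):

```shell
# Create a self-signed certificate and an unencrypted key in one step;
# -nodes skips key encryption so the server can start without a passphrase,
# and -subj avoids interactive prompts.
openssl req -new -x509 -days 365 -nodes -text \
  -out server.crt -keyout server.key \
  -subj "/CN=dbhost.example.com"
# The server refuses keys that are world-readable:
chmod og-rwx server.key
```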
Re: [HACKERS] PG 10 release notes
On 04/25/2017 03:31 AM, Bruce Momjian wrote: I have committed the first draft of the Postgres 10 release notes. They are current as of two days ago, and I will keep them current. Please give me any feedback you have. This item is incorrectly attributed to me. I was only the reviewer, Peter is the author. + + + +Create a <link linkend="catalog-pg-sequence">pg_sequence</link> system catalog to store sequence metadata (Andreas +Karlsson) + Andreas
Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?
On 08/04/2017 08:48 PM, Shay Rojansky wrote: On 2017-08-04 07:22:42 +0300, Shay Rojansky wrote: > I'm still not convinced of the risk/problem of simply setting the session > id context as I explained above (rather than disabling the optimization), > but of course either solution resolves my problem. How would that do anything? Each backend has its own local memory. I.e. any cache state that openssl would maintain wouldn't be useful. If you want to take advantage of features around this you really need to cache tickets in shared memory... Guys, there's no data being cached at the backend - RFC 5077 is about packaging information into a client-side opaque session ticket that allows skipping a roundtrip on the next connection. As I said, simply setting the session id context (*not* the session id or anything else) makes this feature work, even though a completely new backend process is launched. Yes, session tickets are encrypted data which is stored by the client. But if we are going to support them I think we should do it properly, with new GUCs for the key file and for disabling the feature. Using a key file is less necessary for PostgreSQL than for a web server since it is less common to do round-robin load balancing between different PostgreSQL instances. Andreas
Re: [HACKERS] CTE inlining
On 05/03/2017 07:33 PM, Alvaro Herrera wrote: 1) we switch unmarked CTEs as inlineable by default in pg11. What seems likely to happen for a user that upgrades to pg11 is that 5 out of 10 CTE-using queries are going to become faster than with pg10, and they are going to be happy; 4 out of 5 are going to see no difference, but they didn't have to do anything about it; and the remaining query is going to become slower, either indistinguishably so (in which case they don't care and they remain happy because of the other improvements) or notably so, in which case they can easily figure where to add the MATERIALIZED option and regain the original performance. 2) unmarked CTEs continue to be an optimization barrier, but we add "WITH INLINED" so that they're inlineable. Some users may wonder about it and waste a lot of time trying to figure out which CTEs to add it to. They see a benefit in half the queries, which makes them happy, but they are angry that they had to waste all that time on the other queries. 3) We don't do anything, because we all agree that GUCs are not suitable. No progress. No anger, but nobody is happy either. +1 for option 1. And while I would not like it if we had to combine it with a backwards-compatibility GUC which enables the old behavior to get it merged, I still personally would prefer that over options 2 and 3. Andreas
Re: [HACKERS] oidin / oidout and InvalidOid
On 06/12/2017 01:41 PM, Chapman Flack wrote: I was recently guilty of writing a less-than-clear SQL example because I plain forgot whether InvalidOid was 0 (correct) or -1 (my bad). Would there be any sense in letting oidin_subr accept the string InvalidOid for 0? I understand that changing oidout could break existing code outside of the tree. But what if oidout were to be conservative in what it does, and oidin liberal in what it accepts? I am not sure I am a fan of this, but if we should have an alias for InvalidOid how about reusing '-' which is used by the reg*in functions? Or is that too non-obvious? Andreas
Re: [HACKERS] CTE inlining
On 05/01/2017 04:33 PM, David G. Johnston wrote: > On Mon, May 1, 2017 at 7:26 AM, Andreas Karlsson <andr...@proxel.se > I am not sure I like decorators since this means adding an ad hoc > query hint directly into the SQL syntax which is something which I > think requires serious consideration. > > Given that we already have the "OFFSET 0" prevent-optimization > syntax why do we need a decorator on the CTE? I do not think I follow. Me and some other people here would ideally allow CTEs to be inlined by default. Some people today use CTEs as optimization fences, to for example control join order, and the suggestion here is to add new syntax for CTEs to allow them to selectively be used as optimization fences. > I would shorten that to "WITH MAT" except that I don't think that > having two ways to introduce an optimization fence is worthwhile. You mean OFFSET 0? I have never been a fan of using it as an optimization fence. I do not think OFFSET 0 conveys clearly enough to the reader that it is an optimization fence. Andreas
Re: [HACKERS] CTE inlining
On 05/01/2017 04:17 PM, David Fetter wrote: Maybe we could allow a "decorator" that would tell the planner the CTE could be inlined? WITH INLINE mycte AS ( ...) +1 for a decorator, -1 for this one. I am not sure I like decorators since this means adding an ad hoc query hint directly into the SQL syntax, which is something which I think requires serious consideration. We already have an explicit optimization fence with OFFSET 0, and I think making optimization fences explicit is how we should continue. I'd be more in favor of something along the lines of WITH FENCED /* Somewhat fuzzy. What fence? */ or WITH AT_MOST_ONCE /* Clearer, but not super precise */ or WITH UNIQUE_ATOMIC /* More descriptive, but not super clear without the docs in hand */ or something along that line. What about WITH MATERIALIZED, borrowing from the MySQL terminology "materialized subquery"? Andreas
Re: [HACKERS] CTE inlining
On 05/02/2017 04:38 AM, Craig Ringer wrote: On 1 May 2017 at 22:26, Andreas Karlsson <andr...@proxel.se> wrote: I am not sure I like decorators since this means adding an ad hoc query hint directly into the SQL syntax which is something which I think requires serious consideration. And mangling the semantics of existing syntax doesn't? That's what we do right now so we can pretend we don't have query hints while still having query hints. I am in favor of removing the optimization fence from CTEs, and strongly prefer no fence being the default behavior, since SQL is a declarative language and I think it is reasonable to assume that CTEs can be inlined. But the question is how to best remove the fence while taking into account that quite many use them as optimization fences today. I see some alternatives, none of them perfect. 1. Just remove the optimization fence and let people add OFFSET 0 to their queries if they want an optimization fence. This lets us keep pretending that we do not have query hints (and therefore do not have to formalize any syntax for them) while still allowing people to add optimization fences. 2. Add a decorator for WITH (e.g. "WITH MATERIALIZED x (...") to add an explicit optimization fence. This will for the first time add official support for a query hint in the syntax, which is a quite big precedent. 3. Add a new GUC which can enable and disable the optimization fence. This is a very clumsy tool, but maybe good enough for some users, and some people here in this thread have complained about our similar GUCs. 4. Add some new, more generic query hinting facility. This is a lot of work and something which would be very hard to get consensus for. Andreas
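To make alternative 1 concrete, here is roughly how the two fence spellings compare today (big_table and its columns are hypothetical):

```sql
-- Today both of these fence off the subquery from the outer WHERE clause.
-- Under alternative 1 the CTE form would become inlineable, and only the
-- OFFSET 0 form would remain a fence.
WITH t AS (SELECT * FROM big_table WHERE flag)
SELECT * FROM t WHERE id = 42;

SELECT *
FROM (SELECT * FROM big_table WHERE flag OFFSET 0) t
WHERE id = 42;
```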
Re: [HACKERS] CTE inlining
On 05/04/2017 06:22 PM, Andrew Dunstan wrote: I wrote this query: select (json_populate_record(null::mytype, myjson)).* from mytable; It turned out that this was an order of magnitude faster: with r as ( select json_populate_record(null::mytype, myjson) as x from mytable ) select (x).* from r; I do not know the planner that well, but I imagined that when we remove the optimization fence that query would be evaluated similarly to if it had been a lateral join, i.e. there would be no extra function calls in this case after removing the fence. Andreas
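The lateral form alluded to here can be written out by hand with today's syntax (mytype, myjson and mytable are from Andrew's example above):

```sql
-- Equivalent lateral spelling: the function is called once per input row,
-- rather than once per output column as in the (f(...)).* expansion.
SELECT x.*
FROM mytable,
     LATERAL json_populate_record(null::mytype, myjson) AS x;
```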
Re: [HACKERS] postgres_fdw super user checks
On 09/14/2017 08:33 PM, Jeff Janes wrote: > Attached is a new patch which fixes the style issue you mentioned. Thanks, the patch looks good now, and as far as I can tell there was no need to update the comments or the documentation, so I am setting this as ready for committer. Andreas
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/19/2017 11:32 PM, Peter Geoghegan wrote: On Tue, Sep 19, 2017 at 2:22 PM, Tom Lane wrote: Well, if PG10 shipped with that restriction in place then it wouldn't be an issue ;-) I was proposing that this be treated as an open item for v10; sorry if I was unclear on that. Much like the "ICU locales vs. ICU collations within pg_collation" issue, this seems like the kind of thing that we ought to go out of our way to get right in the *first* version. If people think it is possible to get this done in time for PostgreSQL 10, and it does not break anything on older versions of ICU (or the migration from older versions), I am all for it. Andreas
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/19/2017 12:46 AM, Peter Geoghegan wrote: > At one point a couple of months back, it was understood that get_icu_language_tag() might not always work with (assumed) valid locale names -- that is at least the impression that the commit message of eccead9 left me with. But, that was only with ICU 4.2, and in any case we've since stopped creating keyword variants at initdb time for other reasons (see 2bfd1b1 for details of those other reasons). I tend to think that we should not install any language tag that uloc_toLanguageTag() does not accept as valid on general principle (so not just at initdb time, when it's actually least needed). Thoughts? I can write a patch for this, if that helps. It should be straightforward. Hm, I like the idea but I see some issues. Enforcing BCP 47 seems like a good thing to me. I do not see any reason to allow input with syntax errors. The issue though is that we do not want to break people's databases when they upgrade to PostgreSQL 11. What if they have specified the locale in the old non-ICU format, or they have a bogus value, and we then error out on pg_upgrade or pg_restore? Andreas
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/21/2017 10:55 PM, Peter Geoghegan wrote: I agree, but the bigger issue is that we're *half way* between supporting only one format, and supporting two formats. AFAICT, there is no reason that we can't simply support one format on all ICU versions, and keep what ends up within pg_collation at initdb time essentially the same across ICU versions (except for those that are due to cultural/political developments). I think we are in agreement then, but I do not have the time to get this done before the release of 10, so I would be happy if you implemented it. Peter E, what do you say to this? Andreas
Re: [HACKERS] GnuTLS support
On 09/15/2017 06:55 PM, Jeff Janes wrote: I can't build against gnutls-2.12.23-21.el6.x86_64 from CentOS 6.9 Thanks for testing my patch. I have fixed both these issues plus some of the other feedback. A new version of my patch is attached which should, at least on theory, support all GnuTLS versions >= 2.11. I just very quickly fixed the broken SSL tests, as I am no fan of how the SSL tests currently are written and think they should be cleaned up. Andreas diff --git a/configure b/configure index 0d76e5ea42..33b1f00bff 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -838,6 +839,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1534,6 +1536,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -6051,6 +6054,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? 
"no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10218,6 +10256,83 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char gnutls_init (); +int +main () +{ +return gnutls_init (); + ; + return 0; +} +_ACEOF +for ac_lib in '' gnutls; do + if test -z "$ac_lib"; then +ac_res="none required" + else +ac_res=-l$ac_lib +LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + if ac_fn_c_try_link "$LINENO"; then : + ac_cv_search_gnutls_init=$ac_res +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext + if ${ac_cv_search_gnutls_init+:} false; then : + break +fi +done +if ${ac_cv_search_gnutls_init+:} false; then : + +else + ac_cv_search_gnutls_init=no +fi +rm conftest.$ac_ext +LIBS=$ac_func_search_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5 +$as_echo "$ac_cv_search_gnutls_init" >&6; } +ac_res=$ac_cv_search_gnutls_init +if test "$ac_res" != no; then : + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + +else + as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5 +fi + + # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted + # certificate chains. 
+ ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include +#include + +" +if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl +_ACEOF + +fi + if test "$with_pam" = yes ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5 $as_echo_n "checking for pam_start in -lpam... " >&6; } @@ -11015,6 +11130,17 @@ else fi +fi + +if test "$with_gnutls" = yes ; then + ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default" +if test "x$ac_cv_header_gnutls_gnutls_h" = xyes; then : + +else + as_fn_error $? "header file is required for GnuTLS" "$LINENO" 5 +fi + + fi if test "$with_pam" = yes ; then @@ -15522,9 +15648,11 @@ fi # in the template or configure command line. # If not selected manually, try to select a source automatically. -if test "$enable_strong_random" = "yes" && test x"$USE_OPENSSL_RANDOM" = x"" && test x"$USE_WIN32_RANDOM" = x"" && test x"$USE_DEV_URANDOM" = x"" ; then +if test "$enable_strong_random" = "yes" && test
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
I have not looked at the issue with the btree_gin tests yet, but here is the first part of my review. = Review This is my first quick review where I just read the documentation and quickly tested the feature. I will review it more in-depth later. This is a very useful feature, one which I have long wished for. The patch applies, compiles and passes the test suite with just one warning. parse_coerce.c: In function ‘select_common_type_2args’: parse_coerce.c:1379:7: warning: statement with no effect [-Wunused-value] rightOid; ^~~~ = Functional The documentation does not agree with the code on the syntax. The documentation claims it is "FOREIGN KEY (ELEMENT xs) REFERENCES t1 (x)" when it actually is "FOREIGN KEY (EACH ELEMENT OF xs) REFERENCES t1 (x)". Likewise I can't get the "final_positions integer[] ELEMENT REFERENCES drivers" syntax to work, but here I cannot see any change in the syntax to support it. Related to the above: I am not sure if it is a good idea to make ELEMENT a reserved word in column definitions. What if the SQL standard wants to use it for something? The documentation claims ON DELETE CASCADE is not supported by array element foreign keys, but I do not think that is actually the case. I think I prefer (EACH ELEMENT OF xs) over (ELEMENT xs) given how the former is more in what I feel is the spirit of SQL. And if so we should match it as "xs integer[] EACH ELEMENT REFERENCES t1 (x)", assuming we want that syntax. Once I have created an array element foreign key the basic features seem to work as expected. The error message below fails to mention that it is an array element foreign key, but I do not think that is a blocker for getting this feature merged. Right now I cannot think of how to improve it either. $ INSERT INTO t3 VALUES ('{1,3}'); ERROR: insert or update on table "t3" violates foreign key constraint "t3_xs_fkey" DETAIL: Key (xs)=({1,3}) is not present in table "t1". 
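For reference, the schema behind the failing INSERT above looks roughly like this (reconstructed from the review; the table definitions are my guesses, and the EACH ELEMENT OF syntax is the one the patch under review actually accepts, not something in released PostgreSQL):

```sql
CREATE TABLE t1 (x integer PRIMARY KEY);

CREATE TABLE t3 (
    xs integer[],
    FOREIGN KEY (EACH ELEMENT OF xs) REFERENCES t1 (x)
);

INSERT INTO t1 VALUES (1), (2);
INSERT INTO t3 VALUES ('{1,2}');  -- succeeds: every element exists in t1
INSERT INTO t3 VALUES ('{1,3}');  -- fails: element 3 is not present in t1
```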
= Nitpicking/style comments In doc/src/sgml/catalogs.sgml the "conpfeqop" line is incorrectly indented. I am not a fan of calling it "array-vs-scalar". What about "array to scalar"? In ddl.sgml date should be lower case like the other types in "race_day DATE,". In ddl.sgml I suggest removing the "..." from the examples to make it possible to copy-paste them easily. Your text wrapping in ddl.sgml and create_table.sgml is quite arbitrary. I suggest wrapping all paragraphs at 80 characters (except for code which should not be wrapped). Your text editor probably has tools for wrapping paragraphs. Please be consistent about how you write table names and SQL in general. I think almost all places use lower case for table names, while your examples in create_table.sgml are FKTABLEFORARRAY. Andreas
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/21/2017 01:40 AM, Peter Geoghegan wrote:
> On Wed, Sep 20, 2017 at 4:08 PM, Peter Geoghegan wrote:
>> pg_import_system_collations() takes care to use the non-BCP-47 style for such versions, so I think this is working correctly.
>
> But CREATE COLLATION doesn't use pg_import_system_collations(). And perhaps more to the point: it is highly confusing that we use one or the other of those 2 things ("langtag"/BCP 47 tag or "name"/legacy locale name) as "colcollate", depending on ICU version, thereby *behaving* as if ICU < 54 really didn't know anything about BCP 47 tags. Because, obviously earlier ICU versions know plenty about BCP 47, since 9 lines further down we use "langtag"/BCP 47 tag as collname for CollationCreate() -- regardless of ICU version. How can you say "ICU < 54 doesn't even support the BCP 47 style", given all that? Those versions will still have locales named "*-x-icu" when users do "\dOS". Users will be highly confused when they quite reasonably try to generalize from the example in the docs and what "\dOS" shows, and get results that are wrong, often only in a very subtle way.

If we are fine with supporting only ICU 4.2 and later (which I think we are, given that ICU 4.2 was released in 2009) then using uloc_forLanguageTag()[1] to validate and canonicalize seems like the right solution. I had missed that this function even existed when I last read the documentation. Does it return a BCP 47 tag in modern versions of ICU? I would strongly prefer if there were, as much as possible, only one format for inputting ICU locales.

1. http://www.icu-project.org/apiref/icu4c/uloc_8h.html#aa45d6457f72867880f079e27a63c6fcb

Andreas
Re: [HACKERS] GnuTLS support
On 09/07/2017 11:34 PM, Tomas Vondra wrote:
>> I am worried about having 3x version of TLS controls in postgresql.conf, and only one set being active. Perhaps we need to break out the TLS config to separate files or something. Anyway, this needs more thought.
>
> Well, people won't be able to set the inactive options, just like you can't set ssl=on when you build without OpenSSL support. But perhaps we could simply not include the inactive options into the config file, no?

Yeah, I have been thinking about how bad it would be to dynamically generate the config file. I think I will try this.

Daniel: What options does Secure Transport need for configuring ciphers, ECDH, and cipher preference? Does it need any extra options (I think I saw something about the keychain)?

Andreas
Re: [HACKERS] postgres_fdw super user checks
On 07/27/2017 09:45 PM, Jeff Janes wrote:
> Here is an updated patch. This version allows you to use the password-less connection if you either are the super-user directly (which is the existing committed behavior), or if you are using the super-user's mapping because you are querying a super-user-owned view which you have been granted access to.

I have tested the patch and it passes the tests and works, and the code looks good (I have a small nitpick below). The feature seems useful, especially for people who already use views for security, so the question is whether this is a potential footgun. I am leaning towards no, since the superuser should be careful when granting access to his views anyway.

It would have been nice if there was a more generic way to handle this, since 1) the security issue is not unique to postgres_fdw and 2) this requires you to create a view. But since the patch is simple, an improvement in itself, and does not prevent any further improvements in this area, I see no reason to let perfect be the enemy of good.

= Nitpicking/style

I would prefer if

    /* no check required if superuser */
    if (superuser())
        return;
    if (superuser_arg(user->userid))
        return;

was, for consistency with the if clause in connect_pg_server(), written as

    /* no check required if superuser */
    if (superuser() || superuser_arg(user->userid))
        return;

Andreas
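A sketch of the setup being discussed (all object names are mine; this assumes a postgres_fdw foreign table and a password-less user mapping already exist for the superuser):

```sql
-- As superuser, who has a password-less user mapping:
CREATE VIEW remote_data AS
    SELECT * FROM my_foreign_table;  -- hypothetical foreign table
GRANT SELECT ON remote_data TO app_user;

-- With the patch, app_user may query the view even though the
-- underlying connection uses the view owner's password-less mapping.
```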
Re: [HACKERS] generated columns
On 09/13/2017 04:04 AM, Simon Riggs wrote:
> On 31 August 2017 at 05:16, Peter Eisentraut wrote:
>> - index support (and related constraint support)
>
> Presumably you can't index a VIRTUAL column. Or at least I don't think it's worth spending time trying to make it work.

I think end users would be surprised if one can index STORED columns and expressions but not VIRTUAL columns. So unless it is a huge project I would say it is worth it.

Andreas
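As background for the comparison in the reply above: indexing an expression is already supported today, which is why an unindexable VIRTUAL column would stand out (a sketch using only existing functionality; the table name is my own):

```sql
CREATE TABLE t (a int);

-- An expression index works today, so users may well expect an index
-- on a VIRTUAL generated column (essentially a named expression) too.
CREATE INDEX ON t ((a * 2));
```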
Re: [HACKERS] Patches that don't apply or don't compile: 2017-09-12
On 09/12/2017 04:14 PM, Aleksander Alekseev wrote:
> Title: Foreign Key Arrays
> Author: Mark Rofail
> URL: https://commitfest.postgresql.org/14/1252/

I am currently reviewing this one and it applies, compiles, and passes the test suite. It could be the compilation warnings which make the system think it failed, but I could not find the log of the failed build.

We want to be welcoming to new contributors, so until we have a reliable CI server which can provide easy-to-read build logs I am against changing the status of any patches solely based on the result of the CI server. I think it should be used as a complementary tool until the community deems it to be good enough.

Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
I have attached a new, rebased version of the patch with most of Banck's and some of your feedback incorporated. Thanks for the good feedback!

On 03/31/2017 08:27 AM, Michael Paquier wrote:
> When running REINDEX SCHEMA CONCURRENTLY public on the regression database I am bumping into a bunch of these warnings:
>
> WARNING: 01000: snapshot 0x7fa5e640 still active
> LOCATION: AtEOXact_Snapshot, snapmgr.c:1123
> WARNING: 01000: snapshot 0x7fa5e640 still active
> LOCATION: AtEOXact_Snapshot, snapmgr.c:1123

I failed to reproduce this. Do you have a reproducible test case?

> + * Reset attcacheoff for a TupleDesc
> + */
> +void
> +ResetTupleDescCache(TupleDesc tupdesc)
> +{
> +	int i;
> +
> +	for (i = 0; i < tupdesc->natts; i++)
> +		tupdesc->attrs[i]->attcacheoff = -1;
> +}
>
> I think that it would be better to merge that with TupleDescInitEntry to be sure that the initialization of a TupleDesc's attribute goes through only one code path.

Sorry, but I am not sure I understand your suggestion. I do not like the ResetTupleDescCache function, so all suggestions are welcome.

> -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
> +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
>
> I am taking the war path with such a sentence... But what about adding CONCURRENTLY to the list of options in parenthesis instead?

I have thought some about this myself and I do not care strongly either way.

> - Explore the use of SQL-level interfaces to mark an index as inactive at creation.
> - Remove work done in changeDependencyForAll, and replace it by something similar to what tablecmds.c does. There is I think here some place for refactoring if that's not with CREATE TABLE LIKE. This requires doing the same work of creation, renaming and dropping of the old triggers and constraints.

I am no fan of the current code duplication and how fragile it is, but I think these cases are sufficiently different to prevent meaningful code reuse.
But it could just be me who is unfamiliar with that part of the code.

> - Do a per-index rebuild and not a per-relation rebuild for concurrent indexing. Doing a per-relation reindex has the disadvantage that many objects need to be created at the same time, and in the case of REINDEX CONCURRENTLY time of the operation is not what matters, it is how intrusive the operation is. Relations with many indexes would also result in many object locks taken at each step.

I am still leaning towards my current tradeoff, since waiting for all queries to stop using an index can take a lot of time, and if you only have to do that once per table it would be a huge benefit under some workloads. You can still reindex each index separately if you need to.

Andreas

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index dda0170886..c97944b2c9 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -926,7 +926,7 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), ANALYZE, CREATE INDEX CONCURRENTLY, - CREATE STATISTICS and + REINDEX CONCURRENTLY, CREATE STATISTICS and ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see ). diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 3908ade37b..4ef3a89a29 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } An index build with the CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be - convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build.
To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + convenient to use REINDEX to rebuild them. @@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. + There are several caveats to be aware of when using this option + see . + + + + VERBOSE @@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + +
[HACKERS] GnuTLS support
Hi,

I have seen discussions from time to time about OpenSSL and its licensing issues, so I decided to see how much work it would be to add support for another TLS library, and I went with GnuTLS since it is the library I know best after OpenSSL and it is also a reasonably popular library. Attached is a work-in-progress patch which implements the basics. I am sending it to the list because I want feedback on some design questions and to check with the community if this is a feature we want.

= What is implemented

- Backend
- Frontend
- Diffie-Hellman support
- Using GnuTLS for secure random numbers
- Using GnuTLS for sha2

= Work left to do

- Add GnuTLS support to sslinfo
- Add GnuTLS support to pgcrypto
- Support for GnuTLS's version of engines
- Test code with older versions of GnuTLS
- Update documentation
- Add support for all postgresql.conf options (see design question)
- Fix two failing tests (see design question)

= Design questions

== GnuTLS priority strings vs OpenSSL cipher lists

GnuTLS uses a different format for specifying ciphers. Instead of setting ciphers, protocol versions, and ECDH curves separately, GnuTLS uses a single priority string[1]. For example the default settings of PostgreSQL (which include disabling SSLv3)

ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
ssl_prefer_server_ciphers = on
ssl_ecdh_curve = 'prime256v1'

are represented with a string similar to

SECURE128:+3DES-CBC:+GROUP-SECP256R1:%SERVER_PRECEDENCE

So the two libraries have decided on different ways to specify things. One way to solve the issue would be to just let ssl_ciphers be the priority string and then add %SERVER_PRECEDENCE to it if ssl_prefer_server_ciphers is on. Adding the ssl_ecdh_curve is trickier since the curves have different names in GnuTLS (e.g. prime256v1 vs SECP256R1) and I would rather avoid having to add such a mapping to PostgreSQL. Thoughts?
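Under that proposal, a GnuTLS build's postgresql.conf might read roughly like this (a sketch of the idea being floated, not of what the attached patch implements):

```
# OpenSSL build (today):
#   ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
#   ssl_prefer_server_ciphers = on
#   ssl_ecdh_curve = 'prime256v1'

# GnuTLS build (sketch): ssl_ciphers holds a GnuTLS priority string,
# including the ECDH group; ':%SERVER_PRECEDENCE' would be appended
# internally when ssl_prefer_server_ciphers is on.
ssl_ciphers = 'SECURE128:+3DES-CBC:+GROUP-SECP256R1'
ssl_prefer_server_ciphers = on
```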
== Potentially OpenSSL-specific test cases

There are currently two failing SSL tests which, at least to me, seem more like they test specific OpenSSL behaviors rather than something which needs to be true for all SSL libraries. The two tests:

# Try with just the server CA's cert. This fails because the root file
# must contain the whole chain up to the root CA.
note "connect with server CA cert, without root CA";
test_connect_fails("sslrootcert=ssl/server_ca.crt sslmode=verify-ca");

# A CRL belonging to a different CA is not accepted, fails
test_connect_fails(
	"sslrootcert=ssl/root+server_ca.crt sslmode=verify-ca sslcrl=ssl/client.crl");

For the missing root CA case, GnuTLS seems to be happy enough with just an intermediate CA, and as far as I can tell this behavior is not easy to configure. And for the CRL belonging to a different CA case, GnuTLS can be configured to either just store such a non-validating CRL or to ignore it, but not to return an error. Personally I think these two tests should just be removed, but maybe I am missing something.

Notes:

1.
https://gnutls.org/manual/html_node/Priority-Strings.html Andreas diff --git a/configure b/configure index a2f9a256b4..8dcb26b532 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -838,6 +839,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1534,6 +1536,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -6051,6 +6054,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? "no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10218,6 +10256,67 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h -
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
On 11/10/2017 01:47 AM, Mark Rofail wrote:
> I am sorry for the late reply

There is no reason for you to be. It did not take you 6 weeks to do a review. :) Thanks for this new version.

== Functional review

>> 1) MATCH FULL does not seem to care about NULLs in arrays. In the example below I expected both inserts into the referring table to fail.
> It seems in your example the only failed case was: INSERT INTO fk VALUES (NULL, '{1}'); which shouldn't work, can you clarify this?

I think that if you use MATCH FULL the query should fail if you have a NULL in the array.

>> 2) To me it was not obvious that ON DELETE CASCADE would delete the whole rows rather than delete the members from the array, and this kind of misunderstanding can lead to pretty bad surprises in production. I am leaning towards not supporting CASCADE.
> I would say so too, maybe we should remove ON DELETE CASCADE until we have supported all remaining actions.

I am leaning towards this too. I would personally be fine with a first version without support for CASCADE, since it is not obvious to me what CASCADE should do.

== The @>> operator

> I would argue that allocating an array of datums and building an array would have the same complexity

I am not sure what you mean here. Just because something has the same complexity does not mean there can't be major performance differences.

== Code review

>> I think the code in RI_Initial_Check() would be cleaner if you used "CROSS JOIN LATERAL unnest(col)" rather than having unnest() in the target list. This way you would not need to rename all columns and the code paths for the array case could look more like the code path for the normal case.
> Can you clarify what you mean a bit more?

I think the code would look cleaner if you generate the following query:

SELECT fk.x, fk.ys FROM ONLY t2 fk
CROSS JOIN LATERAL pg_catalog.unnest(ys) a2 (v)
LEFT OUTER JOIN ONLY t1 pk ON pk.x = fk.x AND pk.y = a2.v
WHERE [...]
rather than:

SELECT fk.k1, fk.ak2 FROM
(SELECT x k1, pg_catalog.unnest(ys) k2, ys ak2 FROM ONLY t2) fk
LEFT OUTER JOIN ONLY t1 pk ON pk.x = fk.k1 AND pk.y = fk.k2
WHERE [...]

= New stuff

When applying the patch I got some whitespace warnings:

Array-ELEMENT-foreign-key-v5.3.patch:1343: space before tab in indent, indent with spaces.
    format_type_be(oprleft), format_type_be(oprright;
Array-ELEMENT-foreign-key-v5.3.patch:1345: trailing whitespace.

When compiling I got an error:

ri_triggers.c: In function ‘ri_GenerateQual’:
ri_triggers.c:2693:19: error: unknown type name ‘d’
  Oid oprcommon;d
                ^
ri_triggers.c:2700:3: error: conflicting types for ‘oprright’
  oprright = get_array_type(operform->oprleft);
  ^~~~
ri_triggers.c:2691:9: note: previous declaration of ‘oprright’ was here
  Oid oprright;
      ^~~~
: recipe for target 'ri_triggers.o' failed

When building the documentation I got two warnings:

/usr/bin/osx:catalogs.sgml:2349:17:W: empty end-tag
/usr/bin/osx:catalogs.sgml:2350:17:W: empty end-tag

When running the tests I got a failure in element_foreign_key.

Andreas
Re: [HACKERS] GnuTLS support
On 09/18/2017 07:04 PM, Jeff Janes wrote:> You fixed the first issue, but I still get the second one: be-secure-gnutls.c: In function 'get_peer_certificate': be-secure-gnutls.c:667: error: 'GNUTLS_X509_CRT_LIST_SORT' undeclared (first use in this function) be-secure-gnutls.c:667: error: (Each undeclared identifier is reported only once be-secure-gnutls.c:667: error: for each function it appears in.) Thanks again for testing the code. I have now rebased the patch and fixed the second issue. I tested that it works on CentOS 6. Work which remains: - sslinfo - pgcrypto - Documentation - Decide if what I did with the config is a good idea Andreas diff --git a/configure b/configure index 4ecd2e1922..1ba34dfced 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -837,6 +838,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1531,6 +1533,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -5997,6 +6000,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? 
"no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10164,6 +10202,94 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char gnutls_init (); +int +main () +{ +return gnutls_init (); + ; + return 0; +} +_ACEOF +for ac_lib in '' gnutls; do + if test -z "$ac_lib"; then +ac_res="none required" + else +ac_res=-l$ac_lib +LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + if ac_fn_c_try_link "$LINENO"; then : + ac_cv_search_gnutls_init=$ac_res +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext + if ${ac_cv_search_gnutls_init+:} false; then : + break +fi +done +if ${ac_cv_search_gnutls_init+:} false; then : + +else + ac_cv_search_gnutls_init=no +fi +rm conftest.$ac_ext +LIBS=$ac_func_search_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5 +$as_echo "$ac_cv_search_gnutls_init" >&6; } +ac_res=$ac_cv_search_gnutls_init +if test "$ac_res" != no; then : + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + +else + as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5 +fi + + # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted + # certificate chains. 
+ ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include +#include + +" +if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl +_ACEOF + + for ac_func in gnutls_pkcs11_set_pin_function +do : + ac_fn_c_check_func "$LINENO" "gnutls_pkcs11_set_pin_function" "ac_cv_func_gnutls_pkcs11_set_pin_function" +if test "x$ac_cv_func_gnutls_pkcs11_set_pin_function" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_GNUTLS_PKCS11_SET_PIN_FUNCTION 1 +_ACEOF + +fi +done + +fi + if test "$with_pam" = yes ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5 $as_echo_n "checking for pam_start in -lpam... " >&6; } @@ -10961,6 +11087,17 @@ else fi +fi + +if test "$with_gnutls" = yes ; then + ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default" +if test
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Sorry for the very late review. I like this feature and have needed it myself in the past, and the current syntax seems pretty good. One could argue for generalizing the syntax to support other things like json and hstore, but I do not think it would be fair to block this patch because of that.

== Limitations of the current design

1) Array element foreign keys can only be specified at the table level (not on columns): I think this limitation is fine. Other PostgreSQL-specific features like exclusion constraints can also only be specified at the table level.

2) Lack of support for SET NULL and SET DEFAULT: these do not seem very useful for arrays.

3) Lack of support for specifying multiple arrays in the foreign key: seems like a good thing to me, since it is not obvious what such a thing would even do.

4) That you need to add a cast to the index if you have different types: due to there not being an int4[] <@ int2[] operator you need to add an index on (col::int4[]) to speed up deletes and updates. This one is annoying since EXPLAIN won't show you the query plans for the foreign key queries, but I do not think fixing this should be within the scope of the patch, and having a smaller integer in the referring table should be rare.

5) The use of count(DISTINCT) requiring types to support btree equality: this has been discussed a lot up-thread and I think the current state is good enough.

== Functional review

I have played around some with it and things seem to work, and the test suite passes, but I noticed a couple of strange behaviors.

1) MATCH FULL does not seem to care about NULLs in arrays. In the example below I expected both inserts into the referring table to fail.
CREATE TABLE t (x int, y int, PRIMARY KEY (x, y));
CREATE TABLE fk (x int, ys int[], FOREIGN KEY (x, EACH ELEMENT OF ys) REFERENCES t MATCH FULL);
INSERT INTO t VALUES (10, 1);
INSERT INTO fk VALUES (10, '{1,NULL}');
INSERT INTO fk VALUES (NULL, '{1}');

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
ERROR:  insert or update on table "fk" violates foreign key constraint "fk_x_fkey"
DETAIL:  MATCH FULL does not allow mixing of null and nonnull key values.

2) To me it was not obvious that ON DELETE CASCADE would delete the whole rows rather than delete the members from the array, and this kind of misunderstanding can lead to pretty bad surprises in production. I am leaning towards not supporting CASCADE.

== The @>> operator

A previous version of your patch added the "anyelement <<@ anyarray" operator to avoid having to build arrays, but that part was reverted due to a bug. I am no expert on the gin code, but as far as I can tell it would be relatively simple to fix that bug. Just allocate an array of Datums of length one where you put the element you are searching for (or maybe a copy of it).

Potential issues with adding the operators:

1) Do we really want to add an operator just for array element foreign keys? I think this is not an issue, since it seems like it should be useful in general. I know I have wanted it myself.

2) I am not sure, but the committers might prefer if adding the operators is done in a separate patch.

3) Bikeshedding about operator names. I personally think @>> is clear enough and as far as I know it is not used for anything else.

== Code review

The patch no longer applies to HEAD, but the conflicts are small.

I think we should be more consistent in the naming, both in code and in the documentation. Right now we have "array foreign keys", "element foreign keys", "ELEMENT foreign keys", etc.

+	/*
+	 * If this is an array foreign key, we must look up the operators for
+	 * the array element type, not the array type itself.
+	 */
+	if (fkreftypes[i] != FKCONSTR_REF_PLAIN)

+	if (fkreftypes[i] != FKCONSTR_REF_PLAIN)
+	{
+		old_fktype = get_base_element_type(old_fktype);
+		/* this shouldn't happen ... */
+		if (!OidIsValid(old_fktype))
+			elog(ERROR, "old foreign key column is not an array");
+	}

+	if (riinfo->fk_reftypes[i] != FKCONSTR_REF_PLAIN)
+	{
+		riinfo->has_array = true;
+		riinfo->ff_eq_oprs[i] = ARRAY_EQ_OP;
+	}

In the three diffs above it would be much cleaner to check for "== FKCONSTR_REF_EACH_ELEMENT", since that better conveys the intent and is safer for adding new types in the future.

+	/* We look through any domain here */
+	fktype = get_base_element_type(fktype);

What does the comment above mean?

		if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop)))
			ereport(ERROR,
					(errcode(ERRCODE_DATATYPE_MISMATCH),
					 errmsg("foreign key constraint \"%s\" "
							"cannot be implemented",
							fkconstraint->conname),
					 errdetail("Key columns \"%s\" and
Re: [HACKERS] git down
On 10/27/2017 10:51 PM, Erik Rijkers wrote:
> git.postgresql.org is down/unreachable
> ( git://git.postgresql.org/git/postgresql.git )

Yes, I noticed this too, but https://git.postgresql.org/git/postgresql.git still works fine. I guess it makes sense to remove unencrypted access, but in that case https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary should not advertise supporting the git protocol. I have not seen any announcement either, but that could just be me not paying enough attention.

Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Here is a rebased version of the patch. Andreas diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index a0ca2851e5..f8c59ea127 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -926,6 +926,7 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), ANALYZE, CREATE INDEX CONCURRENTLY, + REINDEX CONCURRENTLY, CREATE STATISTICS and ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 2e053c4c24..4019bad4c2 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } An index build with the CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be - convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build. To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + convenient to use REINDEX to rebuild them. @@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. + There are several caveats to be aware of when using this option + see . 
+ + + + VERBOSE @@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + + + +Rebuilding an index can interfere with regular operation of a database. +Normally PostgreSQL locks the table whose index is rebuilt +against writes and performs the entire index build with a single scan of the +table. Other transactions can still read the table, but if they try to +insert, update, or delete rows in the table they will block until the +index rebuild is finished. This could have a severe effect if the system is +a live production database. Very large tables can take many hours to be +indexed, and even for smaller tables, an index rebuild can lock out writers +for periods that are unacceptably long for a production system. + + + +PostgreSQL supports rebuilding indexes with minimum locking +of writes. This method is invoked by specifying the +CONCURRENTLY option of REINDEX. When this option +is used, PostgreSQL must perform two scans of the table +for each index that needs to be rebuild and in addition it must wait for +all existing transactions that could potentially use the index to +terminate. This method requires more total work than a standard index +rebuild and takes significantly longer to complete as it needs to wait +for unfinished transactions that might modify the index. However, since +it allows normal operations to continue while the index is rebuilt, this +method is useful for rebuilding indexes in a production environment. Of +course, the extra CPU, memory and I/O load imposed by the index rebuild +may slow down other operations. + + + +The following steps occur in a concurrent index build, each in a separate +transaction except when the new index definitions are created, where all +the concurrent entries are created using only one transaction. 
Note that +if there are multiple indexes to be rebuilt then each step loops through +all the indexes we're rebuilding, using a separate transaction for each one. +REINDEX CONCURRENTLY proceeds as follows when rebuilding +indexes: + + + + + A new temporary index definition is added into the catalog + pg_index. This definition will be used to replace the + old index. This step is done as a single transaction for all the indexes + involved in this process, meaning that if + REINDEX CONCURRENTLY is run on a table with multiple + indexes, all the catalog entries of the new indexes are created within a + single transaction. A SHARE UPDATE EXCLUSIVE lock at + session level is taken on the indexes being reindexed as well as its + parent table to prevent any schema modification while processing. + + + + + A first