Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Andreas Karlsson

On 03/19/2017 07:35 PM, Stephen Frost wrote:

* Tom Lane (t...@sss.pgh.pa.us) wrote:

Stephen Frost  writes:
(Or in other words, we've been getting along fine with these script names
for circa twenty years, so what's the rush to change them RIGHT NOW?)


To be clear, I'm not in any particular rush to change them 'RIGHT NOW'.
I tend to agree with Magnus that we're doing a lot of other things in
PG10 and that makes it a bit of a natural point, but I don't hold that
position terribly strongly.  On the other hand, I do not relish the idea
of providing backwards-compatibility for every user-facing change we do
for 5 years and that's where I feel this approach is encouraging us to
go.


I think that argument is only applicable where the changes are 
closely related, e.g. renaming pg_clog, pg_xlog and pg_log in the same 
release. I do not see any strong connection between createuser and pg_xlog.


As for whether we should have backwards compatibility for the old names, I am 
leaning weakly towards providing it in the case of createuser. I can see end 
users being pissed off that the createuser command is suddenly gone 
without any warning when they upgrade. On the flip side, I have no idea 
how much work it would be to maintain those legacy names.


Andreas



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ICU integration

2017-03-19 Thread Andreas Karlsson

On 03/15/2017 05:33 PM, Peter Eisentraut wrote:

Updated patch attached.


Ok, I applied the patch again and ran the tests. I get a test failure on 
make check-world in the pg_dump tests, but it can be fixed with the diff below.


diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl

index 3cac4a9ae0..7d9c90363b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -2422,7 +2422,7 @@ qr/^\QINSERT INTO test_fifth_table (col1, col2, col3, col4, col5) VALUES (NULL,

  'CREATE COLLATION test0 FROM "C";',
regexp =>
  qr/^
- \QCREATE COLLATION test0 (lc_collate = 'C', lc_ctype = 'C');\E/xm,
+ \QCREATE COLLATION test0 (provider = 'libc', locale = 'C');\E/xm,
like => {
binary_upgrade   => 1,
clean=> 1,


- I do not like how it is not obvious which is the default version of
every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not
"sv_standard%icu" as one might expect. Is this inherent to ICU or
something we can work around?


We get these keywords from ucol_getKeywordValuesForLocale(), which says
"Given a key and a locale, returns an array of string values in a
preferred order that would make a difference."  So all those are
supposedly different from each other.


I believe you are mistaken. The locale "sv" is just an alias for 
"sv-u-standard" as far as I can tell. See the definition of the Swedish 
locale 
(http://source.icu-project.org/repos/icu/trunk/icu4c/source/data/coll/sv.txt) 
and there are just three collations: reformed (default), standard, and 
search (plus eor and emoji, which are inherited).


I am also quite annoyed at col-emoji and col-eor (European Ordering 
Rules). They are defined at the root and inherited by all languages, but 
no language modifies col-emoji for its needs, which makes it a foot 
gun. See the Danish sorting example below, where I at least expected the 
same order. For col-eor it makes a bit more sense, since I would expect 
the locales en_GB-u-col-eor and sv_SE-u-col-eor to collate exactly the same.


# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE "da-x-icu";

 x

 a
 k
 aa
(3 rows)

# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE "da-u-co-emoji-x-icu";

 x

 a
 aa
 k
(3 rows)
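The surprising difference above comes from "aa" being treated as the letter "å" under traditional Danish rules but not under the emoji tailoring. A toy Python sort key (an illustration only, nothing like ICU's actual algorithm) reproduces the "da-x-icu" ordering:

```python
def danish_sort_key(s):
    # Danish traditionally treats "aa" as the letter "å", which sorts
    # last in the alphabet -- after "z", "æ" and "ø".  This toy key
    # models only that one rule; real ICU collation is far more involved.
    s = s.replace("aa", "\u00e5")  # å
    alphabet = "abcdefghijklmnopqrstuvwxyz\u00e6\u00f8\u00e5"  # ...æøå
    return [alphabet.index(c) for c in s]

print(sorted(["a", "k", "aa"], key=danish_sort_key))  # ['a', 'k', 'aa']
print(sorted(["a", "k", "aa"]))                       # ['a', 'aa', 'k']
```

The second, naive sort matches the emoji-tailored output, since that tailoring loses the Danish "aa" rule.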

It seems ICU has made quite the mess here, and I am not sure to what 
degree we need to handle it to avoid our users getting confused. I need 
to think some more about it, and would love input from others. Maybe the 
right thing to do is to ignore the issue with col-emoji, but I would want 
to do something about the default collations.



- ICU talks about a new syntax for locale extensions (old:
de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page
http://userguide.icu-project.org/collation/api. Is this something we
need to care about? It looks like we currently use the old format, and
while I personally prefer it I am not sure we should rely on an old syntax.


Interesting.  I hadn't seen this before, and the documentation is sparse.
 But it seems that this refers to BCP 47 language tags, which seem like
a good idea.

So what I have done is change all the predefined ICU collations to be
named after the BCP 47 scheme, with a "private use" -x-icu suffix
(instead of %icu).  This preserves the original idea but uses a standard
naming scheme.

I'm not terribly worried that they are going to remove the "old" locale
naming, but just to be forward looking, I have changed it so that the
collcollate entries are made using the "new" naming for ICU >=54.


Sounds good.
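For illustration, the old-to-new mapping ICU documents (de_DE@collation=phonebook → de-DE-u-co-phonebk) can be sketched in a few lines. This is a simplified stand-in, not ICU's uloc_toLanguageTag(): it handles only the collation keyword and a few of the documented value aliases (BCP 47 subtags are limited to 8 characters, hence "phonebk"):

```python
def icu_old_to_bcp47(locale_id):
    # Sketch of the old->new ICU locale syntax mapping.  Only the
    # "collation" keyword and a handful of known value aliases are
    # handled; the real conversion covers many more keywords.
    value_alias = {"phonebook": "phonebk", "traditional": "trad",
                   "dictionary": "dict", "gb2312han": "gb2312"}
    base, _, ext = locale_id.partition("@")
    tag = base.replace("_", "-")
    if ext.startswith("collation="):
        value = ext[len("collation="):]
        tag += "-u-co-" + value_alias.get(value, value)
    return tag

print(icu_old_to_bcp47("de_DE@collation=phonebook"))  # de-DE-u-co-phonebk
```

A locale with no extension (e.g. "sv") passes through unchanged, modulo the underscore-to-hyphen swap.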


- I get an error when creating a new locale.

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms


Hmm, that's pretty straightforward code.  What is your operating system?
 What are the build options?  Does it work without this patch?


This issue is unrelated to ICU. I had forgotten to specify the provider, 
so the errors are correct (even though the value of errno is weird).



- For the collprovider is it really correct to say that 'd' is the
default value as it does in catalogs.sgml?


It doesn't say it's the default value, it says it uses the database
default.  This is all a bit confusing.  We have a collation object named
"default", which uses the locale set for the database.  That's been that
way for a while.  Now when introducing the collation providers, that
"default" collation gets its own collprovider category 'd'.  That is not
really used anywhere, but we have to put something there.


Ah, I see now. Hm, that is a bit awkward but I cannot think of a cleaner 
alternative.



- I do not know what the policy for formatting the documentation is, but
some of the paragraphs are in need of re-wrapping.


Hmm, I don't see anything terribly bad.


Maybe it is just me.

Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-15 Thread Andreas Karlsson

Hi,

I got a test failure with this version of the patch in the postgres_fdw 
tests. It looks to me like it was caused by a typo in the source code, 
which is fixed in the attached patch.


After applying this patch check-world passes.

Andreas
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 1763be5cf0..2ba5a2ea69 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -42,7 +42,6 @@ static void TidListCreate(TidScanState *tidstate);
 static int	itemptr_comparator(const void *a, const void *b);
 static TupleTableSlot *TidNext(TidScanState *node);
 
-
 /*
  * Compute the list of TIDs to be visited, by evaluating the expressions
  * for them.
@@ -101,6 +100,7 @@ TidListCreate(TidScanState *tidstate)
 			else if (IsCTIDVar(arg2))
 exprstate = ExecInitExpr((Expr *) linitial(fex->args),
 		  >ss.ps);
+			else
 elog(ERROR, "could not identify CTID variable");
 
 			itemptr = (ItemPointer)



Re: [HACKERS] \h tab-completion

2017-03-15 Thread Andreas Karlsson

On 03/01/2017 02:47 PM, Peter Eisentraut wrote:

Instead of creating another copy of list_ALTER, let's use the
words_after_create list and write a version of
create_command_generator/drop_command_generator.


Good idea. Here is a patch with that.

Andreas

commit 7d691929f5814da87bb8a532e7dcfa2bd1dc9f15
Author: Andreas Karlsson <andr...@proxel.se>
Date:   Fri Feb 3 13:05:48 2017 +0100

Add completion for \help DROP|ALTER

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e8458e939e..3df7636c5b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -982,10 +982,11 @@ typedef struct
 
 #define THING_NO_CREATE		(1 << 0)	/* should not show up after CREATE */
 #define THING_NO_DROP		(1 << 1)	/* should not show up after DROP */
-#define THING_NO_SHOW		(THING_NO_CREATE | THING_NO_DROP)
+#define THING_NO_ALTER		(1 << 2)	/* should not show up after ALTER */
+#define THING_NO_SHOW		(THING_NO_CREATE | THING_NO_DROP | THING_NO_ALTER)
 
 static const pgsql_thing_t words_after_create[] = {
-	{"ACCESS METHOD", NULL, NULL},
+	{"ACCESS METHOD", NULL, NULL, THING_NO_ALTER},
 	{"AGGREGATE", NULL, _for_list_of_aggregates},
 	{"CAST", NULL, NULL},		/* Casts have complex structures for names, so
  * skip it */
@@ -999,6 +1000,7 @@ static const pgsql_thing_t words_after_create[] = {
 	{"CONVERSION", "SELECT pg_catalog.quote_ident(conname) FROM pg_catalog.pg_conversion WHERE substring(pg_catalog.quote_ident(conname),1,%d)='%s'"},
 	{"DATABASE", Query_for_list_of_databases},
 	{"DICTIONARY", Query_for_list_of_ts_dictionaries, NULL, THING_NO_SHOW},
+	{"DEFAULT PRIVILEGES", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
 	{"DOMAIN", NULL, _for_list_of_domains},
 	{"EVENT TRIGGER", NULL, NULL},
 	{"EXTENSION", Query_for_list_of_extensions},
@@ -1006,12 +1008,13 @@ static const pgsql_thing_t words_after_create[] = {
 	{"FOREIGN TABLE", NULL, NULL},
 	{"FUNCTION", NULL, _for_list_of_functions},
 	{"GROUP", Query_for_list_of_roles},
-	{"LANGUAGE", Query_for_list_of_languages},
 	{"INDEX", NULL, _for_list_of_indexes},
+	{"LANGUAGE", Query_for_list_of_languages, NULL, THING_NO_ALTER},
+	{"LARGE OBJECT", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
 	{"MATERIALIZED VIEW", NULL, _for_list_of_matviews},
 	{"OPERATOR", NULL, NULL},	/* Querying for this is probably not such a
  * good idea. */
-	{"OWNED", NULL, NULL, THING_NO_CREATE},		/* for DROP OWNED BY ... */
+	{"OWNED", NULL, NULL, THING_NO_CREATE | THING_NO_ALTER},		/* for DROP OWNED BY ... */
 	{"PARSER", Query_for_list_of_ts_parsers, NULL, THING_NO_SHOW},
 	{"POLICY", NULL, NULL},
 	{"PUBLICATION", Query_for_list_of_publications},
@@ -1021,15 +1024,16 @@ static const pgsql_thing_t words_after_create[] = {
 	{"SEQUENCE", NULL, _for_list_of_sequences},
 	{"SERVER", Query_for_list_of_servers},
 	{"SUBSCRIPTION", Query_for_list_of_subscriptions},
+	{"SYSTEM", NULL, NULL, THING_NO_CREATE | THING_NO_DROP},
 	{"TABLE", NULL, _for_list_of_tables},
 	{"TABLESPACE", Query_for_list_of_tablespaces},
-	{"TEMP", NULL, NULL, THING_NO_DROP},		/* for CREATE TEMP TABLE ... */
+	{"TEMP", NULL, NULL, THING_NO_DROP | THING_NO_ALTER},		/* for CREATE TEMP TABLE ... */
 	{"TEMPLATE", Query_for_list_of_ts_templates, NULL, THING_NO_SHOW},
 	{"TEXT SEARCH", NULL, NULL},
 	{"TRIGGER", "SELECT pg_catalog.quote_ident(tgname) FROM pg_catalog.pg_trigger WHERE substring(pg_catalog.quote_ident(tgname),1,%d)='%s' AND NOT tgisinternal"},
 	{"TYPE", NULL, _for_list_of_datatypes},
-	{"UNIQUE", NULL, NULL, THING_NO_DROP},		/* for CREATE UNIQUE INDEX ... */
-	{"UNLOGGED", NULL, NULL, THING_NO_DROP},	/* for CREATE UNLOGGED TABLE
+	{"UNIQUE", NULL, NULL, THING_NO_DROP | THING_NO_ALTER},		/* for CREATE UNIQUE INDEX ... */
+	{"UNLOGGED", NULL, NULL, THING_NO_DROP | THING_NO_ALTER},	/* for CREATE UNLOGGED TABLE
  * ... */
 	{"USER", Query_for_list_of_roles},
 	{"USER MAPPING FOR", NULL, NULL},
@@ -1042,6 +1046,7 @@ static const pgsql_thing_t words_after_create[] = {
 static char **psql_completion(const char *text, int start, int end);
 static char *create_command_generator(const char *text, int state);
 static char *drop_command_generator(const char *text, int state);
+static char *alter_command_generator(const char *text, int state);
 static char *complete_from_query(const char *text, int state);
 static char *complete_from_schema_query(const char *text, int state);
 static char *_complete_from_query(int is_schema_query,
@@ -1316,6 +1321,17 @@ psql_completion(const char *text, i

Re: [HACKERS] ICU integration

2017-03-14 Thread Andreas Karlsson

On 03/09/2017 10:13 PM, Peter Eisentraut wrote:

- Naming of collations:  Are we happy with the "de%icu" naming?  I might
have come up with that while reviewing the IPv6 zone index patch. ;-)
An alternative might be "de$icu" for more Oracle vibe and avoiding the
need for double quotes in some cases.  (But we have mixed-case names
like "de_AT%icu", so further changes would be necessary to fully get rid
of the need for quoting.)  A more radical alternative would be to
install ICU locales in a different schema and use the search_path, or
even have a separate search path setting for collations only.  Which
leads to ...


I do not like the schema-based solution since search_path already gives 
us enough headaches. As for the naming, I am fine with the current scheme.


It would be nice to have something we did not have to quote, but I do not 
like using dollar signs since they are already used for other things.



- Selecting default collation provider:  Maybe we want a setting, say in
initdb, to determine which provider's collations get the "good" names?
Maybe not necessary for this release, but something to think about.


This does not seem necessary for the initial release.


- Currently (in this patch), we check a collation's version when it is
first used.  But, say, after pg_upgrade, you might want to check all of
them right away.  What might be a good interface for that?  (Possibly,
we only have to check the ones actually in use, and we have dependency
information for that.)


How about adding a SQL level function for checking this which can be 
called by pg_upgrade?


= Review

Here is an initial review. I will try to find some time to do more 
testing later this week.


This is a really useful feature given the poor quality of collation 
support in libc. Just the fact that ICU versions its collations is huge, 
as is the larger range of available collations with high configurability. 
Additionally, being able to use abbreviated keys again would be huge.


ICU also supports writing your own collations and modifying existing 
ones, something which might be possible to expose in the future. In 
general ICU offers a lot for users who care about the details of text 
collation.


== Functional

- Generally things seem to work fine and as expected.

- I get a test failure in the default test suite due to not having the 
tr_TR locale installed. I would assume that this is pretty common 
among hackers.


***
*** 428,443 

  -- to_char
  SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
 to_char
  -
!  01 NIS 2010
  (1 row)

  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr%icu");
 to_char
  -
!  01 NİS 2010
  (1 row)

  -- backwards parsing
--- 428,444 

  -- to_char
  SET lc_time TO 'tr_TR';
+ ERROR:  invalid value for parameter "lc_time": "tr_TR"
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
 to_char
  -
!  01 APR 2010
  (1 row)

  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr%icu");
 to_char
  -
!  01 APR 2010
  (1 row)

  -- backwards parsing

==

- The code no longer compiles without HAVE_LOCALE_T.

- I do not like how it is not obvious which is the default version of 
every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not 
"sv_standard%icu" as one might expect. Is this inherent to ICU or 
something we can work around?


- ICU talks about a new syntax for locale extensions (old: 
de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page 
http://userguide.icu-project.org/collation/api. Is this something we 
need to care about? It looks like we currently use the old format, and 
while I personally prefer it I am not sure we should rely on an old syntax.


- I get an error when creating a new locale.

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms

== Code review

- For the collprovider is it really correct to say that 'd' is the 
default value as it does in catalogs.sgml?


- I do not know what the policy for formatting the documentation is, but 
some of the paragraphs are in need of re-wrapping.


- Add a hint to "ERROR:  conflicting or redundant options". The error 
message is pretty unclear.


- I am not a fan of this patch comparing the encoding with a -1 literal. 
How about adding -1 as a value to the enum? See the example below for a 
place where the patch compares with -1.


ereport(NOTICE,
(errcode(ERRCODE_DUPLICATE_OBJECT),
-errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping",
-   collname, pg_encoding_to_char(collencoding)));
+collencoding == -1
+? errmsg("collation \"%s\" already exists, skipping",
+   

Re: [HACKERS] \h tab-completion

2017-03-16 Thread Andreas Karlsson

On 03/17/2017 12:01 AM, Peter Eisentraut wrote:

Committed with some tweaking.


Thanks!

Andreas





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-04-03 Thread Andreas Karlsson

On 04/03/2017 07:57 AM, Michael Paquier wrote:

On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andr...@proxel.se> wrote:

On 03/31/2017 08:27 AM, Michael Paquier wrote:

- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY the time of the operation is not what matters, it is
how intrusive the operation is. Relations with many indexes would also
result in many object locks taken at each step.


I am personally worried about the amount of time spent waiting for long-running
transactions if you reindex per index rather than per relation. While you
wait on long-running transactions for one index, nothing prevents new
long transactions from starting, which we will then have to wait for while
reindexing the next index. If your database has many long-running
transactions, more time will be spent waiting than working.


Yup, I am not saying that one approach or the other is bad; both are
worth considering. It is a trade-off between waiting and potential manual
cleanup in the event of a failure.


Agreed, and which is worse probably depends heavily on your schema and 
workload.



I am marking this patch as returned with feedback, this won't get in
PG10. If I am freed from the SCRAM-related open items I'll try to give
another shot at implementing this feature before the first CF of PG11.


Thanks! I also think I will have time to work on this before the first CF.

Andreas




Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-12 Thread Andreas Karlsson

On 04/12/2017 04:12 PM, Tom Lane wrote:

1. The best thing would still be to make genbki.pl do the conversion,
and write numeric OIDs into postgres.bki.  The core stumbling block
here seems to be that for most catalogs, Catalog.pm and genbki.pl
never really break down a DATA line into fields --- and we certainly
have got to do that, if we're going to replace the values of regproc
fields.  The places that do need to do that approximate it like this:

# To construct fmgroids.h and fmgrtab.c, we need to inspect some
# of the individual data fields.  Just splitting on whitespace
# won't work, because some quoted fields might contain internal
# whitespace.  We handle this by folding them all to a simple
# "xxx". Fortunately, this script doesn't need to look at any
# fields that might need quoting, so this simple hack is
# sufficient.
$row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
@{$row}{@attnames} = split /\s+/, $row->{bki_values};

We would need a bullet-proof, non-hack, preferably not too slow way to
split DATA lines into fields properly.  I'm one of the world's worst
Perl programmers, but surely there's a way?

2. Alternatively, we could teach bootstrap mode to build a hashtable
mapping proname to OID while it reads pg_proc data from postgres.bki,
and then make regprocin's bootstrap path consult the hashtable instead
of looking directly at the pg_proc file.  That I'm quite sure is do-able,
but it seems like it's leaving money on the table compared to doing
the transformation earlier.

Thoughts?


I looked at this, and option 1 seems simple enough if I am not missing 
something. I might hack something up later tonight. Either way, I think 
this improvement can be done separately from the proposed replacement of 
the catalog header files. Trying to fix everything at once often leads 
to nothing being fixed at all.


Andreas




Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-12 Thread Andreas Karlsson

On 04/12/2017 05:00 PM, Andreas Karlsson wrote:

I looked at this, and option 1 seems simple enough if I am not missing
something. I might hack something up later tonight. Either way I think
this improvement can be done separately from the proposed replacement of
the catalog header files. Trying to fix everything at once often leads
to nothing being fixed at all.


Here is my proof of concept patch. It does basically the same thing as 
Andres's patch except that it handles quoted values a bit better and 
does not try to support anything other than the regproc type.


The patch speeds up initdb without fsync from 0.80 seconds to 0.55 
seconds, which is a nice speedup, while adding a negligible amount of 
extra work on compilation.


Two things which could be improved in this patch if people deem it 
important:


- Refactor the code to be more generic, like Andres patch, so it is 
easier to add lookups for other data types.


- Write something closer to a real .bki parser that actually understands 
quoted values and _null_ when parsing the data. Right now this patch 
only splits each row into its fields (while being aware of quotes) but 
does not attempt to remove quotes, so the PoC patch treats "foo" and foo 
as different.
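The core of the PoC is the quote-aware token split plus a proname → OID map built while pg_proc is processed. A Python translation of the idea (hypothetical data; the real code is Perl in genbki.pl) could look like:

```python
import re

def split_bki_values(line):
    # Split a DATA line into tokens, keeping each double-quoted run as a
    # single token.  Like the PoC patch, quotes are NOT stripped, so
    # "foo" and foo remain distinct tokens.
    return re.findall(r'"[^"]*"|\S+', line)

def resolve_regproc(tokens, regproc_cols, procs):
    # Replace regproc name tokens with their OIDs, leaving numeric OIDs,
    # "-" and _null_ untouched -- mirroring the substitution in the patch.
    return [procs[t]
            if i in regproc_cols and not re.fullmatch(r"\d+|-|_null_", t)
            else t
            for i, t in enumerate(tokens)]

# Hypothetical pg_proc-style map and row; column 1 is a regproc field.
procs = {"boolin": "1242"}
row = split_bki_values('16 boolin "bool in" 11 10')
print(resolve_regproc(row, {1}, procs))
```

The same split regex is what the patch uses in place of the old fold-quoted-fields-to-"xxx" hack in Gen_fmgrtab.pl.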


Andreas
commit bc8ede661b12407c0b6a9e4c0932d21028f87fc1
Author: Andreas Karlsson <andr...@proxel.se>
Date:   Wed Apr 12 21:00:49 2017 +0200

WIP: Resolve regproc entries when generating .bki

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 6e9d57aa8d..f918c9ef8a 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -102,6 +102,7 @@ print $bki "# PostgreSQL $major_version\n";
 # vars to hold data needed for schemapg.h
 my %schemapg_entries;
 my @tables_needing_macros;
+my %procs;
 our @types;
 
 # produce output, one catalog at a time
@@ -164,20 +165,41 @@ foreach my $catname (@{ $catalogs->{names} })
 			$row->{bki_values} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g;
 			$row->{bki_values} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g;
 
+			# Split values into tokens without interpreting their meaning.
+			my %bki_values;
+			@bki_values{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g;
+
+			# Substitute regproc entries with OIDs
+			foreach my $att (keys %bki_values)
+			{
+next if $bki_attr{$att}->{type} ne 'regproc';
+next if $bki_values{$att} =~ /\A(\d+|-|_null_)\z/;
+
+$bki_values{$att} = $procs{$bki_values{$att}};
+			}
+
+			# Save pg_proc oids for use by later catalogs. This relies on the
+			# order we process the files, but the bootstrap code also relies on
+			# pg_proc being loaded first.
+			if ($catname eq 'pg_proc')
+			{
+$procs{$bki_values{proname}} = $row->{oid};
+			}
+
 			# Save pg_type info for pg_attribute processing below
 			if ($catname eq 'pg_type')
 			{
-my %type;
+my %type = %bki_values;
 $type{oid} = $row->{oid};
-@type{@attnames} = split /\s+/, $row->{bki_values};
 push @types, \%type;
 			}
 
 			# Write to postgres.bki
 			my $oid = $row->{oid} ? "OID = $row->{oid} " : '';
-			printf $bki "insert %s( %s)\n", $oid, $row->{bki_values};
+			printf $bki "insert %s( %s)\n", $oid,
+			  join(' ', @bki_values{@attnames});
 
-		   # Write comments to postgres.description and postgres.shdescription
+			# Write comments to postgres.description and postgres.shdescription
 			if (defined $row->{descr})
 			{
 printf $descr "%s\t%s\t0\t%s\n", $row->{oid}, $catname,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 2af9b355e7..10d9c6abc7 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -58,16 +58,9 @@ foreach my $column (@{ $catalogs->{pg_proc}->{columns} })
 my $data = $catalogs->{pg_proc}->{data};
 foreach my $row (@$data)
 {
-
-	# To construct fmgroids.h and fmgrtab.c, we need to inspect some
-	# of the individual data fields.  Just splitting on whitespace
-	# won't work, because some quoted fields might contain internal
-	# whitespace.  We handle this by folding them all to a simple
-	# "xxx". Fortunately, this script doesn't need to look at any
-	# fields that might need quoting, so this simple hack is
-	# sufficient.
-	$row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
-	@{$row}{@attnames} = split /\s+/, $row->{bki_values};
+	# Split values into tokens without interpreting their meaning.
+	# This is safe because we are going to use the values as-is.
+	@{$row}{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g;
 
 	# Select out just the rows for internal-language procedures.
 	# Note assumption here that INTERNALlanguageId is 12.
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index 702924a958..a4237b0661 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@

Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-14 Thread Andreas Karlsson

On 04/14/2017 11:54 PM, Tom Lane wrote:

I failed to resist the temptation to poke at this, and found that
indeed nothing seems to break if we just use one transaction for the
whole processing of postgres.bki.  So I've pushed a patch that does
that.  We're definitely down to the point where worrying about the
speed of bootstrap mode, per se, is useless; the other steps in
initdb visibly take a lot more time.


I looked at this some, and what takes time now for me seems to mainly be 
these four things (out of a total runtime of 560 ms):


1. setup_conversion:140 ms
2. select_default_timezone:  90 ms
3. bootstrap_template1:  80 ms
4. setup_schema: 65 ms

These four take up about two thirds of the total runtime, so it seems 
likely that we may still have relatively low hanging fruit (but not 
worth committing for PostgreSQL 10).


I have not done any profiling of these functions yet, so I am not sure how 
best to fix them, but maybe setup_conversion could be converted into .bki 
entries to speed it up.


Andreas




Re: [HACKERS] OpenSSL 1.1 breaks configure and more

2017-04-16 Thread Andreas Karlsson

On 04/16/2017 03:14 AM, Tom Lane wrote:

1. Back-patch that patch, probably also including the followup adjustments
  in 86029b31e and 36a3be654.

2. Add #if's to use 31cf1a1a4's coding with OpenSSL >= 1.1, while keeping
   the older code for use when built against older OpenSSLs.

3. Conditionally disable renegotiation altogether with OpenSSL >= 1.1,
   thus adopting 9.5 not 9.4 behavior when using newer OpenSSL.

[...]

Thoughts?


Given that I cannot recall seeing any complaints about the behavior of 
9.4 compared to 9.3 I am leaning towards #1. That way there are fewer 
different versions of our OpenSSL code.


Andreas




Re: [HACKERS] rename pg_log directory?

2017-03-10 Thread Andreas Karlsson

On 03/09/2017 11:25 PM, Bruce Momjian wrote:

"data" and "base" where chosen because it is a "data-base", but with the
pg_ prefixes it would be a pg_data_pg_base.  ;-)


Haha, I had not spotted that one despite always naming my data directory 
"data" while developing. Fun little tidbit there.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-08 Thread Andreas Karlsson

On 03/08/2017 03:48 AM, Robert Haas wrote:

On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andr...@proxel.se> wrote:

And I would argue that this feature is useful for quite many people, based on my
experience running a semi-large database. Index bloat happens and without
REINDEX CONCURRENTLY it can be really annoying to solve, especially for
primary keys. Certainly more people have problems with index bloat than the
number of people who store index oids in their database.


Yeah, but that's not the only wart, I think.


The only two potential issues I see with the patch are:

1) That the index oid changes visibly to external users.

2) That the code for moving the dependencies will need to be updated 
when adding new things which refer to an index oid.


Given how useful I find REINDEX CONCURRENTLY I think these warts are 
worth it given that the impact is quite limited. I am of course biased 
since if I did not believe this I would not pursue this solution in the 
first place.



For example, I believe
(haven't looked at this patch series in a while) that the patch takes
a lock and later escalates the lock level.  If so, that could lead to
doing a lot of work to build the index and then getting killed by the
deadlock detector.


This version of the patch no longer does that. For my use case, 
escalating the lock would make this patch much less interesting. The 
highest lock level taken is the same as the initial one (SHARE 
UPDATE EXCLUSIVE). At a high level, the current patch does (very 
simplified) this:


1. CREATE INDEX CONCURRENTLY ind_new;
2. Atomically move all dependencies from ind to ind_new, rename ind to 
ind_old, and rename ind_new to ind.

3. DROP INDEX CONCURRENTLY ind_old;

The actual implementation is a bit more complicated in reality, but no 
part escalates the lock level over what would be required by the steps 
for creating and dropping indexes concurrently.



Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.  I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.


Hm, I cannot think of any real life scenario where this will be an issue 
based on my personal experience with PostgreSQL, but if you can think of 
one please provide it. I will try to ponder some more on this myself.


Andreas




Re: [HACKERS] adding an immutable variant of to_date

2017-03-08 Thread Andreas Karlsson

On 03/07/2017 09:56 PM, Sven R. Kunze wrote:

On 07.03.2017 03:21, Andreas Karlsson wrote:

1) I do not think we currently allow setting the locale like this
anywhere, so this will introduce a new concept to PostgreSQL. And you
will probably need to add support for caching per locale.


Good to know. Could you explain what you mean by "caching per locale"?


The current code for to_char will, on the first call, build arrays with 
the localized names of the weekdays and the months. I suspect that you 
may need to build something similar, but with a set of arrays per locale.


See the DCH_to_char function and its call to cache_locale_time.
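For reference, the names cached by cache_locale_time are the ones used 
by the TM ("translation mode") prefix in to_char format patterns, which 
follows the lc_time setting. The output depends on which locales are 
installed on the system, so the result below is only illustrative:

```sql
SET lc_time = 'de_DE.UTF-8';  -- assumes this locale is installed
SELECT to_char(date '2017-03-07', 'TMDay, DD. TMMonth YYYY');
-- e.g. 'Dienstag, 07. März 2017'
```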

Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-12 Thread Andreas Karlsson

On 03/13/2017 03:11 AM, Andreas Karlsson wrote:

I also fixed the code to properly support triggers.


And by "support triggers" I actually meant fixing the support for moving 
the foreign keys to the new index.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-12 Thread Andreas Karlsson

On 03/02/2017 03:10 AM, Michael Paquier wrote:

On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andr...@proxel.se> wrote:
+/*
+ * Copy constraint flags from the old index. This is safe because the old
+ * index guaranteed uniqueness.
+ */
+newIndexForm->indisprimary = oldIndexForm->indisprimary;
+oldIndexForm->indisprimary = false;
+newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+oldIndexForm->indisexclusion = false;
[...]
+deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+RelationRelationId, DEPENDENCY_AUTO);
+deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+ConstraintRelationId,
DEPENDENCY_INTERNAL);
+
+// TODO: pg_depend for old index?


Spotted one of my TODO comments there, so I have attached a patch where 
I have cleaned up that function. I also fixed the code to properly 
support triggers.



There is a lot of mumbo-jumbo in the patch to create the exact same
index definition as the original one being reindexed, and that's a
huge maintenance burden for the future. You can blame me for that in
the current patch. I am wondering if it would not just be better to
generate a CREATE INDEX query string and then use the SPI to create
the index, and also do the following extensions at SQL level:
- Add a sort of WITH NO DATA clause where the index is created, so the
index is created empty, and is marked invalid and not ready.
- Extend pg_get_indexdef_string() with an optional parameter to
enforce the index name to something else, most likely it should be
extended with the WITH NO DATA/INVALID clause, which should just be a
storage parameter by the way.
By doing something like that what the REINDEX CONCURRENTLY code path
should just be careful about is that it chooses an index name that
avoids any conflicts.


Hm, I am not sure how much that would help since a lot of the 
mumbo-jumbo is by necessity in the step where we move the constraints 
over from the old index to the new one.


Andreas
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 
  Acquired by VACUUM (without FULL),
- ANALYZE, CREATE INDEX CONCURRENTLY, and
+ ANALYZE, CREATE INDEX CONCURRENTLY,
+ REINDEX CONCURRENTLY,
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..3449c0af73 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
   convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  REINDEX will perform a concurrent build if 
+  CONCURRENTLY is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY
+  command. Indexes of toast relations can be rebuilt with REINDEX
+  CONCURRENTLY.
  
 
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 

+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+
+   
 VERBOSE
 
  
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+
+   
+Rebuilding an index can interfere with regular operation of a database.
+Normally PostgreSQL locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until

Re: [HACKERS] \h tab-completion

2017-03-13 Thread Andreas Karlsson

On 03/13/2017 03:56 PM, David Steele wrote:

Do you know when you will have a new patch available for review that
incorporates Peter's request?


I believe I will find the time to finish it within the next couple of days.

Andreas





Re: [HACKERS] rename pg_log directory?

2017-03-06 Thread Andreas Karlsson

On 03/01/2017 05:49 AM, Peter Eisentraut wrote:

On 2/27/17 09:51, Tom Lane wrote:

No objection to the basic point, but "log" seems perhaps a little too
generic to me.  Would something like "server_log" be better?


Well, "log" is pretty well established.  There is /var/log, and if you
unpack a, say, Kafka or Cassandra distribution, they also come with a
log or logs directory.


+1, though I am also fine with server_log.

Andreas





Re: [HACKERS] rename pg_log directory?

2017-03-06 Thread Andreas Karlsson

On 02/27/2017 03:05 PM, Peter Eisentraut wrote:

How about changing the default for log_directory from 'pg_log' to, say,
'log'?


I have attached a patch which does this. I do not care much about which 
name is picked as long as we get rid of the "pg_" prefix.


Btw, is there a reason why global and base do not have the "pg_" prefix?

Andreas
commit 0b71fcdb328f05349775675e0491ba1b82127d4e
Author: Andreas Karlsson <andr...@proxel.se>
Date:   Mon Mar 6 23:52:49 2017 +0100

Rename default log directory from pg_log to log

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cd82c04b05..4ee1f605bc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4324,8 +4324,8 @@ SELECT * FROM parent WHERE key = 2400;
 find the logs currently in use by the instance. Here is an example of
 this file's content:
 
-stderr pg_log/postgresql.log
-csvlog pg_log/postgresql.csv
+stderr log/postgresql.log
+csvlog log/postgresql.csv
 
 
 current_logfiles is recreated when a new log file
@@ -4427,7 +4427,7 @@ local0.*/var/log/postgresql
 cluster data directory.
 This parameter can only be set in the postgresql.conf
 file or on the server command line.
-The default is pg_log.
+The default is log.

   
  
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml
index 309a303e03..359f222352 100644
--- a/doc/src/sgml/file-fdw.sgml
+++ b/doc/src/sgml/file-fdw.sgml
@@ -262,7 +262,7 @@ CREATE FOREIGN TABLE pglog (
   location text,
   application_name text
 ) SERVER pglog
-OPTIONS ( filename '/home/josh/9.1/data/pg_log/pglog.csv', format 'csv' );
+OPTIONS ( filename '/home/josh/9.1/data/log/pglog.csv', format 'csv' );
 
   
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0707f66631..69b9cdacff 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3298,7 +3298,7 @@ static struct config_string ConfigureNamesString[] =
 			GUC_SUPERUSER_ONLY
 		},
		&Log_directory,
-		"pg_log",
+		"log",
 		check_canonical_path, NULL, NULL
 	},
 	{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 157d775853..e6dbc31591 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -343,7 +343,7 @@
 	# (change requires restart)
 
 # These are only used if logging_collector is on:
-#log_directory = 'pg_log'		# directory where log files are written,
+#log_directory = 'log'			# directory where log files are written,
 	# can be absolute or relative to PGDATA
 #log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'	# log file name pattern,
 	# can include strftime() escapes
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e5cb348f4c..1c87e39e0d 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -560,7 +560,7 @@ sub _backup_fs
 		$backup_path,
 		filterfn => sub {
 			my $src = shift;
-			return ($src ne 'pg_log' and $src ne 'postmaster.pid');
+			return ($src ne 'log' and $src ne 'postmaster.pid');
 		});
 
 	if ($hot)
diff --git a/src/test/perl/RecursiveCopy.pm b/src/test/perl/RecursiveCopy.pm
index 3e98813286..457488ba5b 100644
--- a/src/test/perl/RecursiveCopy.pm
+++ b/src/test/perl/RecursiveCopy.pm
@@ -48,9 +48,9 @@ attempted.
 
  RecursiveCopy::copypath('/some/path', '/empty/dir',
 filterfn => sub {
-		# omit pg_log and contents
+		# omit log and contents
 		my $src = shift;
-		return $src ne 'pg_log';
+		return $src ne 'log';
 	}
  );
 



Re: [HACKERS] adding an immutable variant of to_date

2017-03-06 Thread Andreas Karlsson

On 03/03/2017 10:41 PM, Sven R. Kunze wrote:

What do you think?


I have some thoughts:

1) I do not think we currently allow setting the locale like this 
anywhere, so this will introduce a new concept to PostgreSQL. And you 
will probably need to add support for caching per locale.


2) As far as I can tell from reading the code, to_date currently ignores 
the M suffix which indicates that you want localized month/day names, so 
I guess that to_date is currently immutable. Maybe it is stable due to 
the idea that we may want to support the M suffix in the future.


3) I do not like the to_date function. It is much too forgiving with 
invalid input. For example, 2017-02-30 becomes 2017-03-02. Also, just 
ignoring the M suffix in the format string seems pretty bad.
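The forgiving behavior complained about here can be seen directly (this 
reflects the behavior at the time of writing; later releases have 
tightened the validation):

```sql
-- An impossible calendar date is silently rolled over instead of
-- being rejected with an error.
SELECT to_date('2017-02-30', 'YYYY-MM-DD');
-- returns 2017-03-02
```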


Personally I would rather see a new date parsing function which is 
easier to work with or somehow fix to_date without pissing too many 
users off, but I have no idea if this is a view shared with the rest of 
the community.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-05 Thread Andreas Karlsson

On 03/05/2017 07:56 PM, Robert Haas wrote:

On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund  wrote:

I agree that'd it be nicer not to have this, but not having the feature at all 
is a lot worse than this wart.


I, again, give that a firm "maybe".  If the warts end up annoying 1%
of the users who try to use this feature, then you're right.  If they
end up making a substantial percentage of people who try to use this
feature give up on it, then we've added a bunch of complexity and
future code maintenance for little real gain.  I'm not ruling out the
possibility that you're 100% correct, but I'm not nearly as convinced
of that as you are.


I agree that these warts are annoying, but I also think the limitations 
can be explained pretty easily to users (e.g. by describing in the 
manual how REINDEX CONCURRENTLY creates a new index to replace the old 
one), and I think that is the important question when deciding whether a 
useful feature with warts should be merged: does it make things 
substantially harder for the average user to grok?


And I would argue that this feature is useful for quite a few people, based on my 
experience running a semi-large database. Index bloat happens and 
without REINDEX CONCURRENTLY it can be really annoying to solve, 
especially for primary keys. Certainly more people have problems with 
index bloat than the number of people who store index oids in their 
database.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-02 Thread Andreas Karlsson

On 03/02/2017 02:25 AM, Jim Nasby wrote:

On 2/28/17 11:21 AM, Andreas Karlsson wrote:

The only downside I can see to this approach is that we will no longer 
be able to reindex catalog tables concurrently, but in return it should 
be easier to confirm that this approach can be made to work.


Another downside is any stored regclass fields will become invalid.
Admittedly that's a pretty unusual use case, but it'd be nice if there
was at least a way to let users fix things during the rename phase
(perhaps via an event trigger).


Good point, but I agree with Andres here. Having REINDEX CONCURRENTLY 
issue event triggers seems strange to me. While it does create and drop 
indexes as part of its implementation, it is actually just an index 
maintenance job.


Andreas




Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-14 Thread Andreas Karlsson

On 04/13/2017 06:13 PM, Tom Lane wrote:

I've pushed this with some mostly-cosmetic adjustments:


Thanks! I like your adjustments.


There's certainly lots more that could be done in the genbki code,
but I think all we can justify at this stage of the development
cycle is to get the low-hanging fruit for testing speedups.


Yeah, I also noticed that the genbki code seems to have gotten little 
love and that much more can be done here.


Andreas




Re: [HACKERS] Self-signed certificate instructions

2017-04-17 Thread Andreas Karlsson

On 04/15/2017 03:58 PM, Andrew Dunstan wrote:

The instructions on how to create a self-signed certificate in section 
18.9.3 of the docs seem unduly cumbersome.


+1, I see no reason for us to spread unnecessarily complicated instructions.

Andreas




Re: [HACKERS] PG 10 release notes

2017-04-24 Thread Andreas Karlsson

On 04/25/2017 03:31 AM, Bruce Momjian wrote:

I have committed the first draft of the Postgres 10 release notes.  They
are current as of two days ago, and I will keep them current.  Please
give me any feedback you have.


This item is incorrectly attributed to me. I was only the reviewer, 
Peter is the author.


+
+
+
+Create a <link linkend="catalog-pg-sequence">pg_sequence</link> system 
catalog to store sequence metadata (Andreas

+Karlsson)
+

Andreas




Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-06 Thread Andreas Karlsson

On 08/04/2017 08:48 PM, Shay Rojansky wrote:

On 2017-08-04 07:22:42 +0300, Shay Rojansky wrote:
> I'm still not convinced of the risk/problem of simply setting the session
> id context as I explained above (rather than disabling the optimization),
> but of course either solution resolves my problem.

How would that do anything? Each backend has its own local
memory. I.e. any cache state that openssl would maintain wouldn't be
useful. If you want to take advantage of features around this you really
need to cache tickets in shared memory...

Guys, there's no data being cached at the backend - RFC5077 is about 
packaging information into a client-side opaque session ticket that 
allows skipping a roundtrip on the next connection. As I said, simply 
setting the session id context (*not* the session id or anything else) 
makes this feature work, even though a completely new backend process is 
launched.


Yes, session tickets are encrypted data which is stored by the client. 
But if we are going to support them I think we should do it properly 
with new GUCs for the key file and disabling the feature. Using a key 
file is less necessary for PostgreSQL than for a web server since it is 
less common to do round robin load balancing between different 
PostgreSQL instances.


Andreas




Re: [HACKERS] CTE inlining

2017-05-03 Thread Andreas Karlsson

On 05/03/2017 07:33 PM, Alvaro Herrera wrote:

1) we switch unmarked CTEs as inlineable by default in pg11.  What seems
likely to happen for a user that upgrades to pg11 is that 5 out of 10
CTE-using queries are going to become faster than with pg10, and they
are going to be happy; 4 out of five are going to see no difference, but
they didn't have to do anything about it; and the remaining query is
going to become slower, either indistinguishably so (in which case they
don't care and they remain happy because of the other improvements) or
notably so, in which case they can easily figure where to add the
MATERIALIZED option and regain the original performance.


2) unmarked CTEs continue to be an optimization barrier, but we add
"WITH INLINED" so that they're inlineable.  Some users may wonder about
it and waste a lot of time trying to figure out which CTEs to add it to.
They see a benefit in half the queries, which makes them happy, but they
are angry that they had to waste all that time on the other queries.


3) We don't do anything, because we all agree that GUCs are not
suitable.  No progress.  No anger, but nobody is happy either.


+1 for option 1. And while I would not like it if we had to combine it 
with a backwards-compatibility GUC that enables the old behavior in 
order to get it merged, I would still personally prefer that over 
options 2 and 3.


Andreas




Re: [HACKERS] oidin / oidout and InvalidOid

2017-06-12 Thread Andreas Karlsson

On 06/12/2017 01:41 PM, Chapman Flack wrote:

I was recently guilty of writing a less-than-clear SQL example
because I plain forgot whether InvalidOid was 0 (correct) or -1
(my bad).

Would there be any sense in letting oidin_subr accept the string
InvalidOid for 0? I understand that changing oidout could break
existing code outside of the tree. But what if oidout were to be
conservative in what it does, and oidin liberal in what it accepts?


I am not sure I am a fan of this, but if we should have an alias for 
InvalidOid how about reusing '-' which is used by the reg*in functions? 
Or is that too non-obvious?


Andreas




Re: [HACKERS] CTE inlining

2017-05-01 Thread Andreas Karlsson

On 05/01/2017 04:33 PM, David G. Johnston wrote:
> On Mon, May 1, 2017 at 7:26 AM, Andreas Karlsson <andr...@proxel.se
> I am not sure I like decorators since this means adding an ad hoc
> query hint directly into the SQL syntax which is something that
> requires serious consideration.
>
> ​Given that we already have
> ​"​
> prevent optimization
> ​"​
> syntax why do we need a decorator on the CTE?

I do not think I follow. Me and some other people here would ideally 
allow CTEs to be inlined by default. Some people today use CTEs as 
optimization fences, to for example control join order, and the 
suggestion here is to add new syntax for CTEs to allow them to 
selectively be used as optimization fences.


> ​I would shorten that to "WITH MAT" except that I don't think that
> having two way to introduce an optimization fence is worthwhile.

You mean OFFSET 0? I have never been a fan of using it as an 
optimization fence. I do not think OFFSET 0 conveys clearly enough to 
the reader that it is an optimization fence.
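For readers unfamiliar with the trick: a subquery carrying OFFSET 0 is 
not flattened into the outer query by the planner, so it acts as an ad 
hoc fence. Table and column names below are illustrative:

```sql
-- Without the fence, the subquery may be flattened and the outer
-- qual pushed down past expensive_fn():
SELECT * FROM (SELECT * FROM t WHERE expensive_fn(x)) AS sub
WHERE y = 1;

-- OFFSET 0 keeps the subquery as a separate plan node, so its filter
-- is applied before the outer qual:
SELECT * FROM (SELECT * FROM t WHERE expensive_fn(x) OFFSET 0) AS sub
WHERE y = 1;
```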


Andreas




Re: [HACKERS] CTE inlining

2017-05-01 Thread Andreas Karlsson

On 05/01/2017 04:17 PM, David Fetter wrote:

Maybe we could allow a "decorator" that would tell the planner the CTE
could be inlined?

WITH INLINE mycte AS ( ...)


+1 for a decorator, -1 for this one.


I am not sure I like decorators, since this means adding an ad hoc 
query hint directly into the SQL syntax, which is something that 
requires serious consideration.



We already have an explicit optimization fence with OFFSET 0, and I
think making optimization fences explicit is how we should continue.
I'd be more in favor of something along the lines of

WITH FENCED/* Somewhat fuzzy.  What fence? */
or
WITH AT_MOST_ONCE  /* Clearer, but not super precise */
or
WITH UNIQUE_ATOMIC /* More descriptive, but not super clear without the 
docs in hand */

or something along that line.


What about WITH MATERIALIZED, borrowing from the MySQL terminology 
"materialized subquery"?


Andreas




Re: [HACKERS] CTE inlining

2017-05-02 Thread Andreas Karlsson

On 05/02/2017 04:38 AM, Craig Ringer wrote:

On 1 May 2017 at 22:26, Andreas Karlsson <andr...@proxel.se> wrote:

I am not sure I like decorators since this means adding an ad hoc query hint
directly into the SQL syntax which is something that requires serious
consideration.


And mangling the semantics of existing syntax doesn't?

That's what we do right now so we can pretend we don't have query
hints while still having query hints.


I am in favor of removing the optimization fence from CTEs, and strongly 
prefer no fence being the default behavior since SQL is a declarative 
language and I think it is reasonable to assume that CTEs can be 
inlined. But the question is how best to remove the fence while taking 
into account that quite a few people use them as optimization fences today.


I see some alternatives, none of them perfect.

1. Just remove the optimization fence and let people add OFFSET 0 to 
their queries if they want an optimization fence. This lets us keep 
pretending that we do not have query hints (and therefore do not have to 
formalize any syntax for them) while still allowing people to add 
optimization fences.


2. Add a decorator for WITH (e.g. "WITH MATERIALIZED x (...") to add an 
explicit optimization fence. This will for the first time add official 
support for a query hint in the syntax which is a quite big precedent.


3. Add a new GUC which can enable and disable the optimization fence. 
This is a very clumsy tool, but maybe good enough for some users and 
some people here in this thread have complained about our similar GUCs.


4. Add some new more generic query hinting facility. This is a lot of 
work and something which would be very hard to get consensus for.


Andreas




Re: [HACKERS] CTE inlining

2017-05-04 Thread Andreas Karlsson

On 05/04/2017 06:22 PM, Andrew Dunstan wrote:

I wrote this query:

select (json_populate_record(null::mytype, myjson)).*
from mytable;


It turned out that this was an order of magnitude faster:

with r as
(
   select json_populate_record(null::mytype, myjson) as x
   from mytable
)
select (x).*
from r;


I do not know the planner that well, but I imagined that when we remove 
the optimization fence this query would be evaluated similarly to a 
lateral join, i.e. there would be no extra function calls in this case 
after removing the fence.


Andreas




Re: [HACKERS] postgres_fdw super user checks

2017-09-16 Thread Andreas Karlsson
On 09/14/2017 08:33 PM, Jeff Janes wrote:

Attached is a new patch which fixes the style issue you mentioned.


Thanks, the patch looks good now, and as far as I can tell there was no 
need to update the comments or the documentation, so I am setting this 
as ready for committer.


Andreas




Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

2017-09-19 Thread Andreas Karlsson

On 09/19/2017 11:32 PM, Peter Geoghegan wrote:

On Tue, Sep 19, 2017 at 2:22 PM, Tom Lane  wrote:

Well, if PG10 shipped with that restriction in place then it wouldn't
be an issue ;-)


I was proposing that this be treated as an open item for v10; sorry if
I was unclear on that. Much like the "ICU locales vs. ICU collations
within pg_collation" issue, this seems like the kind of thing that we
ought to go out of our way to get right in the *first* version.


If people think it is possible to get this done in time for PostgreSQL 
10 and it does not break anything on older version of ICU (or the 
migration from older versions) I am all for it.


Andreas




Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

2017-09-19 Thread Andreas Karlsson
On 09/19/2017 12:46 AM, Peter Geoghegan wrote:

At one point a couple of months back, it was understood that

get_icu_language_tag() might not always work with (assumed) valid
locale names -- that is at least the impression that the commit
message of eccead9 left me with. But, that was only with ICU 4.2, and
in any case we've since stopped creating keyword variants at initdb
time for other reasons (see 2bfd1b1 for details of those other
reasons). I tend to think that we should not install any language tag
that uloc_toLanguageTag() does not accept as valid on general
principle (so not just at initdb time, when it's actually least
needed).

Thoughts? I can write a patch for this, if that helps. It should be
straightforward.


Hm, I like the idea but I see some issues.

Enforcing the BCP 47 syntax seems like a good thing to me. I do not see 
any reason to allow input with syntax errors. The issue, though, is that 
we do not want to break people's databases when they upgrade to 
PostgreSQL 11. What if they have specified the locale in the old non-ICU 
format, or they have a bogus value, and we then error out in pg_upgrade 
or pg_restore?
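For context, the two locale spellings under discussion look roughly 
like this; the exact keywords accepted depend on the ICU version, so 
treat these as illustrative:

```sql
-- BCP 47 language tag, the format the proposal would enforce:
CREATE COLLATION german_phonebook
    (provider = icu, locale = 'de-u-co-phonebk');

-- Older ICU-style locale string that existing databases may contain:
CREATE COLLATION german_phonebook_old
    (provider = icu, locale = 'de@collation=phonebook');
```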


Andreas




Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

2017-09-22 Thread Andreas Karlsson

On 09/21/2017 10:55 PM, Peter Geoghegan wrote:

I agree, but the bigger issue is that we're *half way* between
supporting only one format, and supporting two formats. AFAICT, there
is no reason that we can't simply support one format on all ICU
versions, and keep what ends up within pg_collation at initdb time
essentially the same across ICU versions (except for those that are
due to cultural/political developments).


I think we are in agreement then, but I do not have the time to get this 
done before the release of 10, so I would be happy if you implemented 
it. Peter E, what do you say about this?


Andreas




Re: [HACKERS] GnuTLS support

2017-09-17 Thread Andreas Karlsson

On 09/15/2017 06:55 PM, Jeff Janes wrote:

I can't build against gnutls-2.12.23-21.el6.x86_64 from CentOS 6.9


Thanks for testing my patch. I have fixed both of these issues plus some 
of the other feedback. A new version of my patch is attached which 
should, at least in theory, support all GnuTLS versions >= 2.11.


I just very quickly fixed the broken SSL tests, as I am no fan of how 
the SSL tests currently are written and think they should be cleaned up.


Andreas
diff --git a/configure b/configure
index 0d76e5ea42..33b1f00bff 100755
--- a/configure
+++ b/configure
@@ -709,6 +709,7 @@ UUID_EXTRA_OBJS
 with_uuid
 with_systemd
 with_selinux
+with_gnutls
 with_openssl
 krb_srvtab
 with_python
@@ -838,6 +839,7 @@ with_bsd_auth
 with_ldap
 with_bonjour
 with_openssl
+with_gnutls
 with_selinux
 with_systemd
 with_readline
@@ -1534,6 +1536,7 @@ Optional Packages:
   --with-ldap build with LDAP support
   --with-bonjour  build with Bonjour support
   --with-openssl  build with OpenSSL support
+  --with-gnutls   build with GnuTLS support
   --with-selinux  build with SELinux support
   --with-systemd  build with systemd support
   --without-readline  do not use GNU Readline nor BSD Libedit for editing
@@ -6051,6 +6054,41 @@ fi
 $as_echo "$with_openssl" >&6; }
 
 
+#
+# GnuTLS
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5
+$as_echo_n "checking whether to build with GnuTLS support... " >&6; }
+
+
+
+# Check whether --with-gnutls was given.
+if test "${with_gnutls+set}" = set; then :
+  withval=$with_gnutls;
+  case $withval in
+yes)
+
+$as_echo "#define USE_GNUTLS 1" >>confdefs.h
+
+  ;;
+no)
+  :
+  ;;
+*)
+  as_fn_error $? "no argument expected for --with-gnutls option" "$LINENO" 5
+  ;;
+  esac
+
+else
+  with_gnutls=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5
+$as_echo "$with_gnutls" >&6; }
+
+
 #
 # SELinux
 #
@@ -10218,6 +10256,83 @@ done
 
 fi
 
+if test "$with_gnutls" = yes ; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5
+$as_echo_n "checking for library containing gnutls_init... " >&6; }
+if ${ac_cv_search_gnutls_init+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char gnutls_init ();
+int
+main ()
+{
+return gnutls_init ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' gnutls; do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_search_gnutls_init=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext
+  if ${ac_cv_search_gnutls_init+:} false; then :
+  break
+fi
+done
+if ${ac_cv_search_gnutls_init+:} false; then :
+
+else
+  ac_cv_search_gnutls_init=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5
+$as_echo "$ac_cv_search_gnutls_init" >&6; }
+ac_res=$ac_cv_search_gnutls_init
+if test "$ac_res" != no; then :
+  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+else
+  as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5
+fi
+
+  # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted
+  # certificate chains.
+  ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include 
+#include 
+
+"
+if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then :
+  ac_have_decl=1
+else
+  ac_have_decl=0
+fi
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl
+_ACEOF
+
+fi
+
 if test "$with_pam" = yes ; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5
 $as_echo_n "checking for pam_start in -lpam... " >&6; }
@@ -11015,6 +11130,17 @@ else
 fi
 
 
+fi
+
+if test "$with_gnutls" = yes ; then
+  ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default"
+if test "x$ac_cv_header_gnutls_gnutls_h" = xyes; then :
+
+else
+  as_fn_error $? "header file  is required for GnuTLS" "$LINENO" 5
+fi
+
+
 fi
 
 if test "$with_pam" = yes ; then
@@ -15522,9 +15648,11 @@ fi
 # in the template or configure command line.
 
 # If not selected manually, try to select a source automatically.
-if test "$enable_strong_random" = "yes" && test x"$USE_OPENSSL_RANDOM" = x"" && test x"$USE_WIN32_RANDOM" = x"" && test x"$USE_DEV_URANDOM" = x"" ; then
+if test "$enable_strong_random" = "yes" && test 

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-09-17 Thread Andreas Karlsson
I have not looked at the issue with the btree_gin tests yet, but here is 
the first part of my review.


= Review

This is my first quick review where I just read the documentation and 
quickly tested the feature. I will review it more in-depth later.


This is a very useful feature, one which I have a long time wished for.

The patch applies, compiles and passes the test suite with just one warning.

parse_coerce.c: In function ‘select_common_type_2args’:
parse_coerce.c:1379:7: warning: statement with no effect [-Wunused-value]
   rightOid;
   ^~~~

= Functional

The documentation does not agree with the code on the syntax. The 
documentation claims it is "FOREIGN KEY (ELEMENT xs) REFERENCES t1 (x)" 
when it actually is "FOREIGN KEY (EACH ELEMENT OF xs) REFERENCES t1 (x)".


Likewise I can't get the "final_positions integer[] ELEMENT REFERENCES 
drivers" syntax to work, but here I cannot see any change in the syntax 
to support it.


Related to the above: I am not sure if it is a good idea to make ELEMENT 
a reserved word in column definitions. What if the SQL standard wants to 
use it for something?


The documentation claims ON DELETE CASCADE is not supported by array 
element foreign keys, but I do not think that is actually the case.


I think I prefer (EACH ELEMENT OF xs) over (ELEMENT xs), since the 
former feels more in the spirit of SQL. And if so we should 
match it as "xs integer[] EACH ELEMENT REFERENCES t1 (x)", assuming we 
want that syntax.


Once I have created an array element foreign key the basic features seem 
to work as expected.


The error message below fails to mention that it is an array element 
foreign key, but I do not think that is a blocker for getting this 
feature merged. Right now I cannot think of how to improve it either.


$ INSERT INTO t3 VALUES ('{1,3}');
ERROR:  insert or update on table "t3" violates foreign key constraint 
"t3_xs_fkey"

DETAIL:  Key (xs)=({1,3}) is not present in table "t1".

= Nitpicking/style comments

In doc/src/sgml/catalogs.sgml the 
"conpfeqop" line is 
incorrectly indented.


I am not a fan of calling it "array-vs-scalar". What about array to scalar?

In ddl.sgml date should be lower case like the other types in "race_day 
DATE,".


In ddl.sgml I suggest removing the "..." from the examples to make it 
possible to copy paste them easily.


Your text wrapping in ddl.sgml and create_table.sgml is quite 
arbitrary. I suggest wrapping all paragraphs at 80 characters (except 
for code, which should not be wrapped). Your text editor probably has 
tools for wrapping paragraphs.


Please be consistent about how you write table names and SQL in general. 
I think almost all places use lower case for table names, while your 
examples in create_table.sgml are FKTABLEFORARRAY.


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

2017-09-21 Thread Andreas Karlsson

On 09/21/2017 01:40 AM, Peter Geoghegan wrote:

On Wed, Sep 20, 2017 at 4:08 PM, Peter Geoghegan  wrote:

pg_import_system_collations() takes care to use the non-BCP-47 style for
such versions, so I think this is working correctly.


But CREATE COLLATION doesn't use pg_import_system_collations().


And perhaps more to the point: it highly confusing that we use one or
the other of those 2 things ("langtag"/BCP 47 tag or "name"/legacy
locale name) as "colcollate", depending on ICU version, thereby
*behaving* as if ICU < 54 really didn't know anything about BCP 47
tags. Because, obviously earlier ICU versions know plenty about BCP
47, since 9 lines further down we use "langtag"/BCP 47 tag as collname
for CollationCreate() -- regardless of ICU version.

How can you say "ICU <54 doesn't even support the BCP 47 style", given
all that? Those versions will still have locales named "*-x-icu" when
users do "\dOS". Users will be highly confused when they quite
reasonably try to generalize from the example in the docs and what
"\dOS" shows, and get results that are wrong, often only in a very
subtle way.


If we are fine with supporting only ICU 4.2 and later (which I think we 
are, given that ICU 4.2 was released in 2009) then using 
uloc_forLanguageTag()[1] to validate and canonicalize seems like the right 
solution. I had missed that this function even existed when I last read 
the documentation. Does it return a BCP 47 tag in modern versions of ICU?


I would strongly prefer it if there were, as far as possible, only one 
format for inputting ICU locales.


1. 
http://www.icu-project.org/apiref/icu4c/uloc_8h.html#aa45d6457f72867880f079e27a63c6fcb


Andreas





Re: [HACKERS] GnuTLS support

2017-09-07 Thread Andreas Karlsson

On 09/07/2017 11:34 PM, Tomas Vondra wrote:

I am worried about having 3x version of TLS controls in
postgresql.conf, and only one set being active. Perhaps we need to
break out the TLS config to separate files or something. Anyway, this
needs more thought.


Well, people won't be able to set the inactive options, just like you
can't set ssl=on when you build without OpenSSL support. But perhaps we
could simply not include the inactive options into the config file, no?


Yeah, I have been thinking about how bad it would be to dynamically 
generate the config file. I think I will try this.


Daniel: What options does Secure Transport need for configuring ciphers, 
ECDH, and cipher preference? Does it need any extra options (I think I 
saw something about the keychain)?


Andreas




Re: [HACKERS] postgres_fdw super user checks

2017-09-12 Thread Andreas Karlsson
On 07/27/2017 09:45 PM, Jeff Janes wrote:

Here is an updated patch. This version allows you to use the password-less
connection if you either are the super-user directly (which is the 
existing committed behavior), or if you are using the super-user's 
mapping because you are querying a super-user-owned view which you have 
been granted access to.


I have tested the patch and it passes the tests and works, and the code 
looks good (I have a small nitpick below).


The feature seems useful, especially for people who already use views 
for security, so the question is whether this is a potential footgun. I am 
leaning towards no, since the superuser should be careful when granting 
access to his views anyway.


It would have been nice if there were a more generic way to handle this 
since 1) the security issue is not unique to postgres_fdw and 2) this 
requires you to create a view. But since the patch is simple, an 
improvement in itself, and does not prevent any further 
improvements in this area, I see no reason to let perfect be the enemy of 
good.


= Nitpicking/style

I would prefer if

/* no check required if superuser */
if (superuser())
return;

if (superuser_arg(user->userid))
return;

was, for consistency with the if clause in connect_pg_server(), written as

/* no check required if superuser */
if (superuser() || superuser_arg(user->userid))
return;

Andreas




Re: [HACKERS] generated columns

2017-09-13 Thread Andreas Karlsson

On 09/13/2017 04:04 AM, Simon Riggs wrote:

On 31 August 2017 at 05:16, Peter Eisentraut
 wrote:

- index support (and related constraint support)


Presumably you can't index a VIRTUAL column. Or at least I don't think
its worth spending time trying to make it work.


I think end users would be surprised if one can index STORED columns and 
expressions but not VIRTUAL columns. So unless it is a huge project I 
would say it is worth it.


Andreas




Re: [HACKERS] Patches that don't apply or don't compile: 2017-09-12

2017-09-12 Thread Andreas Karlsson

On 09/12/2017 04:14 PM, Aleksander Alekseev wrote:

Title: Foreign Key Arrays
Author: Mark Rofail
URL:https://commitfest.postgresql.org/14/1252/


I am currently reviewing this one and it applies, compiles, and passes 
the test suite. It could be the compilation warnings which make the 
system think it failed, but I could not find the log of the failed build.


We want to be welcoming to new contributors, so until we have a reliable 
CI server which can provide easy-to-read build logs I am against 
changing the status of any patches solely based on the result of the CI 
server. I think it should be used as a complementary tool until the 
community deems it to be good enough.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-08-31 Thread Andreas Karlsson
I have attached a new, rebased version of the patch with most of Banck's 
and some of your feedback incorporated. Thanks for the good feedback!


On 03/31/2017 08:27 AM, Michael Paquier wrote:

When running REINDEX SCHEMA CONCURRENTLY public on the regression
database I am bumping into a bunch of these warnings:
WARNING:  01000: snapshot 0x7fa5e640 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123
WARNING:  01000: snapshot 0x7fa5e640 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123


I failed to reproduce this. Do you have a reproducible test case?


+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+   int i;
+
+   for (i = 0; i < tupdesc->natts; i++)
+   tupdesc->attrs[i]->attcacheoff = -1;
+}
I think that it would be better to merge that with TupleDescInitEntry
to be sure that the initialization of a TupleDesc's attribute goes
through only one code path.


Sorry, but I am not sure I understand your suggestion. I do not like the 
ResetTupleDescCache function so all suggestions are welcome.

-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} [ CONCURRENTLY ] name
I am taking the war path with such a sentence... But what about adding
CONCURRENTLY to the list of options in parenthesis instead?


I have thought some about this myself and I do not care strongly either way.


- Explore the use of SQL-level interfaces to mark an index as inactive
at creation.
- Remove work done in changeDependencyForAll, and replace it by
something similar to what tablecmds.c does. There is I think here some
place for refactoring if that's not with CREATE TABLE LIKE. This
requires to the same work of creation, renaming and drop of the old
triggers and constraints.


I am no fan of the current code duplication and how fragile it is, but I 
think these cases are sufficiently different to prevent meaningful code 
reuse. But it could just be me who is unfamiliar with that part of the code.



- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY time of the operation is not what matters, it is
how intrusive the operation is. Relations with many indexes would also
result in much object locks taken at each step.


I am still leaning towards my current tradeoff, since waiting for all 
queries to stop using an index can take a lot of time, and if you only 
have to do that once per table it would be a huge benefit under some 
workloads. You can still reindex each index separately if you need to.


Andreas

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index dda0170886..c97944b2c9 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,7 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
  Acquired by VACUUM (without FULL),
  ANALYZE, CREATE INDEX CONCURRENTLY,
- CREATE STATISTICS and
+ REINDEX CONCURRENTLY, CREATE STATISTICS and
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..4ef3a89a29 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
   An index build with the CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
-  convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  convenient to use REINDEX to rebuild them.
  
 
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 

 
+   
+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+

 VERBOSE
 
@@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+

[HACKERS] GnuTLS support

2017-08-31 Thread Andreas Karlsson

Hi,

I have seen discussions from time to time about OpenSSL and its 
licensing issues, so I decided to see how much work it would be to add 
support for another TLS library. I went with GnuTLS since it is the 
library I know best after OpenSSL and it is also a reasonably popular 
library.


Attached is a work-in-progress patch which implements the basics. I am 
sending it to the list because I want feedback on some design questions 
and to check with the community that this is a feature we want.


= What is implemented

- Backend
- Frontend
- Diffie-Hellman support
- Using GnuTLS for secure random numbers
- Using GnuTLS for sha2

= Work left to do

- Add GnuTLS support to sslinfo
- Add GnuTLS support to pgcrypto
- Support for GnuTLS's version of engines
- Test code with older versions of GnuTLS
- Update documentation
- Add support for all postgresql.conf options (see design question)
- Fix two failing tests (see design question)

= Design questions

== GnuTLS priority strings vs OpenSSL cipher lists

GnuTLS uses a different format for specifying ciphers. Instead of 
setting ciphers, protocol versions, and ECDH curves separately GnuTLS 
instead uses a single priority string[1].


For example the default settings of PostgreSQL (which include disabling 
SSLv3)


ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
ssl_prefer_server_ciphers = on
ssl_ecdh_curve = 'prime256v1'

is represented with a string similar to

SECURE128:+3DES-CBC:+GROUP-SECP256R1:%SERVER_PRECEDENCE

So the two libraries have decided on different ways to specify things.

One way to solve the issue would be to just let ssl_ciphers be the 
priority string and then add %SERVER_PRECEDENCE to it if 
ssl_prefer_server_ciphers is on. Adding the ssl_ecdh_curve is trickier 
since the curves have different names in GnuTLS (e.g. prime256v1 vs 
SECP256R1) and I would rather avoid having to add such a mapping to 
PostgreSQL. Thoughts?


== Potentially OpenSSL-specific test cases

There are currently two failing SSL tests which, at least to me, seem 
more like they test specific OpenSSL behaviors rather than something 
which needs to be true for all SSL libraries.


The two tests:

# Try with just the server CA's cert. This fails because the root file
# must contain the whole chain up to the root CA.
note "connect with server CA cert, without root CA";
test_connect_fails("sslrootcert=ssl/server_ca.crt sslmode=verify-ca");

# A CRL belonging to a different CA is not accepted, fails
test_connect_fails(
"sslrootcert=ssl/root+server_ca.crt sslmode=verify-ca 
sslcrl=ssl/client.crl");


For the missing root CA case GnuTLS seems to be happy enough with just 
an intermediate CA and as far as I can tell this behavior is not easy to 
configure.


And for the CRL belonging to a different CA case GnuTLS can be 
configured to either just store such a non-validating CRL or to ignore 
it, but not to return an error.


Personally I think these two tests should just be removed, but maybe I am 
missing something.


Notes:

1. https://gnutls.org/manual/html_node/Priority-Strings.html

Andreas
diff --git a/configure b/configure
index a2f9a256b4..8dcb26b532 100755
--- a/configure
+++ b/configure
@@ -709,6 +709,7 @@ UUID_EXTRA_OBJS
 with_uuid
 with_systemd
 with_selinux
+with_gnutls
 with_openssl
 krb_srvtab
 with_python
@@ -838,6 +839,7 @@ with_bsd_auth
 with_ldap
 with_bonjour
 with_openssl
+with_gnutls
 with_selinux
 with_systemd
 with_readline
@@ -1534,6 +1536,7 @@ Optional Packages:
   --with-ldap build with LDAP support
   --with-bonjour  build with Bonjour support
   --with-openssl  build with OpenSSL support
  --with-gnutls   build with GnuTS support
   --with-selinux  build with SELinux support
   --with-systemd  build with systemd support
   --without-readline  do not use GNU Readline nor BSD Libedit for editing
@@ -6051,6 +6054,41 @@ fi
 $as_echo "$with_openssl" >&6; }
 
 
+#
+# GnuTLS
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5
+$as_echo_n "checking whether to build with GnuTLS support... " >&6; }
+
+
+
+# Check whether --with-gnutls was given.
+if test "${with_gnutls+set}" = set; then :
+  withval=$with_gnutls;
+  case $withval in
+yes)
+
+$as_echo "#define USE_GNUTLS 1" >>confdefs.h
+
+  ;;
+no)
+  :
+  ;;
+*)
+  as_fn_error $? "no argument expected for --with-gnutls option" "$LINENO" 5
+  ;;
+  esac
+
+else
+  with_gnutls=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5
+$as_echo "$with_gnutls" >&6; }
+
+
 #
 # SELinux
 #
@@ -10218,6 +10256,67 @@ done
 
 fi
 
+if test "$with_gnutls" = yes ; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5
+$as_echo_n "checking for library containing gnutls_init... " >&6; }
+if ${ac_cv_search_gnutls_init+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - 

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-11-12 Thread Andreas Karlsson

On 11/10/2017 01:47 AM, Mark Rofail wrote:

I am sorry for the late reply


There is no reason for you to be. It did not take you 6 weeks to do a 
review. :) Thanks for this new version.



== Functional review

 >1) MATCH FULL does not seem to care about NULLS in arrays. In the
example below I expected both inserts into the referring table to fail.


It seems in your example the only failed case was: INSERT INTO fk VALUES 
(NULL, '{1}');

which shouldn't work, can you clarify this?


I think that if you use MATCH FULL the query should fail if you have a 
NULL in the array.



 >2) To me it was not obvious that ON DELETE CASCADE would delete
the whole rows rather than delete the members from the array, and
this kind of misunderstanding can lead to pretty bad surprises in
production. I am leaning towards not supporting CASCADE.


I would say so too, maybe we should remove ON DELETE CASCADE until we 
have supported all remaining actions.


I am leaning towards this too. I would personally be fine with a first 
version without support for CASCADE since it is not obvious to me what 
CASCADE should do.



== The @>> operator
I would argue that allocating an array of datums and building an array 
would have the same complexity


I am not sure what you mean here. Just because something has the same 
complexity does not mean there can't be major performance differences.



== Code review

 >I think the code in RI_Initial_Check() would be cleaner if you
used "CROSS JOIN LATERAL unnest(col)" rather than having unnest() in
the target list. This way you would not need to rename all columns
and the code paths for the array case could look more like the code
path for the normal case.

Can you clarify what you mean a bit more?


I think the code would look cleaner if you generate the following query:

SELECT fk.x, fk.ys FROM ONLY t2 fk CROSS JOIN LATERAL 
pg_catalog.unnest(ys) a2 (v) LEFT OUTER JOIN ONLY t1 pk ON pk.x = fk.x 
AND pk.y = a2.v WHERE [...]


rather than:

SELECT fk.k1, fk.ak2 FROM (SELECT x k1, pg_catalog.unnest(ys) k2, ys ak2 
FROM ONLY t2) fk LEFT OUTER JOIN ONLY t1 pk ON pk.x = fk.k1 AND pk.y = 
fk.k2 WHERE [...]


= New stuff

When applying the patch I got some white space warnings:

Array-ELEMENT-foreign-key-v5.3.patch:1343: space before tab in indent, 
indent with spaces.

format_type_be(oprleft), 
format_type_be(oprright;
Array-ELEMENT-foreign-key-v5.3.patch:1345: trailing whitespace.

When compiling I got an error:

ri_triggers.c: In function ‘ri_GenerateQual’:
ri_triggers.c:2693:19: error: unknown type name ‘d’
   Oid   oprcommon;d
   ^
ri_triggers.c:2700:3: error: conflicting types for ‘oprright’
   oprright = get_array_type(operform->oprleft);
   ^~~~
ri_triggers.c:2691:9: note: previous declaration of ‘oprright’ was here
   Oid   oprright;
 ^~~~
: recipe for target 'ri_triggers.o' failed

When building the documentation I got two warnings:

/usr/bin/osx:catalogs.sgml:2349:17:W: empty end-tag
/usr/bin/osx:catalogs.sgml:2350:17:W: empty end-tag

When running the tests I got a failure in element_foreign_key.

Andreas




Re: [HACKERS] GnuTLS support

2017-11-02 Thread Andreas Karlsson
On 09/18/2017 07:04 PM, Jeff Janes wrote:

You fixed the first issue, but I still get the second one:


be-secure-gnutls.c: In function 'get_peer_certificate':
be-secure-gnutls.c:667: error: 'GNUTLS_X509_CRT_LIST_SORT' undeclared 
(first use in this function)
be-secure-gnutls.c:667: error: (Each undeclared identifier is reported 
only once

be-secure-gnutls.c:667: error: for each function it appears in.)


Thanks again for testing the code. I have now rebased the patch and 
fixed the second issue. I tested that it works on CentOS 6.


Work which remains:

- sslinfo
- pgcrypto
- Documentation
- Decide if what I did with the config is a good idea

Andreas
diff --git a/configure b/configure
index 4ecd2e1922..1ba34dfced 100755
--- a/configure
+++ b/configure
@@ -709,6 +709,7 @@ UUID_EXTRA_OBJS
 with_uuid
 with_systemd
 with_selinux
+with_gnutls
 with_openssl
 krb_srvtab
 with_python
@@ -837,6 +838,7 @@ with_bsd_auth
 with_ldap
 with_bonjour
 with_openssl
+with_gnutls
 with_selinux
 with_systemd
 with_readline
@@ -1531,6 +1533,7 @@ Optional Packages:
   --with-ldap build with LDAP support
   --with-bonjour  build with Bonjour support
   --with-openssl  build with OpenSSL support
  --with-gnutls   build with GnuTLS support
   --with-selinux  build with SELinux support
   --with-systemd  build with systemd support
   --without-readline  do not use GNU Readline nor BSD Libedit for editing
@@ -5997,6 +6000,41 @@ fi
 $as_echo "$with_openssl" >&6; }
 
 
+#
+# GnuTLS
+#
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5
+$as_echo_n "checking whether to build with GnuTLS support... " >&6; }
+
+
+
+# Check whether --with-gnutls was given.
+if test "${with_gnutls+set}" = set; then :
+  withval=$with_gnutls;
+  case $withval in
+yes)
+
+$as_echo "#define USE_GNUTLS 1" >>confdefs.h
+
+  ;;
+no)
+  :
+  ;;
+*)
+  as_fn_error $? "no argument expected for --with-gnutls option" "$LINENO" 5
+  ;;
+  esac
+
+else
+  with_gnutls=no
+
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5
+$as_echo "$with_gnutls" >&6; }
+
+
 #
 # SELinux
 #
@@ -10164,6 +10202,94 @@ done
 
 fi
 
+if test "$with_gnutls" = yes ; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5
+$as_echo_n "checking for library containing gnutls_init... " >&6; }
+if ${ac_cv_search_gnutls_init+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char gnutls_init ();
+int
+main ()
+{
+return gnutls_init ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' gnutls; do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_search_gnutls_init=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext
+  if ${ac_cv_search_gnutls_init+:} false; then :
+  break
+fi
+done
+if ${ac_cv_search_gnutls_init+:} false; then :
+
+else
+  ac_cv_search_gnutls_init=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5
+$as_echo "$ac_cv_search_gnutls_init" >&6; }
+ac_res=$ac_cv_search_gnutls_init
+if test "$ac_res" != no; then :
+  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+else
+  as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5
+fi
+
+  # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted
+  # certificate chains.
+  ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include 
+#include 
+
+"
+if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then :
+  ac_have_decl=1
+else
+  ac_have_decl=0
+fi
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl
+_ACEOF
+
+  for ac_func in gnutls_pkcs11_set_pin_function
+do :
+  ac_fn_c_check_func "$LINENO" "gnutls_pkcs11_set_pin_function" "ac_cv_func_gnutls_pkcs11_set_pin_function"
+if test "x$ac_cv_func_gnutls_pkcs11_set_pin_function" = xyes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_GNUTLS_PKCS11_SET_PIN_FUNCTION 1
+_ACEOF
+
+fi
+done
+
+fi
+
 if test "$with_pam" = yes ; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5
 $as_echo_n "checking for pam_start in -lpam... " >&6; }
@@ -10961,6 +11087,17 @@ else
 fi
 
 
+fi
+
+if test "$with_gnutls" = yes ; then
+  ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default"
+if test 

Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-10-29 Thread Andreas Karlsson

Sorry for the very late review.

I like this feature and have needed it myself in the past, and the 
current syntax seems pretty good. One could ask whether the syntax 
could be generalized to support other things like json and hstore, but I 
do not think it would be fair to block this patch over that.


== Limitations of the current design

1) Array element foreign keys can only be specified at the table level 
(not at columns): I think this limitation is fine. Other PostgreSQL-specific 
features like exclusion constraints can also only be specified 
at the table level.


2) Lack of support for SET NULL and SET DEFAULT: these do not seem very 
useful for arrays.


3) Lack of support for specifiying multiple arrays in the foreign key: 
seems like a good thing to me since it is not obvious what such a thing 
even would do.


4) That you need to add a cast to the index if you have different types: 
due to there not being an int4[] <@ int2[] operator you need to add an 
index on (col::int4[]) to speed up deletes and updates. This one is 
annoying since EXPLAIN won't give you the query plans for the foreign key 
queries, but I do not think fixing this should be within the scope of 
the patch, and having a smaller integer in the referring table is rare.


5) The use of count(DISTINCT) requiring types to support btree equality: 
this has been discussed a lot up-thread and I think the current state is 
good enough.


== Functional review

I have played around some with it and things seem to work and the test 
suite passes, but I noticed a couple of strange behaviors.


1) MATCH FULL does not seem to care about NULLS in arrays. In the 
example below I expected both inserts into the referring table to fail.


CREATE TABLE t (x int, y int, PRIMARY KEY (x, y));
CREATE TABLE fk (x int, ys int[], FOREIGN KEY (x, EACH ELEMENT OF ys) 
REFERENCES t MATCH FULL);

INSERT INTO t VALUES (10, 1);
INSERT INTO fk VALUES (10, '{1,NULL}');
INSERT INTO fk VALUES (NULL, '{1}');

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
ERROR:  insert or update on table "fk" violates foreign key constraint 
"fk_x_fkey"

DETAIL:  MATCH FULL does not allow mixing of null and nonnull key values.

2) To me it was not obvious that ON DELETE CASCADE would delete the 
whole rows rather than delete the members from the array, and this kind 
of misunderstanding can lead to pretty bad surprises in production. I am 
leaning towards not supporting CASCADE.
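
For example (a sketch with made-up table names, using the patch's 
proposed syntax), one might naively expect the delete below to merely 
shrink the arrays:

```sql
CREATE TABLE color (name text PRIMARY KEY);
CREATE TABLE item (
    name text,
    colors text[],
    FOREIGN KEY (EACH ELEMENT OF colors)
        REFERENCES color ON DELETE CASCADE
);

INSERT INTO color VALUES ('red'), ('blue');
INSERT INTO item VALUES ('ball', '{red,blue}');

-- This does not remove 'red' from item.colors; it deletes the whole
-- 'ball' row, since CASCADE acts on rows, not on array elements.
DELETE FROM color WHERE name = 'red';
```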


== The @>> operator

A previous version of your patch added the "anyelement <<@ anyarray" 
operator to avoid having to build arrays, but that part was reverted due 
to a bug.


I am no expert on the GIN code, but as far as I can tell it would be 
relatively simple to fix that bug: just allocate an array of Datums of 
length one where you put the element you are searching for (or maybe a 
copy of it).


Potential issues with adding the operators:

1) Do we really want to add an operator just for array element foreign 
keys? I think this is not an issue since it seems like it should be 
useful in general. I know I have wanted it myself.


2) I am not sure, but the committers might prefer if adding the 
operators is done in a separate patch.


3) Bikeshedding about operator names. I personally think @>> is clear 
enough and as far as I know it is not used for anything else.
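
To illustrate the intended use (assuming the semantics of the reverted 
anyelement <<@ anyarray operator pair, with a made-up table):

```sql
-- With the operator one can search for a single element directly:
SELECT * FROM posts WHERE tags @>> 'postgresql';

-- instead of having to build a one-element array by hand:
SELECT * FROM posts WHERE tags @> ARRAY['postgresql'];
```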


== Code review

The patch no longer applies to HEAD, but the conflicts are small.

I think we should be more consistent in the naming, both in code and in 
the documentation. Right now we have "array foreign keys", "element 
foreign keys", "ELEMENT foreign keys", etc.


+   /*
+    * If this is an array foreign key, we must look up the operators for
+    * the array element type, not the array type itself.
+    */
+   if (fkreftypes[i] != FKCONSTR_REF_PLAIN)

+   if (fkreftypes[i] != FKCONSTR_REF_PLAIN)
+   {
+   old_fktype = get_base_element_type(old_fktype);
+   /* this shouldn't happen ... */
+   if (!OidIsValid(old_fktype))
+   elog(ERROR, "old foreign key column is not an array");
+   }

+   if (riinfo->fk_reftypes[i] != FKCONSTR_REF_PLAIN)
+   {
+   riinfo->has_array = true;
+   riinfo->ff_eq_oprs[i] = ARRAY_EQ_OP;
+   }

In the three diffs above it would be much cleaner to check for "== 
FKCONSTR_REF_EACH_ELEMENT" since that better conveys the intent and is 
safer for adding new types in the future.


+   /* We look through any domain here */
+   fktype = get_base_element_type(fktype);

What does the comment above mean?

if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop)))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
-errmsg("foreign key constraint \"%s\" "
-   "cannot be implemented",
-   fkconstraint->conname),
-errdetail("Key columns \"%s\" and 

Re: [HACKERS] git down

2017-10-29 Thread Andreas Karlsson

On 10/27/2017 10:51 PM, Erik Rijkers wrote:

git.postgresql.org is down/unreachable

( git://git.postgresql.org/git/postgresql.git )


Yes, I noticed this too, but 
https://git.postgresql.org/git/postgresql.git still works fine.


I guess it makes sense to remove unencrypted access, but in that case 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary should not 
advertise supporting the git protocol. I have not seen any announcement 
either, but that could just be me not paying enough attention.


Andreas





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-10-31 Thread Andreas Karlsson

Here is a rebased version of the patch.

Andreas
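
For reference, with the patch applied the new option follows the 
synopsis in the documentation change below, e.g. (proposed syntax; the 
final committed grammar may differ):

```sql
-- Rebuild all indexes of a table without blocking concurrent writes:
REINDEX TABLE CONCURRENTLY my_table;
```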
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a0ca2851e5..f8c59ea127 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
  Acquired by VACUUM (without FULL),
  ANALYZE, CREATE INDEX CONCURRENTLY,
+ REINDEX CONCURRENTLY,
  CREATE STATISTICS and
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 2e053c4c24..4019bad4c2 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
   An index build with the CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
-  convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  convenient to use REINDEX to rebuild them.
  
 
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 

 
+   
+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+

 VERBOSE
 
@@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+
+   
+Rebuilding an index can interfere with regular operation of a database.
+Normally PostgreSQL locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until the
+index rebuild is finished. This could have a severe effect if the system is
+a live production database. Very large tables can take many hours to be
+indexed, and even for smaller tables, an index rebuild can lock out writers
+for periods that are unacceptably long for a production system.
+   
+
+   
+PostgreSQL supports rebuilding indexes with minimum locking
+of writes.  This method is invoked by specifying the
+CONCURRENTLY option of REINDEX. When this option
+is used, PostgreSQL must perform two scans of the table
+for each index that needs to be rebuilt and in addition it must wait for
+all existing transactions that could potentially use the index to
+terminate. This method requires more total work than a standard index
+rebuild and takes significantly longer to complete as it needs to wait
+for unfinished transactions that might modify the index. However, since
+it allows normal operations to continue while the index is rebuilt, this
+method is useful for rebuilding indexes in a production environment. Of
+course, the extra CPU, memory and I/O load imposed by the index rebuild
+may slow down other operations.
+   
+
+   
+The following steps occur in a concurrent index build, each in a separate
+transaction except when the new index definitions are created, where all
+the concurrent entries are created using only one transaction. Note that
+if there are multiple indexes to be rebuilt then each step loops through
+all the indexes we're rebuilding, using a separate transaction for each one.
+REINDEX CONCURRENTLY proceeds as follows when rebuilding
+indexes:
+
+
+ 
+  
+   A new temporary index definition is added into the catalog
+   pg_index. This definition will be used to replace the
+   old index. This step is done as a single transaction for all the indexes
+   involved in this process, meaning that if
+   REINDEX CONCURRENTLY is run on a table with multiple
+   indexes, all the catalog entries of the new indexes are created within a
+   single transaction. A SHARE UPDATE EXCLUSIVE lock at
+   session level is taken on the indexes being reindexed as well as its
+   parent table to prevent any schema modification while processing.
+  
+ 
+ 
+  
+   A first 
