Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Pavel Stehule
2016-12-31 1:16 GMT+01:00 Craig Ringer :

> On 30 December 2016 at 21:00, Fabien COELHO  wrote:
>
> > As for "slow", I have just tested overheads with pgbench, comparing a
> direct
> > arithmetic operation (as a proxy to a fast session variable
> consultation) to
> > constant returning plpgsql functions with security definer and security
> > invoker, on a direct socket connection, with prepared statements:
> >
> >   select 1 + 0: 0.020 ms
> >   select one_sd() : 0.024 ms
> >   select one_si() : 0.024 ms
>
> That's one call per executor run. Not really an effective test.
>
> Consider cases like row security where you're testing many rows.
> Hopefully the planner will inline the test if it's a function declared
> stable, but it may not.
>
>
> > However the one-row property is just hoped for, and on principle a
> database
> > is about declaring constraints that are enforced afterwards.
> >
> > I see two clean solutions to this use case: declaring tables as one row,
> or
> > having scalar objects.
>
>
> I agree that's a common issue.
>
> The unique partial index on 1 hack in postgres works, though it's ugly.
>
> Adding a whole new different storage concept seems like massive
> overkill for this problem, which is minor and already easily solved.
> Someone could make 1-row tables prettier with a new constraint type
> instead maybe, if it's really considered that ugly. Personally I'd
> just document the unique expression index hack.
>
> CREATE UNIQUE INDEX onerow ON mytable((1));
>
> >> * On what basis do you _oppose_ persistently defining variables in the
> >> catalogs as their own entities?
> >
> > I understand that you are speaking of "persistent session variables".
> >
> > For me a database is about persistence (metadata & data) with safety
> > (transactions) and security (permissions)... and maybe performance:-)
> >
> > Pavel's proposal creates a new object with 2 (secure
> metadata-persistence)
> > out of 4 properties... I'm not at ease with introducing a new
> half-database
> > concept in a database.
>
> I strongly disagree. If you want "all-database" properties ... use tables.
>
> We generally add new features when that's not sufficient to achieve
> something. Most notably SEQUENCEs, which deliberately violate
> transaction isolation and atomicity in order to deliver a compelling
> benefit not otherwise achievable.
>
> Similarly for advisory locking.
>
> > On the other hand there are dynamic session variables (mysql, mssql,
> oracle
> > have some variants) which are useful on their own without pretending to
> be
> > database objects (no CREATE/ALTER/DROP, GRANT/REVOKE).
>
> We have precedent here for sequences. Yes, they do confuse users, but
> they're also VERY useful, and the properties of variables would be
> clearer IMO.
>
> I'm not especially attached to doing them as database objects; I'm
> just as happy with something declared at session start by some
> function that then intends to set and use the variable. But I don't
> think your argument against a DDL-like approach holds water.
>
> >> (My own objection is that "temporary variables" would make our existing
> >> catalog bloat issues for temp objects even worse).
> >
> >
> > I do agree that inefficient temporary variables are worthless, but ISTM
> that
> > Pavel's proposal is not exactly about temporary variables, it is about
> > temporary-valued permanent-variables. So there is no temporary (on the
> fly)
> > variable as such, and if it is extended for this purpose then indeed the
> > catalog costs look expensive.
>
> I meant that we'd certainly want CREATE TEMPORARY VARIABLE for ones
> that go away at end of session, if we were going to have
> catalog-object-like variables. Which would result in catalog bloat.
>

Because our catalog is MVCC, the bloat is unavoidable - but if we
implement global temporary tables, then the metadata of temporary objects
can be stored there, and the main catalogue can stay stable.

But the question is: when would you use local temporary variables, when
you cannot use global variables? Probably in ad hoc scripts, in interactive
work, ... That has minimal impact on the catalogue.

The performance problems can arise in PL usage, or in intensive application
usage - and there, global variables can be used.

By analogy with our temporary tables: if we can use global temporary tables
in critical PL code, then local temporary tables can be a nice feature,
perfect for interactive work, and nobody has to fix catalogue bloat.

Designing the possibility of local temporary variables is minimal work. I
am not afraid about performance when developers can use global variables as
an option.

Regards

Pavel


> > (1) Having some kind of variable, especially in interactive mode, allows
> one to
> > manipulate previous results and reuse them later, without having to
> resort
> > to repeated sub-queries or to retype non trivial values.
> >
> > Client side psql :-variables are untyped and unescaped, thus not very
> > 
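For reference, the one-row-table workaround discussed in this thread can be sketched as follows (an illustrative sketch; apart from the `onerow` index definition quoted above, the table and column names are made up):

```sql
-- A "variable" table constrained to at most one row via a unique index
-- on a constant expression (the hack quoted above).
CREATE TABLE mytable (
    var_value text
);

-- Every row indexes the same constant, so at most one row can exist.
CREATE UNIQUE INDEX onerow ON mytable((1));

INSERT INTO mytable VALUES ('first');   -- succeeds
INSERT INTO mytable VALUES ('second');  -- fails with a duplicate-key error
```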

Re: [HACKERS] use strict in all Perl programs

2016-12-30 Thread Michael Paquier
On Sat, Dec 31, 2016 at 3:07 PM, Peter Eisentraut
 wrote:
> Here is a patch to add 'use strict' to all Perl programs (that I could
> find), or move it to the right place where it was already there.  I
> think that is a pretty standard thing to do nowadays.
>
> I tried testing the changes in pgcheckdefines, but it just spits out
> nonsense before and after.

What about also adding "use warnings"? That's standard in all the TAP tests.
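As a quick illustration of what the two pragmas buy (one-liners of my own, not code from the patch; assumes perl is on PATH): 'use strict' rejects undeclared variables at compile time, while 'use warnings' merely flags suspect constructs at run time.

```shell
#!/bin/sh
# Without strict, an undeclared global compiles and runs silently.
perl -e '$x = 1; print "ok\n";'

# With strict, the same code is a compile-time error.
perl -e 'use strict; $x = 1;' 2>/dev/null || echo "rejected by strict"

# With warnings, dubious code still runs but is flagged on stderr.
perl -we 'my $y; print $y + 1, "\n";'
```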
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequence data type

2016-12-30 Thread Peter Eisentraut
On 9/8/16 4:06 PM, Peter Eisentraut wrote:
> On 9/3/16 2:41 PM, Vik Fearing wrote:
>> On 08/31/2016 06:22 AM, Peter Eisentraut wrote:
>>> Here is a patch that adds the notion of a data type to a sequence.  So
>>> it might be CREATE SEQUENCE foo AS integer.  The types are restricted to
>>> int{2,4,8} as now.
>>
>> This patch does not apply cleanly to current master (=600dc4c).
> 
> Updated patch attached.

Another updated patch, with quite a bit of rebasing and some minor code
polishing.
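Per the commit message below, the declared type drives the default sequence bounds; roughly (a hedged sketch of the intended behaviour, not lines taken from the patch's regression tests):

```sql
CREATE SEQUENCE seq2 AS smallint;   -- min/max default to the int2 range
CREATE SEQUENCE seq8 AS bigint;     -- the previous default behaviour

-- Bounds exceeding the type's range are rejected:
CREATE SEQUENCE seq_bad AS smallint MAXVALUE 100000;  -- error

-- serial now creates a sequence of the matching type, so the sequence
-- can no longer overflow the table column.
CREATE TABLE t (id serial);
```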

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 62486c9092f21a1afc1bd9cfa50f570e9e3e92c1 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 23 Dec 2016 12:00:00 -0500
Subject: [PATCH v3] Add CREATE SEQUENCE AS  clause

This stores a data type, required to be an integer type, with the
sequence.  The sequence's min and max values default to the range
supported by the type, and they cannot be set to values exceeding that
range.  The internal implementation of the sequence is not affected.

Change the serial types to create sequences of the appropriate type.
This makes sure that the min and max values of the sequence for a serial
column match the range of values supported by the table column.  So the
sequence can no longer overflow the table column.

This also makes monitoring for sequence exhaustion/wraparound easier,
which currently requires various contortions to cross-reference the
sequences with the table columns they are used with.

This commit also effectively reverts the pg_sequence column reordering
in f3b421da5f4addc95812b9db05a24972b8fd9739, because the new seqtypid
column allows us to fill the hole in the struct and create a more
natural overall column ordering.
---
 doc/src/sgml/catalogs.sgml  |  14 +++-
 doc/src/sgml/information_schema.sgml|   4 +-
 doc/src/sgml/ref/create_sequence.sgml   |  37 ++
 src/backend/catalog/information_schema.sql  |   4 +-
 src/backend/commands/sequence.c |  95 ++---
 src/backend/parser/gram.y   |   6 +-
 src/backend/parser/parse_utilcmd.c  |   2 +-
 src/bin/pg_dump/pg_dump.c   | 105 +++-
 src/bin/pg_dump/t/002_pg_dump.pl|   2 +
 src/include/catalog/catversion.h|   2 +-
 src/include/catalog/pg_proc.h   |   2 +-
 src/include/catalog/pg_sequence.h   |   8 ++-
 src/include/pg_config_manual.h  |   6 --
 src/test/modules/test_pg_dump/t/001_base.pl |   1 +
 src/test/regress/expected/sequence.out  |  51 ++
 src/test/regress/expected/sequence_1.out|  51 ++
 src/test/regress/sql/sequence.sql   |  24 +--
 17 files changed, 291 insertions(+), 123 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 493050618d..765bc12c51 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5628,10 +5628,11 @@ pg_sequence Columns
  
 
  
-  seqcycle
-  bool
+  seqtypid
+  oid
+  pg_type.oid
   
-  Whether the sequence cycles
+  Data type of the sequence
  
 
  
@@ -5668,6 +5669,13 @@ pg_sequence Columns
   
   Cache size of the sequence
  
+
+ 
+  seqcycle
+  bool
+  
+  Whether the sequence cycles
+ 
 

   
diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index c43e325d06..a3a19ce8ce 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -4653,9 +4653,7 @@ sequences Columns
   data_type
   character_data
   
-   The data type of the sequence.  In
-   PostgreSQL, this is currently always
-   bigint.
+   The data type of the sequence.
   
  
 
diff --git a/doc/src/sgml/ref/create_sequence.sgml b/doc/src/sgml/ref/create_sequence.sgml
index 62ae379226..f31b59569e 100644
--- a/doc/src/sgml/ref/create_sequence.sgml
+++ b/doc/src/sgml/ref/create_sequence.sgml
@@ -21,7 +21,9 @@
 
  
 
-CREATE [ TEMPORARY | TEMP ] SEQUENCE [ IF NOT EXISTS ] name [ INCREMENT [ BY ] increment ]
+CREATE [ TEMPORARY | TEMP ] SEQUENCE [ IF NOT EXISTS ] name
+[ AS data_type ]
+[ INCREMENT [ BY ] increment ]
 [ MINVALUE minvalue | NO MINVALUE ] [ MAXVALUE maxvalue | NO MAXVALUE ]
 [ START [ WITH ] start ] [ CACHE cache ] [ [ NO ] CYCLE ]
 [ OWNED BY { table_name.column_name | NONE } ]
@@ -111,6 +113,21 @@ Parameters

 

+data_type
+
+ 
+  The optional
+  clause AS data_type
+  specifies the data type of the sequence.  Valid types are
+  smallint, integer,
+  and bigint.  bigint is the
+  default.  The data type determines the default minimum and maximum
+  values of the sequence.
+ 
+
+   
+
+   
 increment
 
  
@@ -132,9 +149,9 @@ Parameters
   

[HACKERS] cast result of copyNode()

2016-12-30 Thread Peter Eisentraut
In order to reduce the number of useless casts and make the useful casts
more interesting, here is a patch that automatically casts the result of
copyNode() back to the input type, if the compiler supports something
like typeof(), which most current compilers appear to.  That gets us
some more type safety and we only need to retain the casts that actually
do change the type.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 4782417a58c44d08c461dd90887ac463631fbd4a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 25 Dec 2016 12:00:00 -0500
Subject: [PATCH] Cast result of copyObject() to correct type

copyObject() is declared to return void *, which allows easily assigning
the result independent of the input, but it loses all type checking.

If the compiler supports typeof or something similar, cast the result to
the input type.  This creates a greater amount of type safety.  In some
cases, where the result is assigned to a generic type such as Node * or
Expr *, new casts are now necessary, but in general casts are now
unnecessary in the normal case and indicate that something unusual is
happening.
---
 config/c-compiler.m4  | 27 +
 configure | 40 +++
 configure.in  |  1 +
 src/backend/bootstrap/bootstrap.c |  4 ++--
 src/backend/commands/createas.c   |  2 +-
 src/backend/commands/event_trigger.c  |  8 +++
 src/backend/commands/prepare.c|  4 ++--
 src/backend/commands/view.c   |  2 +-
 src/backend/nodes/copyfuncs.c |  4 
 src/backend/optimizer/path/indxpath.c |  4 ++--
 src/backend/optimizer/plan/createplan.c   |  6 ++---
 src/backend/optimizer/plan/initsplan.c|  8 +++
 src/backend/optimizer/plan/planagg.c  |  4 ++--
 src/backend/optimizer/plan/planner.c  |  4 ++--
 src/backend/optimizer/plan/setrefs.c  | 26 ++--
 src/backend/optimizer/plan/subselect.c| 14 +--
 src/backend/optimizer/prep/prepjointree.c |  2 +-
 src/backend/optimizer/prep/preptlist.c|  2 +-
 src/backend/optimizer/prep/prepunion.c|  4 ++--
 src/backend/optimizer/util/tlist.c| 10 
 src/backend/parser/analyze.c  |  2 +-
 src/backend/parser/gram.y |  2 +-
 src/backend/parser/parse_clause.c |  2 +-
 src/backend/parser/parse_expr.c   |  2 +-
 src/backend/parser/parse_relation.c   |  2 +-
 src/backend/parser/parse_utilcmd.c|  6 ++---
 src/backend/rewrite/rewriteHandler.c  |  8 +++
 src/backend/rewrite/rewriteManip.c|  8 +++
 src/backend/tcop/postgres.c   |  6 ++---
 src/backend/utils/cache/plancache.c   | 14 +--
 src/backend/utils/cache/relcache.c|  8 +++
 src/include/nodes/nodes.h |  4 
 src/include/optimizer/tlist.h |  4 ++--
 src/include/pg_config.h.in|  6 +
 src/include/pg_config.h.win32 |  6 +
 35 files changed, 172 insertions(+), 84 deletions(-)

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 7d901e1f1a..7afaec5f85 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -178,6 +178,33 @@ fi])# PGAC_C_STATIC_ASSERT
 
 
 
+# PGAC_C_TYPEOF
+# -
+# Check if the C compiler understands typeof or a variant.  Define
+# HAVE_TYPEOF if so, and define 'typeof' to the actual key word.
+#
+AC_DEFUN([PGAC_C_TYPEOF],
+[AC_CACHE_CHECK(for typeof, pgac_cv_c_typeof,
+[pgac_cv_c_typeof=no
+for pgac_kw in typeof __typeof__ decltype; do
+  AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
+[int x = 0;
+$pgac_kw(x) y;
+y = x;
+return y;])],
+[pgac_cv_c_typeof=$pgac_kw])
+  test "$pgac_cv_c_typeof" != no && break
+done])
+if test "$pgac_cv_c_typeof" != no; then
+  AC_DEFINE(HAVE_TYPEOF, 1,
+[Define to 1 if your compiler understands `typeof' or something similar.])
+  if test "$pgac_cv_c_typeof" != typeof; then
+AC_DEFINE(typeof, $pgac_cv_c_typeof, [Define to how the compiler spells `typeof'.])
+  fi
+fi])# PGAC_C_TYPEOF
+
+
+
 # PGAC_C_TYPES_COMPATIBLE
 # ---
 # Check if the C compiler understands __builtin_types_compatible_p,
diff --git a/configure b/configure
index 0f143a0fad..5c2e0145f1 100755
--- a/configure
+++ b/configure
@@ -11359,6 +11359,46 @@ if test x"$pgac_cv__static_assert" = xyes ; then
 $as_echo "#define HAVE__STATIC_ASSERT 1" >>confdefs.h
 
 fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for typeof" >&5
+$as_echo_n "checking for typeof... " >&6; }
+if ${pgac_cv_c_typeof+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_cv_c_typeof=no
+for pgac_kw in typeof __typeof__ decltype; do
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+int x = 0;
+$pgac_kw(x) y;
+y = x;
+return y;
+  

[HACKERS] use strict in all Perl programs

2016-12-30 Thread Peter Eisentraut
Here is a patch to add 'use strict' to all Perl programs (that I could
find), or move it to the right place where it was already there.  I
think that is a pretty standard thing to do nowadays.

I tried testing the changes in pgcheckdefines, but it just spits out
nonsense before and after.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 6db551f6ba2a9339051ecc7cabeb29ff59de2b26 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 4 Dec 2016 12:00:00 -0500
Subject: [PATCH 1/2] Use 'use strict' in all Perl programs

---
 contrib/seg/seg-validate.pl| 35 +++---
 contrib/seg/sort-segments.pl   | 10 +--
 doc/src/sgml/mk_feature_tables.pl  |  2 ++
 src/pl/plperl/plc_perlboot.pl  |  2 ++
 src/test/locale/sort-test.pl   |  2 ++
 src/tools/msvc/build.pl|  4 ++-
 src/tools/msvc/gendef.pl   |  6 ++--
 src/tools/msvc/pgflex.pl   |  6 ++--
 src/tools/pginclude/pgcheckdefines | 59 ++
 src/tools/version_stamp.pl | 18 
 10 files changed, 87 insertions(+), 57 deletions(-)

diff --git a/contrib/seg/seg-validate.pl b/contrib/seg/seg-validate.pl
index cb3fb9a099..b8957ed984 100755
--- a/contrib/seg/seg-validate.pl
+++ b/contrib/seg/seg-validate.pl
@@ -1,20 +1,23 @@
 #!/usr/bin/perl
-$integer = '[+-]?[0-9]+';
-$real= '[+-]?[0-9]+\.[0-9]+';
-
-$RANGE = '(\.\.)(\.)?';
-$PLUMIN= q(\'\+\-\');
-$FLOAT = "(($integer)|($real))([eE]($integer))?";
-$EXTENSION = '<|>|~';
-
-$boundary  = "($EXTENSION)?$FLOAT";
-$deviation = $FLOAT;
-
-$rule_1 = $boundary . $PLUMIN . $deviation;
-$rule_2 = $boundary . $RANGE . $boundary;
-$rule_3 = $boundary . $RANGE;
-$rule_4 = $RANGE . $boundary;
-$rule_5 = $boundary;
+
+use strict;
+
+my $integer = '[+-]?[0-9]+';
+my $real= '[+-]?[0-9]+\.[0-9]+';
+
+my $RANGE = '(\.\.)(\.)?';
+my $PLUMIN= q(\'\+\-\');
+my $FLOAT = "(($integer)|($real))([eE]($integer))?";
+my $EXTENSION = '<|>|~';
+
+my $boundary  = "($EXTENSION)?$FLOAT";
+my $deviation = $FLOAT;
+
+my $rule_1 = $boundary . $PLUMIN . $deviation;
+my $rule_2 = $boundary . $RANGE . $boundary;
+my $rule_3 = $boundary . $RANGE;
+my $rule_4 = $RANGE . $boundary;
+my $rule_5 = $boundary;
 
 
 print "$rule_5\n";
diff --git a/contrib/seg/sort-segments.pl b/contrib/seg/sort-segments.pl
index a465468d5b..04eafd92f2 100755
--- a/contrib/seg/sort-segments.pl
+++ b/contrib/seg/sort-segments.pl
@@ -2,6 +2,10 @@
 
 # this script will sort any table with the segment data type in its last column
 
+use strict;
+
+my @rows;
+
 while (<>)
 {
 	chomp;
@@ -10,11 +14,11 @@
 
 foreach (
 	sort {
-		@ar = split("\t", $a);
-		$valA = pop @ar;
+		my @ar = split("\t", $a);
+		my $valA = pop @ar;
 		$valA =~ s/[~<> ]+//g;
 		@ar = split("\t", $b);
-		$valB = pop @ar;
+		my $valB = pop @ar;
 		$valB =~ s/[~<> ]+//g;
 		$valA <=> $valB
 	} @rows)
diff --git a/doc/src/sgml/mk_feature_tables.pl b/doc/src/sgml/mk_feature_tables.pl
index 45dea798cd..93dab2132e 100644
--- a/doc/src/sgml/mk_feature_tables.pl
+++ b/doc/src/sgml/mk_feature_tables.pl
@@ -2,6 +2,8 @@
 
 # doc/src/sgml/mk_feature_tables.pl
 
+use strict;
+
 my $yesno = $ARGV[0];
 
 open PACK, $ARGV[1] or die;
diff --git a/src/pl/plperl/plc_perlboot.pl b/src/pl/plperl/plc_perlboot.pl
index d506d01163..bb2d009be0 100644
--- a/src/pl/plperl/plc_perlboot.pl
+++ b/src/pl/plperl/plc_perlboot.pl
@@ -1,5 +1,7 @@
 #  src/pl/plperl/plc_perlboot.pl
 
+use strict;
+
 use 5.008001;
 use vars qw(%_SHARED $_TD);
 
diff --git a/src/test/locale/sort-test.pl b/src/test/locale/sort-test.pl
index ce7b93c571..cb7e4934e4 100755
--- a/src/test/locale/sort-test.pl
+++ b/src/test/locale/sort-test.pl
@@ -1,4 +1,6 @@
 #! /usr/bin/perl
+
+use strict;
 use locale;
 
 open(INFILE, "<$ARGV[0]");
diff --git a/src/tools/msvc/build.pl b/src/tools/msvc/build.pl
index a5469cd289..2e7c54853a 100644
--- a/src/tools/msvc/build.pl
+++ b/src/tools/msvc/build.pl
@@ -2,6 +2,8 @@
 
 # src/tools/msvc/build.pl
 
+use strict;
+
 BEGIN
 {
 
@@ -68,6 +70,6 @@ BEGIN
 
 # report status
 
-$status = $? >> 8;
+my $status = $? >> 8;
 
 exit $status;
diff --git a/src/tools/msvc/gendef.pl b/src/tools/msvc/gendef.pl
index a6c43c2c39..3bcff7ffaf 100644
--- a/src/tools/msvc/gendef.pl
+++ b/src/tools/msvc/gendef.pl
@@ -1,11 +1,11 @@
-my @def;
-
-use warnings;
 use strict;
+use warnings;
 use 5.8.0;
 use File::Spec::Functions qw(splitpath catpath);
 use List::Util qw(max);
 
+my @def;
+
 #
 # Script that generates a .DEF file for all objects in a directory
 #
diff --git a/src/tools/msvc/pgflex.pl b/src/tools/msvc/pgflex.pl
index 474ce63e5c..3a42add0d2 100644
--- a/src/tools/msvc/pgflex.pl
+++ b/src/tools/msvc/pgflex.pl
@@ -2,12 +2,12 @@
 
 # src/tools/msvc/pgflex.pl
 
-# silence flex bleatings about file path style
-$ENV{CYGWIN} = 'nodosfilewarning';
-
 use strict;
 use File::Basename;
 
+# silence flex bleatings about 

[HACKERS] port of INSTALL file generation to XSLT

2016-12-30 Thread Peter Eisentraut
A further step toward getting rid of the DSSSL tool chain requirement,
here is a patch to change the generation of the text INSTALL file to use
XLST and Pandoc.

The change to Pandoc is not essential to this change, but it creates
much better looking output and simplifies the locale/encoding handling
over using Lynx.

We'll need to get Pandoc installed on borka and check that that version
works as well as the one I have been using.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 41fe676ae519e0601b67e5ea57cf32eeb266f40f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 27 Dec 2016 12:00:00 -0500
Subject: [PATCH] Create INSTALL file via XSLT and Pandoc

Replace the old tool chain of jade and lynx with using xsltproc and
pandoc.  As before, we create an intermediate HTML file and convert that
to plain text.  Replacing jade with xsltproc removes jade from the
requirements for distribution building.  The change to pandoc is
incidental, but it creates better looking output and it avoids the
delicate locale/encoding issues of lynx because it always uses UTF-8 for
both input and output.
---
 doc/src/sgml/.gitignore  |  1 +
 doc/src/sgml/Makefile| 47 +++--
 doc/src/sgml/stylesheet-text.xsl | 89 
 3 files changed, 113 insertions(+), 24 deletions(-)
 create mode 100644 doc/src/sgml/stylesheet-text.xsl

diff --git a/doc/src/sgml/.gitignore b/doc/src/sgml/.gitignore
index 2f0329c15f..8197c0140d 100644
--- a/doc/src/sgml/.gitignore
+++ b/doc/src/sgml/.gitignore
@@ -21,6 +21,7 @@
 # Assorted byproducts from building the above
 /postgres.xml
 /INSTALL.html
+/INSTALL.xml
 /postgres-US.aux
 /postgres-US.log
 /postgres-US.out
diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index fe7ca65cd4..0c18a8a84a 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -217,26 +217,22 @@ postgres.pdf:
 
 
 ##
-## Semi-automatic generation of some text files.
+## Generation of some text files
 ##
 
-JADE.text = $(JADE) $(JADEFLAGS) $(SGMLINCLUDE) $(CATALOG) -d stylesheet.dsl -i output-text -t sgml
 ICONV = iconv
-LYNX = lynx
-
-# The documentation may contain non-ASCII characters (mostly for
-# contributor names), which lynx converts to the encoding determined
-# by the current locale.  To get text output that is deterministic and
-# easily readable by everyone, we make lynx produce LATIN1 and then
-# convert that to ASCII with transliteration for the non-ASCII characters.
-# Official releases were historically built on FreeBSD, which has limited
-# locale support and is very picky about locale name spelling.  The
-# below has been finely tuned to run on FreeBSD and Linux/glibc.
+PANDOC = pandoc
+
 INSTALL: % : %.html
-	$(PERL) -p -e 's/ $@
+	$(PANDOC) $< -t plain | $(ICONV) -f utf8 -t us-ascii//TRANSLIT > $@
+
+INSTALL.html: %.html : stylesheet-text.xsl %.xml
+	$(XMLLINT) --noout --valid $*.xml
+	$(XSLTPROC) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $^ >$@
 
-INSTALL.html: standalone-install.sgml installation.sgml version.sgml
-	$(JADE.text) -V nochunks standalone-install.sgml installation.sgml > $@
+INSTALL.xml: standalone-install.sgml installation.sgml version.sgml
+	$(OSX) -D. -x lower $(filter-out version.sgml,$^) >$@.tmp
+	$(call mangle-xml,chapter)
 
 
 ##
@@ -247,12 +243,15 @@ INSTALL.html: standalone-install.sgml installation.sgml version.sgml
 # if we try to do "make all" in a VPATH build without the explicit
 # $(srcdir) on the postgres.sgml dependency in this rule.  GNU make bug?
 postgres.xml: $(srcdir)/postgres.sgml $(ALMOSTALLSGML)
-	$(OSX) -D. -x lower -i include-xslt-index $< >postgres.xmltmp
-	$(PERL) -p -e 's/\[(aacute|acirc|aelig|agrave|amp|aring|atilde|auml|bull|copy|eacute|egrave|gt|iacute|lt|mdash|nbsp|ntilde|oacute|ocirc|oslash|ouml|pi|quot|scaron|uuml) *\]/\&\1;/gi;' \
-	   -e '$$_ .= qq{http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd;>\n} if $$. == 1;' \
-	   $@
-	rm postgres.xmltmp
-# ' hello Emacs
+	$(OSX) -D. -x lower -i include-xslt-index $< >$@.tmp
+	$(call mangle-xml,book)
+
+define mangle-xml
+$(PERL) -p -e 's/\[(aacute|acirc|aelig|agrave|amp|aring|atilde|auml|bull|copy|eacute|egrave|gt|iacute|lt|mdash|nbsp|ntilde|oacute|ocirc|oslash|ouml|pi|quot|scaron|uuml) *\]/\&\1;/gi;' \
+   -e '$$_ .= qq{http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd;>\n} if $$. == 1;' \
+  <$@.tmp >$@
+rm $@.tmp
+endef
 
 ifeq ($(STYLE),website)
 XSLTPROC_HTML_FLAGS += --param website.stylesheet 1
@@ -386,13 +385,13 @@ check-tabs:
 # This allows removing some files from the distribution tarballs while
 # keeping the dependencies satisfied.
 .SECONDARY: postgres.xml $(GENERATED_SGML) HTML.index
-.SECONDARY: INSTALL.html
+.SECONDARY: INSTALL.html 

[HACKERS] Group clear xid can leak semaphore count

2016-12-30 Thread Amit Kapila
During the review of Group update Clog patch [1], Dilip noticed an
issue with the patch where it can leak the semaphore count in one of
the corner case.  I have checked and found that similar issue exists
for Group clear xid (ProcArrayGroupClearXid) as well.  I think the
reason why this problem is not noticed by anyone till now is that it
can happen only in a rare scenario when the backend waiting for xid
clear is woken up due to some unrelated signal.  This problem didn't
exist in the original commit
(0e141c0fbb211bdd23783afa731e3eef95c9ad7a) of the patch, but later
while fixing some issues in the committed patch, it got introduced in
commit 4aec49899e5782247e134f94ce1c6ee926f88e1c. Patch to fix the
issue is attached.


[1] - 
https://www.postgresql.org/message-id/CAA4eK1J%2B67edo_Wnrfx8oJ%2BrWM_BAr%2Bv6JqvQjKPdLOxR%3D0d5g%40mail.gmail.com
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_absorbed_wakeups_clear_xid_v1.patch
Description: Binary data



Re: [HACKERS] Add doc advice about systemd RemoveIPC

2016-12-30 Thread Peter Eisentraut
On 12/30/16 3:59 AM, Magnus Hagander wrote:
> I wonder if I missed part of the discussions around this, so maybe my
> understanding of the cases where this occurs is wrong, but isn't it the
> case of pretty much all (or actually all) the packaged versions of
> postgresql out there (debian, redhat etc) that they do the right thing,
> as in that they create "postgres" as a system user?

If you install a package but the user already exists, then the package
will just use that user.  So just using a package is not a guarantee
that everything will be alright.
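One way to check the situation by hand (my own sketch, not from the docs patch; the uid boundary of 1000 is a common default from /etc/login.defs, not a universal constant):

```shell
#!/bin/sh
# systemd's RemoveIPC spares system users, so the relevant question is
# whether the account's uid falls below the login-uid boundary.

is_system_uid() {
    [ "$1" -lt 1000 ]
}

check_account() {
    uid=$(id -u "$1" 2>/dev/null) || { echo "$1: no such user"; return 1; }
    if is_system_uid "$uid"; then
        echo "$1: system user (uid=$uid); RemoveIPC should leave its IPC alone"
    else
        echo "$1: regular user (uid=$uid); RemoveIPC may reap its shared memory"
    fi
}

check_account root
```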

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [sqlsmith] Short reads in hash indexes

2016-12-30 Thread Amit Kapila
On Fri, Dec 30, 2016 at 3:45 AM, Andreas Seltenreich  wrote:
> Amit Kapila writes:
>
>> Can you please try with the patch posted on hash index thread [1] to
>> see if you can reproduce any of these problems?
>>
>> [1] - 
>> https://www.postgresql.org/message-id/CAA4eK1Kf6tOY0oVz_SEdngiNFkeXrA3xUSDPPORQvsWVPdKqnA%40mail.gmail.com
>
> I'm no longer seeing the failed assertions nor short reads since these
> patches are in.
>

Thanks for the confirmation!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-12-30 Thread Amit Kapila
On Thu, Dec 29, 2016 at 10:41 AM, Dilip Kumar  wrote:
>
> I have done one more pass of the review today. I have few comments.
>
> + if (nextidx != INVALID_PGPROCNO)
> + {
> + /* Sleep until the leader updates our XID status. */
> + for (;;)
> + {
> + /* acts as a read barrier */
> + PGSemaphoreLock(>sem);
> + if (!proc->clogGroupMember)
> + break;
> + extraWaits++;
> + }
> +
> + Assert(pg_atomic_read_u32(>clogGroupNext) == INVALID_PGPROCNO);
> +
> + /* Fix semaphore count for any absorbed wakeups */
> + while (extraWaits-- > 0)
> + PGSemaphoreUnlock(>sem);
> + return true;
> + }
>
> 1. extraWaits is used only locally in this block so I guess we can
> declare inside this block only.
>

Agreed and changed accordingly.

> 2. It seems that we have missed one unlock in case of absorbed
> wakeups. You have initialised extraWaits with -1 and if there is one
> extra wake up then extraWaits will become 0 (it means we have made one
> extra call to PGSemaphoreLock and it's our responsibility to fix it as
> the leader will Unlock only once). But it appear in such case we will
> not make any call to PGSemaphoreUnlock.
>

Good catch!  I have fixed it by initialising extraWaits to 0.  This
same issue exists from Group clear xid for which I will send a patch
separately.

Apart from above, the patch needs to be adjusted for commit be7b2848
which has changed the definition of PGSemaphore.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


group_update_clog_v10.patch
Description: Binary data



[HACKERS] Add support to COMMENT ON CURRENT DATABASE

2016-12-30 Thread Fabrízio de Royes Mello
Hi all,

The attached patch is reworked from a previous one [1] to better deal with
get_object_address and pg_get_object_address.

Regards,

[1] https://www.postgresql.org/message-id/20150317171836.gc10...@momjian.us

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/doc/src/sgml/ref/comment.sgml b/doc/src/sgml/ref/comment.sgml
index c1cf587..51b39ab 100644
--- a/doc/src/sgml/ref/comment.sgml
+++ b/doc/src/sgml/ref/comment.sgml
@@ -31,6 +31,7 @@ COMMENT ON
   CONSTRAINT constraint_name ON table_name |
   CONSTRAINT constraint_name ON DOMAIN domain_name |
   CONVERSION object_name |
+  CURRENT DATABASE |
   DATABASE object_name |
   DOMAIN object_name |
   EXTENSION object_name |
@@ -96,6 +97,11 @@ COMMENT ON
   
 
   
+   The CURRENT DATABASE form means the comment will be applied to the database
+   where the command is executed.
+  
+
+  
 Comments can be viewed using psql's
 \d family of commands.
 Other user interfaces to retrieve comments can be built atop
@@ -307,6 +313,7 @@ COMMENT ON COLUMN my_table.my_column IS 'Employee ID number';
 COMMENT ON CONVERSION my_conv IS 'Conversion to UTF8';
 COMMENT ON CONSTRAINT bar_col_cons ON bar IS 'Constrains column col';
 COMMENT ON CONSTRAINT dom_col_constr ON DOMAIN dom IS 'Constrains col of domain';
+COMMENT ON CURRENT DATABASE IS 'Current Database Comment';
 COMMENT ON DATABASE my_database IS 'Development Database';
 COMMENT ON DOMAIN my_domain IS 'Email Address Domain';
 COMMENT ON EXTENSION hstore IS 'implements the hstore data type';
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index bb4b080..71bffae 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -621,6 +621,9 @@ static const struct object_type_map
 	{
 		"database", OBJECT_DATABASE
 	},
+	{
+		"current database", OBJECT_DATABASE
+	},
 	/* OCLASS_TBLSPACE */
 	{
 		"tablespace", OBJECT_TABLESPACE
@@ -1053,9 +1056,12 @@ get_object_address_unqualified(ObjectType objtype,
 
 	/*
 	 * The types of names handled by this function are not permitted to be
-	 * schema-qualified or catalog-qualified.
+	 * schema-qualified or catalog-qualified. 
+	 *
+	 * If "CURRENT DATABASE" is used, the target object name is NIL, so we
+	 * don't need to do this check.
 	 */
-	if (list_length(qualname) != 1)
+	if (list_length(qualname) > 1)
 	{
 		const char *msg;
 
@@ -1101,7 +1107,10 @@ get_object_address_unqualified(ObjectType objtype,
 	}
 
 	/* Format is valid, extract the actual name. */
-	name = strVal(linitial(qualname));
+	if (list_length(qualname) > 0)
+		name = strVal(linitial(qualname));
+	else
+		name = NULL;
 
 	/* Translate name to OID. */
 	switch (objtype)
@@ -1113,7 +1122,10 @@ get_object_address_unqualified(ObjectType objtype,
 			break;
 		case OBJECT_DATABASE:
 			address.classId = DatabaseRelationId;
-			address.objectId = get_database_oid(name, missing_ok);
+			if (name != NULL)
+				address.objectId = get_database_oid(name, missing_ok);
+			else
+				address.objectId = MyDatabaseId;
 			address.objectSubId = 0;
 			break;
 		case OBJECT_EXTENSION:
@@ -1951,7 +1963,7 @@ pg_get_object_address(PG_FUNCTION_ARGS)
 	else
 	{
 		name = textarray_to_strvaluelist(namearr);
-		if (list_length(name) < 1)
+		if (list_length(name) < 1 && type != OBJECT_DATABASE)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 	 errmsg("name list length must be at least %d", 1)));
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 08cf5b7..b87aa5a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -6101,7 +6101,7 @@ opt_restart_seqs:
  *	the object associated with the comment. The form of the statement is:
  *
  *	COMMENT ON [ [ ACCESS METHOD | CONVERSION | COLLATION |
- * DATABASE | DOMAIN |
+ * DATABASE | CURRENT DATABASE | DOMAIN |
  * EXTENSION | EVENT TRIGGER | FOREIGN DATA WRAPPER |
  * FOREIGN TABLE | INDEX | [PROCEDURAL] LANGUAGE |
  * MATERIALIZED VIEW | POLICY | ROLE | SCHEMA | SEQUENCE |
@@ -6135,6 +6135,15 @@ CommentStmt:
 					n->comment = $6;
 					$$ = (Node *) n;
 				}
+			| COMMENT ON CURRENT_P DATABASE IS comment_text
+				{
+					CommentStmt *n = makeNode(CommentStmt);
+					n->objtype = OBJECT_DATABASE;
+					n->objname = NIL;
+					n->objargs = NIL;
+					n->comment = $6;
+					$$ = (Node *) n;
+				}
 			| COMMENT ON TYPE_P Typename IS comment_text
 				{
 					CommentStmt *n = makeNode(CommentStmt);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e5545b3..2e4746a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -2679,10 +2679,20 @@ dumpDatabase(Archive *fout)
 			

Re: [HACKERS] Unusable SP-GiST index

2016-12-30 Thread Tom Lane
I wrote:
> Maybe we should redefine the API as involving a TupleTableSlot that
> the AM is supposed to fill --- basically, moving StoreIndexTuple
> out of the common code in nodeIndexonlyscan.c and requiring the AM
> to do that work.  The possible breakage of third-party code is a
> bit annoying, but there can't be all that many third-party AMs
> out there yet.

After looking a bit at gist and sp-gist, neither of them would find that
terribly convenient; they really want to create one blob of memory per
index entry so as to not complicate storage management too much.  But
they'd be fine with making that blob be a HeapTuple not IndexTuple.
So maybe the right approach is to expand the existing API to allow the
AM to return *either* a heap or index tuple; that could be made to not
be an API break.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unusable SP-GiST index

2016-12-30 Thread Tom Lane
Vik Fearing  writes:
> While trying to find a case where spgist wins over btree for text, I
> came across the following behavior which I would consider a bug:

> CREATE TABLE texts (value text);
> INSERT INTO texts SELECT repeat('a', (2^20)::integer);
> CREATE INDEX ON texts USING spgist (value);
> SET enable_seqscan = off;
> TABLE texts;

> That produces:
> ERROR:  index row requires 12024 bytes, maximum size is 8191

Hmm ... it's not really SP-GiST's fault.  This query is trying to do
an index-only scan, and the API defined for that requires the index
to hand back an IndexTuple, which is of (very) limited size.
SP-GiST is capable of dealing with values much larger than one page,
but there's no way to hand them back through that API.

Maybe we should redefine the API as involving a TupleTableSlot that
the AM is supposed to fill --- basically, moving StoreIndexTuple
out of the common code in nodeIndexonlyscan.c and requiring the AM
to do that work.  The possible breakage of third-party code is a
bit annoying, but there can't be all that many third-party AMs
out there yet.

regards, tom lane




Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 10:59 PM, Ashutosh Bapat
 wrote:
>>
>> Well, flushing the meta-data of pg_twophase is really going to be far
>> cheaper than the many pages done until CheckpointTwoPhase is reached.
>> There should really be a check on serialized_xacts for the
>> non-recovery code path, but considering how cheap that's going to be
>> compared to the rest of the restart point stuff it is not worth the
>> complexity of adding a counter, for example in shared memory with
>> XLogCtl (the counter gets reinitialized at each checkpoint,
>> incremented when replaying a 2PC prepare, decremented with a 2PC
>> commit). So to reduce the backpatch impact I would just do the fsync
>> if (serialized_xact > 0 || RecoveryInProgress()) and call it a day.
>
> Sounds ok to me. I will leave it to the committer to decide further.

Attached is an updated patch. I also noticed that it is better to do
the fsync call before the dtrace call for checkpoint end as well as
logging.
-- 
Michael


2pc-loss-v2.patch
Description: application/stream



Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Craig Ringer
On 30 December 2016 at 21:00, Fabien COELHO  wrote:

> As for "slow", I have just tested overheads with pgbench, comparing a direct
> arithmetic operation (as a proxy to a fast session variable consultation) to
> constant returning plpgsql functions with security definer and security
> invoker, on a direct socket connection, with prepared statements:
>
>   select 1 + 0: 0.020 ms
>   select one_sd() : 0.024 ms
>   select one_si() : 0.024 ms

That's one call per executor run. Not really an effective test.

Consider cases like row security where you're testing many rows.
Hopefully the planner will inline the test if it's a function declared
stable, but it may not.


> However the one-row property is just hoped for, and on principle a database
> is about declaring constraints that are enforced afterwards.
>
> I see two clean solutions to this use case: declaring tables as one row, or
> having scalar objects.


I agree that's a common issue.

The unique index on a constant hack in postgres works, though it's ugly.

Adding a whole new different storage concept seems like massive
overkill for this problem, which is minor and already easily solved.
Someone could make 1-row tables prettier with a new constraint type
instead maybe, if it's really considered that ugly. Personally I'd
just document the unique expression index hack.

CREATE UNIQUE INDEX onerow ON mytable((1));
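As a hedged illustration of the hack (the table name and values are invented for the example): the unique index on a constant expression means a second row can never be inserted, giving a de-facto one-row table:

```sql
-- Sketch: enforcing a one-row table with a unique expression index.
CREATE TABLE mytable (value text);
CREATE UNIQUE INDEX onerow ON mytable((1));

INSERT INTO mytable VALUES ('first');   -- succeeds
INSERT INTO mytable VALUES ('second');  -- fails: duplicate key violates "onerow"
UPDATE mytable SET value = 'updated';   -- updating the single row is fine
```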

>> * On what basis do you _oppose_ persistently defining variables in the
>> catalogs as their own entities?
>
> I understand that you are speaking of "persistent session variables".
>
> For me a database is about persistence (metadata & data) with safety
> (transactions) and security (permissions)... and maybe performance:-)
>
> Pavel's proposal creates a new object with 2 (secure metadata-persistence)
> out of 4 properties... I'm not at ease with introducing a new half-database
> concept in a database.

I strongly disagree. If you want "all-database" properties ... use tables.

We generally add new features when that's not sufficient to achieve
something. Most notably SEQUENCEs, which deliberately violate
transaction isolation and atomicity in order to deliver a compelling
benefit not otherwise achievable.

Similarly for advisory locking.
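The sequence precedent can be seen directly: the increment survives a rollback, which is exactly the kind of deliberate isolation violation being described. A minimal sketch:

```sql
-- Sequences deliberately escape transactional atomicity:
CREATE SEQUENCE s;
BEGIN;
SELECT nextval('s');   -- returns 1
ROLLBACK;
SELECT nextval('s');   -- returns 2: the rollback did not undo the increment
```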

> On the other hand there are dynamic session variables (mysql, mssql, oracle
> have some variants) which are useful on their own without pretending to be
> database objects (no CREATE/ALTER/DROP, GRANT/REVOKE).

We have precedent here with sequences. Yes, they do confuse users, but
they're also VERY useful, and the properties of variables would be
clearer IMO.

I'm not especially attached to doing them as database objects; I'm
just as happy with something declared at session start by some
function that then intends to set and use the variable. But I don't
think your argument against a DDL-like approach holds water.

>> (My own objection is that "temporary variables" would make our existing
>> catalog bloat issues for temp objects even worse).
>
>
> I do agree that inefficient temporary variables are worthless, but ISTM that
> Pavel's proposal is not exactly about temporary variables, it is about
> temporary-valued permanent-variables. So there is no temporary (on the fly)
> variable as such, and if it is extended for this purpose then indeed the
> catalog costs look expensive.

I meant that we'd certainly want CREATE TEMPORARY VARIABLE for ones
that go away at end of session, if we were going to have
catalog-object-like variables. Which would result in catalog bloat.

> (1) Having some kind of variable, especially in interactive mode, allows one
> to manipulate previous results and reuse them later, without having to resort
> to repeated sub-queries or to retype non-trivial values.
>
> Client side psql :-variables are untyped and unescaped, thus not very
> convenient for this purpose.

You can currently (ab)use user defined GUCs for this. Ugly, but
effective, and honestly something we could bless into general use if
we decided to. It's not that bad.
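A minimal sketch of the GUC (ab)use described above, using only existing server facilities (the `myapp.user_id` name is an invented example):

```sql
-- Custom GUCs as poor-man's session variables:
SELECT set_config('myapp.user_id', '42', false);  -- false = lasts for the session
SELECT current_setting('myapp.user_id');          -- '42'

-- Usable from row security policies or plpgsql, e.g.:
--   USING (owner_id = current_setting('myapp.user_id')::int)
```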

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Indirect indexes

2016-12-30 Thread Alvaro Herrera
Attached is v4, which fixes a couple of relatively minor bugs.  There
are still things to tackle before this is committable, but coding review
of the new executor node would be welcome.

The big remaining item is still fitting the PK data into the TID's 6 bytes.
I've been looking at reworking the btree code to allow for an arbitrary
size; it doesn't look impossible although it's going to be rather
invasive.  Also, vacuuming: my answer continues to be that the killtuple
interface should be good enough, but it's possible to vacuum the index
separately from vacuuming the table anyway if you do a full scan and
check the PK tuples for each indirect tuple.

This patch implements killtuple: a scan that sees an indirect tuple not
returning anything from the heap marks the tuple as LP_DEAD.  Later,
when the page is about to be split, those tuples are removed.

I also have a note in the code about not inserting an indirect tuple
when an identical one already exists.  This is a correctness issue:
otherwise we would return duplicated heap rows in certain cases.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 40f201b..4b21216 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -1037,6 +1037,8 @@ amrestrpos (IndexScanDesc scan);
for the same tuple values as were used in the original insertion.
   
  
+
+ XXX describe UNIQUE_CHECK_INSERT_SINGLETON here 

 
   
 
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 1b45a4c..9f899c7 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -92,6 +92,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amstorage = true;
amroutine->amclusterable = false;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = InvalidOid;
 
amroutine->ambuild = brinbuild;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index f07eedc..1bc91d2 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -49,6 +49,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amstorage = true;
amroutine->amclusterable = false;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = InvalidOid;
 
amroutine->ambuild = ginbuild;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index b8aa9bc..4ec34d5 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -69,6 +69,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amstorage = true;
amroutine->amclusterable = true;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = InvalidOid;
 
amroutine->ambuild = gistbuild;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 1fa087a..a2cf278 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -66,6 +66,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amstorage = false;
amroutine->amclusterable = false;
amroutine->ampredlocks = false;
+   amroutine->amcanindirect = false;
amroutine->amkeytype = INT4OID;
 
amroutine->ambuild = hashbuild;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 19edbdf..a6e859c 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3411,6 +3411,8 @@ simple_heap_delete(Relation relation, ItemPointer tid)
  * crosscheck - if not InvalidSnapshot, also check old tuple against this
  * wait - true if should wait for any conflicting update to commit/abort
  * hufd - output parameter, filled in failure cases (see below)
+ * unchanged_ind_cols - output parameter; bits set for unmodified columns
+ * that are indexed by indirect indexes
  * lockmode - output parameter, filled with lock mode acquired on tuple
  *
  * Normal, successful return value is HeapTupleMayBeUpdated, which
@@ -3433,13 +3435,15 @@ simple_heap_delete(Relation relation, ItemPointer tid)
 HTSU_Result
 heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
-   HeapUpdateFailureData *hufd, LockTupleMode *lockmode)
+   HeapUpdateFailureData *hufd, Bitmapset **unchanged_ind_cols,
+   LockTupleMode *lockmode)
 {
HTSU_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset  *hot_attrs;
Bitmapset  *key_attrs;
Bitmapset  *id_attrs;
+   Bitmapset  *indirect_attrs;
Bitmapset  *interesting_attrs;
Bitmapset  *modified_attrs;
ItemId  lp;
@@ 

Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2016-12-30 Thread Vik Fearing
On 12/30/2016 06:46 PM, David Fetter wrote:
> On Fri, Dec 30, 2016 at 09:57:25AM -0500, Stephen Frost wrote:
> 
>> Let's make this a clean break/change.
> 
> +1 

+1
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




[HACKERS] Unusable SP-GiST index

2016-12-30 Thread Vik Fearing
While trying to find a case where spgist wins over btree for text, I
came across the following behavior which I would consider a bug:

CREATE TABLE texts (value text);
INSERT INTO texts SELECT repeat('a', (2^20)::integer);
CREATE INDEX ON texts USING spgist (value);
SET enable_seqscan = off;
TABLE texts;

That produces:

ERROR:  index row requires 12024 bytes, maximum size is 8191

It seems to me the index should not be allowed to be created if it won't
be usable.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] rewrite HeapSatisfiesHOTAndKey

2016-12-30 Thread Alvaro Herrera
Here's a version with fixed comments.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index ea579a0..19edbdf 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -95,11 +95,8 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
Buffer newbuf, HeapTuple oldtup,
HeapTuple newtup, HeapTuple old_key_tup,
bool all_visible_cleared, bool new_all_visible_cleared);
-static void HeapSatisfiesHOTandKeyUpdate(Relation relation,
-Bitmapset *hot_attrs,
-Bitmapset *key_attrs, Bitmapset *id_attrs,
-bool *satisfies_hot, bool *satisfies_key,
-bool *satisfies_id,
+static Bitmapset *HeapDetermineModifiedColumns(Relation relation,
+Bitmapset *interesting_cols,
 HeapTuple oldtup, HeapTuple newtup);
 static bool heap_acquire_tuplock(Relation relation, ItemPointer tid,
 LockTupleMode mode, LockWaitPolicy wait_policy,
@@ -3443,6 +3440,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Bitmapset  *hot_attrs;
Bitmapset  *key_attrs;
Bitmapset  *id_attrs;
+   Bitmapset  *interesting_attrs;
+   Bitmapset  *modified_attrs;
ItemId  lp;
HeapTupleData oldtup;
HeapTuple   heaptup;
@@ -3460,9 +3459,6 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
pagefree;
boolhave_tuple_lock = false;
booliscombo;
-   boolsatisfies_hot;
-   boolsatisfies_key;
-   boolsatisfies_id;
booluse_hot_update = false;
boolkey_intact;
boolall_visible_cleared = false;
@@ -3489,21 +3485,30 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 errmsg("cannot update tuples during a parallel operation")));
 
/*
-* Fetch the list of attributes to be checked for HOT update.  This is
-* wasted effort if we fail to update or have to put the new tuple on a
-* different page.  But we must compute the list before obtaining buffer
-* lock --- in the worst case, if we are doing an update on one of the
-* relevant system catalogs, we could deadlock if we try to fetch the list
-* later.  In any case, the relcache caches the data so this is usually
-* pretty cheap.
+* Fetch the list of attributes to be checked for various operations.
 *
-* Note that we get a copy here, so we need not worry about relcache flush
-* happening midway through.
+* For HOT considerations, this is wasted effort if we fail to update or
+* have to put the new tuple on a different page.  But we must compute the
+* list before obtaining buffer lock --- in the worst case, if we are doing
+* an update on one of the relevant system catalogs, we could deadlock if
+* we try to fetch the list later.  In any case, the relcache caches the
+* data so this is usually pretty cheap.
+*
+* We also need columns used by the replica identity, the columns that
+* are considered the "key" of rows in the table, and columns that are
+* part of indirect indexes.
+*
+* Note that we get copies of each bitmap, so we need not worry about
+* relcache flush happening midway through.
 */
hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
  INDEX_ATTR_BITMAP_IDENTITY_KEY);
+   interesting_attrs = bms_add_members(NULL, hot_attrs);
+   interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
+   interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
+
 
block = ItemPointerGetBlockNumber(otid);
buffer = ReadBuffer(relation, block);
@@ -3524,7 +3529,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
Assert(ItemIdIsNormal(lp));
 
/*
-* Fill in enough data in oldtup for HeapSatisfiesHOTandKeyUpdate to 
work
+* Fill in enough data in oldtup for HeapDetermineModifiedColumns to 

Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Pavel Stehule
2016-12-30 18:39 GMT+01:00 Fabien COELHO :

>
> DECLARE @var EXTERNAL

>>>
>>> I do not know what you mean by 'EXTERNAL'.
>>>
>>
>> it means "used as shared in session"
>>
>
> Shared by whom? There is no such thing in the proposal I have made on the
> wiki or in mails. In the proposal variables are private to the role which
> creates them.


Shared by functions in the session.


>
>
> It is possible to find "some" typos, depending on the code: you can check
>>> that a variable is both assigned and used somewhere, otherwise it is very
>>> probably a typo. Perl does that *statically*, before executing a script.
>>>
>>
>> "Before execution" .. I am speaking "without execution".
>>
>
> Before execution is also without execution. You can run "perl -c" to get
> the warning.
>
> theoretically, if you check all possible functions, you can find some
>> issues - but you have to check all function on server.
>>
>
> Yes, sure. It seems to be the same with your proposal: if a hidden
> function drops and recreates a session variable with a different type, or
> changes its permission, then some static checks are void as well, that is
> life. Also, a SQL function may access and modify the variables
> unexpectedly, that would be missed by a PL/pgSQL analysis tool... There is
> no miracle.
>
>
No - metadata, in my design, are persistent, like tables, so you don't have
to assume that any function can drop a variable. The deployment of secure
variables is the same as table deployment. Nothing dynamic there.


> Basically, there may be issues even if static analysis tools says that all
> is well.
>
> Elsewhere you cannot to find typo in DECLARE statement.
>>
>
> Indeed, probably there exists some class of typos that may not be found by
> some static analysis implementations on PL/pgSQL functions which uses basic
> session variables.
>

Yes, and I would not want to add any new case that cannot be covered by
plpgsql_check. Dynamic SQL and our temporary tables are enough issues
already.
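For readers unfamiliar with the tool being referenced: plpgsql_check can flag an undeclared identifier without executing the function, which is the kind of static coverage under discussion. A hedged sketch (assumes the plpgsql_check extension is installed; function and variable names are invented):

```sql
-- CREATE FUNCTION succeeds even with the typo, since plpgsql resolves
-- unknown identifiers at run time...
CREATE FUNCTION f() RETURNS int LANGUAGE plpgsql AS $$
DECLARE n int := 1;
BEGIN
  RETURN m;  -- typo: "m" instead of "n"
END $$;

-- ...but static analysis reports it without running f():
SELECT * FROM plpgsql_check_function('f()');
```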


>
> By the way, are you planing to contribute to the wiki?
>
> https://wiki.postgresql.org/wiki/Variable_Design


I wrote my notes there.




>
>
> --
> Fabien.
>


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2016-12-30 Thread David Fetter
On Fri, Dec 30, 2016 at 09:57:25AM -0500, Stephen Frost wrote:

> Let's make this a clean break/change.

+1 

If there are known management gizmos to notify, it'd be nice to try to
give them some sort of warning, but even for them, the release notes
should spell it out clearly.

That business where people would delete files with "log" in the path,
not infrequently on a system running at capacity, isn't just
theoretical.  I've seen people lose data permanently that way, and I
know I'm not the only one who's seen this in the real world.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Fabien COELHO



DECLARE @var EXTERNAL


I do not know what you mean by 'EXTERNAL'.


it means "used as shared in session"


Shared by whom? There is no such thing in the proposal I have made on the 
wiki or in mails. In the proposal variables are private to the role which 
creates them.



It is possible to find "some" typos, depending on the code: you can check
that a variable is both assigned and used somewhere, otherwise it is very
probably a typo. Perl does that *statically*, before executing a script.


"Before execution" .. I am speaking "without execution".


Before execution is also without execution. You can run "perl -c" to get 
the warning.



theoretically, if you check all possible functions, you can find some
issues - but you have to check all function on server.


Yes, sure. It seems to be the same with your proposal: if a hidden 
function drops and recreates a session variable with a different type, or 
changes its permission, then some static checks are void as well, that is 
life. Also, a SQL function may access and modify the variables 
unexpectedly, that would be missed by a PL/pgSQL analysis tool... There is 
no miracle.


Basically, there may be issues even if static analysis tools says that all 
is well.



Elsewhere you cannot to find typo in DECLARE statement.


Indeed, probably there exists some class of typos that may not be found by 
some static analysis implementations on PL/pgSQL functions which uses 
basic session variables.


By the way, are you planing to contribute to the wiki?

https://wiki.postgresql.org/wiki/Variable_Design

--
Fabien.




Re: [HACKERS] Broken atomics code on PPC with FreeBSD 10.3

2016-12-30 Thread Tom Lane
Andres Freund  writes:
> On December 30, 2016 4:48:22 PM GMT+01:00, Tom Lane  
> wrote:
>> and got no warnings and the attached output.  I'm not very good at
>> reading PPC assembler, but I think what is happening in the "char" case is 
>> that
>> gcc is trying to emulate a byte-wide operation using a word-wide one,
>> ie an lwarx/stwcx. loop. 

> Hm.  This seems to suggest a straight out code generation bug in that 
> compiler, not a failure in intrinsic detection.

It does smell like a codegen bug; I'm planning to file a FreeBSD bug
report once I have more data (like whether 11.0 is busted similarly).

But that doesn't really detract from my point, which is that it's
totally silly for us to imagine that char and word-wide atomic ops are
interchangeable on all platforms and we can flip a coin to decide which
to use as long as the configure probes don't fail outright.  Even given
working code for the byte case, we ought to prefer int atomics on PPC.
On other platforms, maybe the preference goes the other way.  I'd be
inclined to follow the hard-won knowledge embedded in s_lock.h about
which to prefer.

Or in words of one syllable: this isn't the first bug we've seen in
compiler intrinsics, and it won't be the last.  We need a way to deal
with that, not just claim it isn't our problem.

regards, tom lane




Re: [HACKERS] Broken atomics code on PPC with FreeBSD 10.3

2016-12-30 Thread Andres Freund


On December 30, 2016 4:48:22 PM GMT+01:00, Tom Lane  wrote:
>and got no warnings and the attached output.  I'm not very good at
>reading
>PPC assembler, but I think what is happening in the "char" case is that
>gcc is trying to emulate a byte-wide operation using a word-wide one,
>ie an lwarx/stwcx. loop. 

Hm.  This seems to suggest a straight out code generation bug in that compiler, 
not a failure in intrinsic detection.

I'll note that there's certainly a ppc64 machine with that intrinsic working
(tested that on the community hydra during atomics development). So either
it's a bug specific to some compiler version, or to 32-bit ppc.

I assume there's no trivial way to get a newer compiler on that machine?

 Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Pavel Stehule
2016-12-30 15:34 GMT+01:00 Fabien COELHO :

>
> Hello again,
>
> So any dynamic created object and related operations are not checkable by
>>>
 plpgsql_check (or any tool).

>>>
>>> NO.  Your first sentence does not imply this very general statement.
>>>
>>
>> If you don't have a unique definition, you cannot do it. There is no way
>> to distinguish between a typo and a decision. We are talking about static
>> analysis - so the code should not be executed - and you don't know which
>> function will be started first.
>>
>
> Yes, I assure you that I really know how static analysis works... All the
> properties I described below may be proved without executing the code, that
> was my point...
>
>
> Some things that I think can be statically proved within a session, that
>>> would cover some security concerns:
>>>
>>> (1) For statically named private dynamic variables declared/used at
>>> different points it can be checked without relying on the function order
>>> that all declarations are consistent, i.e. the same type, same default
>>> value if any.
>>>
>>
>> what is "static" private dynamic variable ? You are using new terminology.
>>
>
> They are my session variable, I just spelled out some properties to be
> precise about what I am stating, otherwise it is a mess. The name of the
> variable is "static" (statically-named), i.e. it is known directly in the
> code. However the variable creation and value are "dynamic".
>
> Static variables like Clang static variables are not usable - you would to
>> share content between different functions.
>>
>
> Sure. I mean static as in "static analysis", i.e. by looking at the code
> without executing it, as you put it.
>
> (2) (3) (4) [...]
>>>
>> You are speaking about runtime check.
>>
>
> Not really, I was speaking about properties statically provable, which I
> understood was your concern. Now the properties proved may imply a runtime
> assumption, for instance that the code has executed without error up to
> some point in the program, which is basically impossible to prove
> statically.
>
> BEGIN
>>  RAISE NOTICE '%', @var;
>> END;
>>
>> Without "execution", you cannot say whether the usage of @var is correct or not
>>
>
> It depends about your definition of "correct".
>
> For this very instance it would not matter: if the variable was not
> created beforehand, then an error is raised because it does not exist, if
> it was created before hand, then an error is raised because that is what
> the code is doing... So an error is always raised if the variable is not
> right.
>
> ok, we can use a DECLARE var
>>
>> DECLARE @var EXTERNAL
>>
>
> I do not know what you mean by 'EXTERNAL'.


it means "used as shared in session"


>
>
> BEGIN
>>  RAISE NOTICE '%', @var;
>> END;
>>
>> ok, I can do static check - but
>>
>
> 1. anytime I have to repeat DECLARE statement
>>
>
> Yes, twice in the complete use case: one for the function which checks the
> credentials, one for the getter function.
>
>> 2. it is not possible to find a typo in a DECLARE statement
>>
>
> It is possible to find "some" typos, depending on the code: you can check
> that a variable is both assigned and used somewhere, otherwise it is very
> probably a typo. Perl does that *statically*, before executing a script.


"Before execution" .. I am speaking "without execution".

Theoretically, if you check all possible functions, you can find some
issues - but you have to check every function on the server. Otherwise you
cannot find a typo in a DECLARE statement.

Regards

Pavel


>
>
> --
> Fabien.
>


Re: [HACKERS] Broken atomics code on PPC with FreeBSD 10.3

2016-12-30 Thread Andres Freund


On December 30, 2016 4:48:22 PM GMT+01:00, Tom Lane  wrote:
>Andres Freund  writes:
>> On 2016-12-30 00:44:33 -0500, Tom Lane wrote:
>>> Perhaps it could be argued that there's a FreeBSD compiler bug here,
>>> but what I'm wondering is why configure believes that this:
>>> 
>>> [char lock = 0;
>>> __sync_lock_test_and_set(, 1);
>>> __sync_lock_release();])],
>>> 
>>> is going to draw a hard error on platforms without byte-wide
>atomics.
>>> My sense of C semantics is that the best you could hope for is a
>>> warning
>
>> Well, in theory these aren't actual functions but intrinsics with
>> special behaviour ("by being overloaded so that they work on multiple
>> types."). The compiler is supposed to warn and link to an external
>> function if a certain operation is not available:
>
>Yeah, I read that.  But configure tests don't normally notice warnings.

IIRC those functions are *not* supposed to be provided, i.e. they should result 
in a linker error.

BTW, isn't it odd that there's no 1-byte atomics support in that compiler? 
On an ll/sc arch like ppc it should be (and IIRC is) supported 
efficiently. As you suggest, using a lwarx should just work.


> I'm not very good at reading
PPC assembler, but I think what is happening in the "char" case is that
gcc is trying to emulate a byte-wide operation using a word-wide one,
ie an lwarx/stwcx. loop. Even if that works (and apparently it does not),
we would not want to use it since it implies contention between adjacent
atomic flags.

That's the case as long as they're on a single cache line. Whether it's the 
same word, double/half word shouldn't matter.


Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [HACKERS] Logical Replication WIP

2016-12-30 Thread Erik Rijkers

On 2016-12-30 11:53, Petr Jelinek wrote:

I rebased this for the changes made to inheritance and merged in the



0002-Add-SUBSCRIPTION-catalog-and-DDL-v16.patch.gz (~31 KB)


couple of orthography errors in messages


--- ./src/backend/commands/subscriptioncmds.c.orig	2016-12-30 16:44:31.957298438 +0100
+++ ./src/backend/commands/subscriptioncmds.c	2016-12-30 16:47:00.848418044 +0100
@@ -585,7 +585,7 @@
 
	if (!walrcv_command(wrconn, cmd.data, &err))
 		ereport(ERROR,
-(errmsg("count not drop the replication slot \"%s\" on publisher",
+(errmsg("could not drop the replication slot \"%s\" on publisher",
 		slotname),
  errdetail("The error was: %s", err)));
 	else
@@ -623,7 +623,7 @@
 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 		  errmsg("permission denied to change owner of subscription \"%s\"",
  NameStr(form->subname)),
-			 errhint("The owner of an subscription must be a superuser.")));
+			 errhint("The owner of a subscription must be a superuser.")));
 
 	form->subowner = newOwnerId;
	simple_heap_update(rel, &tup->t_self, tup);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Broken atomics code on PPC with FreeBSD 10.3

2016-12-30 Thread Tom Lane
Andres Freund  writes:
> On 2016-12-30 00:44:33 -0500, Tom Lane wrote:
>> Perhaps it could be argued that there's a FreeBSD compiler bug here,
>> but what I'm wondering is why configure believes that this:
>> 
>> [char lock = 0;
>> __sync_lock_test_and_set(&lock, 1);
>> __sync_lock_release(&lock);])],
>> 
>> is going to draw a hard error on platforms without byte-wide atomics.
>> My sense of C semantics is that the best you could hope for is a
>> warning

> Well, in theory these aren't actual functions but intrinsics with
> special behaviour ("by being overloaded so that they work on multiple
> types."). The compiler is supposed to warn and link to an external
> function if a certain operation is not available:

Yeah, I read that.  But configure tests don't normally notice warnings.
Therefore, even if the compiler is behaving fully per spec (and I think
FreeBSD's might not be), we are going to think that char and word-wide
operations are interchangeable even when one is an efficient, in-line
operation and the other is inefficiently emulated by a function or
kernel trap.  This doesn't really seem good enough, especially for
standard architectures where we know darn well what ought to happen.

There's a reason why we've not converted s_lock.h to rely on compiler
intrinsics everywhere, and this is exactly it: the quality of the
intrinsics implementations are horridly variable.

> So that assumption made in that configure test doesn't seem that
> unreasonable.  What assembler do byte-wide atomics generate on that
> platform?  Which compiler/version is this?

$ gcc -v
Using built-in specs.
Target: powerpc-undermydesk-freebsd
Configured with: FreeBSD/powerpc system compiler
Thread model: posix
gcc version 4.2.1 20070831 patched [FreeBSD]

I tried "gcc -Wall -O0 -g -S bogus.c" on this input:

int
test_tas_char(volatile char *ptr)
{
  return __sync_lock_test_and_set(ptr, 1) == 0;
}

int
test_tas_int(volatile int *ptr)
{
  return __sync_lock_test_and_set(ptr, 1) == 0;
}

and got no warnings and the attached output.  I'm not very good at reading
PPC assembler, but I think what is happening in the "char" case is that
gcc is trying to emulate a byte-wide operation using a word-wide one,
ie an lwarx/stwcx. loop.  Even if that works (and apparently it does not),
we would not want to use it since it implies contention between adjacent
atomic flags.

BTW, I looked around to see why the PPC critters in the buildfarm aren't
failing, and the answer is that they all have compilers that are too old
to have __sync_lock_test_and_set at all, so that the configure probes
just fail outright.  But the probes can't tell the difference between
implementations we want to use and implementations we don't.

regards, tom lane

	.file	"bogus.c"
	.section	.debug_abbrev,"",@progbits
.Ldebug_abbrev0:
	.section	.debug_info,"",@progbits
.Ldebug_info0:
	.section	.debug_line,"",@progbits
.Ldebug_line0:
	.section	".text"
.Ltext0:
	.align 2
	.globl test_tas_char
	.type	test_tas_char, @function
test_tas_char:
.LFB2:
	.file 1 "bogus.c"
	.loc 1 3 0
	stwu 1,-32(1)
.LCFI0:
	stw 31,28(1)
.LCFI1:
	mr 31,1
.LCFI2:
	stw 3,8(31)
	.loc 1 4 0
	lwz 7,8(31)
	stw 7,12(31)
	lwz 8,12(31)
	lbz 8,0(8)
	stb 8,17(31)
.L2:
	lbz 9,17(31)
	stb 9,16(31)
	li 11,1
	lwz 10,12(31)
	rlwinm 9,10,3,27,28
	xori 9,9,24
	lbz 0,16(31)
	rlwinm 10,0,0,24,31
	slw 10,10,9
	rlwinm 0,11,0,0xff
	rlwinm 11,0,0,24,31
	slw 11,11,9
	li 0,255
	slw 0,0,9
	lwz 7,12(31)
	rlwinm 9,7,0,0,29
	sync
.L4:
	lwarx 8,0,9
	and 7,8,0
	cmpw 0,7,10
	bne- 0,.L5
	andc 8,8,0
	or 8,8,11
	stwcx. 8,0,9
	bne- 0,.L4
	isync
.L5:
	mr 0,7
	stb 0,17(31)
	lbz 9,17(31)
	lbz 0,16(31)
	cmpw 7,9,0
	bne 7,.L2
	lbz 0,16(31)
	cmpwi 7,0,0
	mfcr 0
	rlwinm 0,0,31,1
	.loc 1 5 0
	mr 3,0
	lwz 11,0(1)
	lwz 31,-4(11)
	mr 1,11
	blr
.LFE2:
	.size	test_tas_char,.-test_tas_char
	.align 2
	.globl test_tas_int
	.type	test_tas_int, @function
test_tas_int:
.LFB3:
	.loc 1 9 0
	stwu 1,-32(1)
.LCFI3:
	stw 31,28(1)
.LCFI4:
	mr 31,1
.LCFI5:
	stw 3,8(31)
	.loc 1 10 0
	lwz 9,8(31)
	li 10,1
	sync
.L8:
	lwarx 0,0,9
	mr 11,10
	stwcx. 11,0,9
	bne- 0,.L8
	isync
	cmpwi 7,0,0
	mfcr 0
	rlwinm 0,0,31,1
	.loc 1 11 0
	mr 3,0
	lwz 11,0(1)
	lwz 31,-4(11)
	mr 1,11
	blr
.LFE3:
	.size	test_tas_int,.-test_tas_int
	.section	.debug_frame,"",@progbits
.Lframe0:
	.4byte	.LECIE0-.LSCIE0
.LSCIE0:
	.4byte	0x
	.byte	0x1
	.string	""
	.uleb128 0x1
	.sleb128 -4
	.byte	0x6c
	.byte	0xc
	.uleb128 0x1
	.uleb128 0x0
	.align 2
.LECIE0:
.LSFDE0:
	.4byte	.LEFDE0-.LASFDE0
.LASFDE0:
	.4byte	.Lframe0
	.4byte	.LFB2
	.4byte	.LFE2-.LFB2
	.byte	0x4
	.4byte	.LCFI0-.LFB2
	.byte	0xe
	.uleb128 0x20
	.byte	0x4
	.4byte	.LCFI1-.LCFI0
	.byte	0x9f
	.uleb128 0x1
	.byte	0x4
	.4byte	.LCFI2-.LCFI1
	.byte	0xd
	.uleb128 0x1f
	.align 2
.LEFDE0:
.LSFDE2:
	.4byte	.LEFDE2-.LASFDE2
.LASFDE2:
	.4byte	.Lframe0
	.4byte	.LFB3
	.4byte	.LFE3-.LFB3
	.byte	0x4
	.4byte	.LCFI3-.LFB3
	.byte	0xe
	.uleb128 0x20
	.byte	0x4
	.4byte	.LCFI4-.LCFI3
	.byte	0x9f
	.uleb128 0x1
	.byte	0x4
	

Re: [HACKERS] Add doc advice about systemd RemoveIPC

2016-12-30 Thread Craig Ringer
On 30 December 2016 at 16:59, Magnus Hagander  wrote:
> On Wed, Dec 28, 2016 at 4:34 AM, Peter Eisentraut
>  wrote:
>>
>> Here is a patch to add some information about the systemd RemoveIPC
>> issue to the documentation, sort of in the spirit of the OOM discussion
>> nearby.
>
>
> I wonder if I missed part of the discussions around this, so maybe my
> understanding of the cases where this occurs is wrong, but isn't it the case
> of pretty much all (or actually) all the packaged versions of postgresql out
> there (debian, redhat etc) that they do the right thing, as in that they
> create "postgres" as a system user?

Yes.

The postgres docs do tend to ignore the reality of most actual
postgres users, though, and talk as if you installed it from source
code under your own user account. I see people bewildered by this
regularly, since we have no discussion at all of common things like
"sudo -u postgres psql" on default packaged installs. Sure, there are
many platforms, but still.



> I like the text in general, but if the above is true, then I think we should
> put a note at the beginning of it with something along the line (not using
> those words) of "if you have installed postgresql using packages, the
> packager should have taken care of this already"? So as not to scare people
> unnecessarily?

You need to have not only installed it with packages, but be running
it under the package-provided postgres user account. This is not
always the case. I see installs from packages that are then manually
initdb'd in /srv/wtf/why all the time, sadly, and often launched by
manual pg_ctl invocations under surprising user accounts.

"If you have installed postgres from distribution or
postgresql.org-provided packages and use the scripts or commands
provided by the packages to start and stop PostgreSQL, this issue is
unlikely to affect you."

?
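For reference, the usual mitigation when PostgreSQL does run under a
non-system user account is to turn off the logind behaviour itself. A
sketch (the exact config file location may vary by distribution):

```ini
# /etc/systemd/logind.conf
# Stop systemd-logind from removing the user's SysV shared memory and
# semaphores when its last "session" ends, which can otherwise kill a
# running PostgreSQL instance owned by a non-system account.
[Login]
RemoveIPC=no
```

followed by restarting systemd-logind for the change to take effect.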

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2016-12-30 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Fri, Dec 30, 2016 at 8:08 PM, Vladimir Rusinov  wrote:
> > Now, I'm not sure whether it is worth maintaining function aliases. Assuming
> > these are indeed trivial (can somebody point me to example?) I see roughly
> > the same amount of downsides both ways.
> > Having aliases raises additional questions:
> > - do we keep them documented (probably not?)
> > - do we keep them forever or kill in some future version?
> 
> The idea here is to keep documented only the new function names, but
> mention in the release notes that aliases are kept, and that those
> will be dropped in a couple of years (see for example 5d58c07a for
> initdb). This will give plenty of time to monitoring script
> maintainers to adapt to the new world order.

I don't particularly like this.  Undocumented aliases that make things
keep working are bound to just confuse people who are looking at what's
happening (eg: logging queries) and wondering "what the heck is that
function?!" and then if they're unable to find it in the docs then they
might think it's some local function that they can remove or similar.

Worse, those function aliases will probably continue to persist for
forever because no one can agree on removing them.

We maintain back-branches for years to provide people time to adjust
their code and whatever else they need to, no one is going to be forced
to move to 10 for quite a few years yet.

Additionally, people who are actually using these bits of the system are
almost certainly going to have to adjust things for the directory
change, having to change other bits of related code nearby at the same
time is much better than someone changing just the directory now and
then having to remember/realize that they have to change the function
calls in a few years or "whenever the PG people decide to remove the
function aliases."

Let's make this a clean break/change.  Perhaps it will also encourage
people who have their own hacked together scripts to consider using a
well maintained alternative solution.
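For concreteness, the kind of trivial alias being argued against would
presumably be little more than the following (a sketch; it assumes the
renamed PG10 function name and the pg_lsn return type):

```sql
-- Hypothetical compatibility alias, of the sort the above argues
-- against shipping: the old name simply forwards to the new function.
CREATE FUNCTION pg_switch_xlog() RETURNS pg_lsn
    LANGUAGE sql AS 'SELECT pg_switch_wal()';
```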

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Fabien COELHO


Hello again,


So any dynamically created object and related operations are not checkable by

plpgsql_check (or any tool).


NO.  Your first sentence does not imply this very general statement.


If you do not have a unique definition, you cannot do it. There is no way 
to tell the difference between a typo and a decision. We are talking about 
static analysis - so the code should not be executed - and you don't know 
which function will be started first.


Yes, I assure you that I really know how static analysis works... All the 
properties I described below may be proved without executing the code, 
that was my point...




Some things that I think can be statically proved within a session, that
would cover some security concerns:

(1) For statically named private dynamic variables declared/used at
different points it can be checked without relying on the function order
that all declarations are consistent, i.e. the same type, same default
value if any.


What is a "static" private dynamic variable? You are using new terminology.


They are my session variable, I just spelled out some properties to be 
precise about what I am stating, otherwise it is a mess. The name of the 
variable is "static" (statically-named), i.e. it is known directly in the 
code. However the variable creation and value are "dynamic".



Static variables like Clang static variables are not usable - you need to
share content between different functions.


Sure. I mean static as in "static analysis", i.e. by looking at the code 
without executing it, as you put it.



(2) (3) (4) [...]

You are speaking about runtime checks.


Not really, I was speaking about properties statically provable, which I 
understood was your concern. Now the properties proved may imply a runtime 
assumption, for instance that the code has executed without error up to 
some point in the program, which is basically impossible to prove 
statically.



BEGIN
 RAISE NOTICE '%', @var;
END;

Without "execution", you cannot say whether the usage of @var is correct or not


It depends on your definition of "correct".

For this very instance it would not matter: if the variable was not 
created beforehand, then an error is raised because it does not exist; if 
it was created beforehand, then an error is raised because that is what 
the code is doing... So an error is always raised if the variable is not 
right.



ok, we can use a DECLARE var

DECLARE @var EXTERNAL


I do not know what you mean by 'EXTERNAL'.


BEGIN
 RAISE NOTICE '%', @var;
END;

ok, I can do a static check - but



1. I have to repeat the DECLARE statement every time


Yes, twice in the complete use case: one for the function which checks the 
credentials, one for the getter function.



2. it is not possible to find a typo in a DECLARE statement


It is possible to find "some" typos, depending on the code: you can check 
that a variable is both assigned and used somewhere, otherwise it is very 
probably a typo. Perl does that *statically*, before executing a script.


--
Fabien.




Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Pavel Stehule
2016-12-30 12:01 GMT+01:00 Pavel Stehule :

>
>
> 2016-12-30 10:29 GMT+01:00 Craig Ringer :
>
>> On 30 December 2016 at 16:46, Fabien COELHO  wrote:
>> >
>> >> Pavel's personal requirements include that it be well suited for
>> >> static analysis of plpgsql using his plpgsql_check tool. So he wants
>> >> persistent definitions.
>> >
>> >
>> > I've been in static analysis for the last 25 years, and the logic of
>> this
>> > statement fails me.
>>
>> I have no opinion here, as I've not seen plpgsql_check nor do I
>> understand the issues Pavel perceives with having dynamic definitions
>> of variables.
>>
>> All I'm saying is that you two are talking around in circles by
>> repeating different requirements to each other, and it's not going to
>> get anywhere unless you both change your approach. It sounds like
>> you're already trying to do that.
>>
>> > I do not think that a feature should be designed around the current
>> > limitations of a particular external tool, esp. if said tool can be
>> improved
>> > at a reasonable cost.
>>
>> Not arguing there.
>>
>> I was initially inclined to favour Pavel's proposal because it fits a
>> RLS use case I was somewhat interested in. But so would dynamic
>> variables resolved at runtime so long as they were fast.
>>
>> Personally I don't much care what the result is, so long as it can
>> satisfy some kind of reasonable security isolation, such that role A
>> can set it, B can read it but not set it, and role C can do neither.
>> Preferably without resorting to creating SECURITY DEFINER accessors,
>> since they're messy and slow. Support for data typing would also be
>> nice too.
>>
>> If it doesn't deliver security controls then IMO there's not much
>> advantage over (ab)using GUCs with current_setting(...).
>>
>> Exploring the other areas discussed:
>>
>> Personally I think MVCC, persistent variables are a totally
>> unnecessary idea that solves a problem we don't have. But maybe I
>> don't understand your use cases. I expect anything like that would
>> land up using a pg_catalog relation as a k/v-like store with different
>> columns for different types or something of the like, which is really
>> something the user can do well enough for themselves. I don't see the
>> point at all.
>>
>> Non-MVCC persistent variables would probably be prohibitively
>> expensive to make crash-safe, and seem even more pointless.
>>
>> Now, I can see shared variables whose state is visible across backends
>> but is neither MVCC nor persistent being a fun toy, albeit not one I
>> find likely to be particularly useful personally. But we can probably
>> already do that in extensions, we've got most if not all of the needed
>> infrastructure. Because we're a shared-nothing-by-default system, such
>> variables will probably need shared memory segments that need to be
>> allocated and, if new vars are added or their values grow too much,
>> re-allocated. Plus locks to control access. All of which we can
>> already do. Most of the uses I can think of for such things are met
>> reasonably well by advisory locking already, and I expect most of the
>> rest would be met by autonomous commit, so it feels a bit like a
>> feature looking for a use-case.
>>
>> So  lets take a step back or eight and ask "why?"
>>
>>
>> Pavel:
>>
>> * Why is it so necessary for plpgsql variables to be implemented as
>> persistent entities that are in the catalogs in order for you to
>> achieve the static checking you want to? Is this due to limitations of
>> your approach in plpgsql_check, or more fundamental issues? Explain.
>>
>
> There are few reasons:
>
> 1. plpgsql_check cannot to know a order of calls of functions. So any
> dynamic created object and related operations are not checkable by
> plpgsql_check (or any tool). If you create variable in one function, then
> this information is not available in other function.
>
> 2. You can write some hints - like Fabien's proposal - but that does not
> protect against typos. It is much safer to have one "created" variable
> than many "declared" variables while trusting that all declarations are
> valid. The created variable is defined in only one place - you can do a
> simple check of whether a reference to a variable is valid or not. With a
> query on the system catalogue, you can get the list of all variables -
> you can see very simply whether some variables are redundant, obsolete,
> or wrong.
>
> 3. The catalogue objects allow a fine granularity of access rights.
> Without them, you have to introduce only "PRIVATE" variables - and then
> you have to GRANT rights via security definer functions. There is no
> simple answer to the question of who can work with this variable or who
> has access to it ... you have to check the rights on the getter functions.
> When variables are catalogue objects, this answer is simple. Catalogue
> objects allow "declared" security. Without catalogue objects we have to
> construct the security. For me, a 
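As a point of comparison, the existing GUC-based workaround mentioned
upthread looks like this (real current_setting/set_config calls; the
`myapp.*` names are illustrative). It has neither type safety nor access
control, which is exactly the gap being debated:

```sql
-- Session-scoped "variable" via a custom GUC: any role can read or
-- overwrite it, and a misspelled name is just a different setting.
SELECT set_config('myapp.is_auditor', 'true', false);  -- false = session lifetime
SELECT current_setting('myapp.is_auditor');            -- 'true'
SELECT current_setting('myapp.is_auditer', true);      -- typo: silently NULL
                                                       -- (missing_ok form, 9.6+)
```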

Re: [HACKERS] Microvacuum support for Hash Index

2016-12-30 Thread Jesper Pedersen

On 11/11/2016 12:11 AM, Ashutosh Sharma wrote:

Hi Jesper,


Some initial comments.

_hash_vacuum_one_page:

+   END_CRIT_SECTION();
+   _hash_chgbufaccess(rel, metabuf, HASH_READ, HASH_NOLOCK);

The _hash_chgbufaccess() needs a comment.

You also need a place where you call pfree for so->killedItems - maybe in
hashkillitems().


Thanks for reviewing this patch. I would like to let you know that this
patch has a dependency on the patches for concurrent hash index and WAL
logging in hash index. So, until those two patches for hash index are
stable, I won't be able to share the next version of the patch for
supporting microvacuum in hash index.



This can be rebased on the WAL v7 patch [1]. In addition to the previous 
comments you need to take commit 7819ba into account.


[1] 
https://www.postgresql.org/message-id/CAA4eK1%2BdmGNTFMnLO4EbOWJDHUq%3D%2Ba2L8T%3D72ifXeh-Kd8HOsg%40mail.gmail.com


Best regards,
 Jesper





Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Pavel Stehule
2016-12-30 14:45 GMT+01:00 Fabien COELHO :

>
> Please Pavel, could you avoid citing a whole long mail just for commenting
> one point?
>
> * Why is it so necessary for plpgsql variables to be implemented as
>>> persistent entities that are in the catalogs in order for you to
>>> achieve the static checking you want to? Is this due to limitations of
>>> your approach in plpgsql_check, or more fundamental issues? Explain.
>>>
>>
>> There are few reasons:
>>
>> 1. plpgsql_check cannot know the order of calls of functions.
>>
>
> Indeed.
>
So any dynamically created object and related operations are not checkable by
>> plpgsql_check (or any tool).
>>
>
> NO.  Your first sentence does not imply this very general statement.
>

If you do not have a unique definition, you cannot do it. There is no way
to tell the difference between a typo and a decision. We are talking about
static analysis - so the code should not be executed - and you don't know
which function will be started first.


>
> Some things that I think can be statically proved within a session, that
> would cover some security concerns:
>
> (1) For statically named private dynamic variables declared/used at
> different points it can be checked without relying on the function order
> that all declarations are consistent, i.e. the same type, same default
> value if any.
>

What is a "static" private dynamic variable? You are using new terminology.
Static variables like Clang static variables are not usable - you need to
share content between different functions.


>
> (2) Then the value of the variable is either the default value (eg NULL)
> or the assigned value at points of assignement, which must be a valid value
> for the type, otherwise the assignement would have failed.
>
> (3) If there is only one assignment in the code, then you know that the
> variable can only have been assigned a non default value from this point.
> Probably nice to have in a security context, but it requires you to be
> sure that you have access to all the code...
>
> (4) For a session boolean, then for code "IF @var IS NOT NULL AND NOT @var
> THEN RAISE 'cannot access'; END", in a sequence, then one can prove that
> for any pl code after this point in the sequence @var has been previously
> assigned to true, otherwise the exception would have been raised.
>

You are speaking about runtime checks. There is no problem with defining
some rules for runtime checks.

What is not possible in your design:

1.

BEGIN
  RAISE NOTICE '%', @var;
END;

Without "execution", you cannot say whether the usage of @var is correct or not

ok, we can use a DECLARE var

DECLARE @var EXTERNAL
BEGIN
  RAISE NOTICE '%', @var;
END;

ok, I can do a static check - but

1. I have to repeat the DECLARE statement every time
2. it is not possible to find a typo in the DECLARE statement

Regards

Pavel







>
>
> AFAICS, for "static" session variables the only difference is that the
> declaration consistency (1) is slightly more built-in, although you still
> have to check that no function DROP/re-CREATE the session variable with a
> different type, which is quite close to checking (1) for a dynamic session
> variable.
>
> Other properties (2), (3), (4) are exactly the same.
>
> 2. [...] 3.
>>
>
> Please could you put your pros & cons in the wiki as well?
>
> Or don't you want to use it?
>
> --
> Fabien.
>


Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Ashutosh Bapat
>
> Well, flushing the meta-data of pg_twophase is really going to be far
> cheaper than the many pages done until CheckpointTwoPhase is reached.
> There should really be a check on serialized_xacts for the
> non-recovery code path, but considering how cheap that's going to be
> compared to the rest of the restart point stuff it is not worth the
> complexity of adding a counter, for example in shared memory with
> XLogCtl (the counter gets reinitialized at each checkpoint,
> incremented when replaying a 2PC prepare, decremented with a 2PC
> commit). So to reduce the backpatch impact I would just do the fsync
> if (serialized_xact > 0 || RecoveryInProgress()) and call it a day.

Sounds ok to me. I will leave it to the committer to decide further.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 1:30 PM, Ashutosh Bapat
 wrote:
> On Thu, Dec 29, 2016 at 8:18 PM, Tom Lane  wrote:
>> Ashutosh Bapat  writes:
>>> The way this patch has been written, it doesn't allow creating tables
>>> with unknown type columns, which was allowed earlier.
>>
>> Yes, that's an intentional change; creating such tables (or views) has
>> never been anything but a foot-gun.
>>
>> However, I thought the idea was to silently coerce affected columns from
>> unknown to text.  This doesn't look like the behavior we want:
>
> Do you mean to say that when creating a table with a column of unknown
> type, that column type should be silently converted (there's nothing
> to coerce when the table is being created) to text? instead of
> throwing an error?

FWIW that's what I understood: the patch should switch unknown columns
to text. A bunch of side effects when converting types are avoided
this way.
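To make the behaviour under discussion concrete (a sketch; the second
statement shows the proposed resolution, not what historically shipped):

```sql
-- Historically this silently created a column of pseudo-type "unknown":
CREATE TABLE t AS SELECT 'foo' AS c;

-- Under the proposal, the untyped literal would instead be resolved to
-- text, as if one had written:
CREATE TABLE t2 AS SELECT 'foo'::text AS c;
```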
-- 
Michael




Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 5:20 PM, Ashutosh Bapat
 wrote:
> As per the prologue of the function, it doesn't expect any 2PC files
> to be written out in the function i.e. between two checkpoints. Most
> of those are created and deleted between two checkpoints. Same would
> be true for recovery as well. Thus in most of the cases we shouldn't
> need to flush the two phase directory in this function whether during
> normal operation or during the recovery. So, we should avoid flushing
> repeatedly when it's not needed. I agree that serialized_xacts > 0 is
> not the right condition during recovery on standby to flush the two
> phase directory.

This is assuming that 2PC transactions are not long-lived, which is
likely true for anything doing sharding, like XC, XL or Citus (?). So
yes, that's a fair expectation.

> During crash recovery, 2PC files are present on the disk, which means
> the two phase directory has correct record of it. This record can not
> change. So, we shouldn't need to flush it again. If that's true
> serialized_xacts will be 0 during recovery thus serialized_xacts > 0
> condition will still hold.
>
> On a standby however we will have to flush the two phase directory as
> part of checkpoint if there were any files left behind in that
> directory. We need a different condition there.

Well, flushing the meta-data of pg_twophase is really going to be far
cheaper than the many pages done until CheckpointTwoPhase is reached.
There should really be a check on serialized_xacts for the
non-recovery code path, but considering how cheap that's going to be
compared to the rest of the restart point stuff it is not worth the
complexity of adding a counter, for example in shared memory with
XLogCtl (the counter gets reinitialized at each checkpoint,
incremented when replaying a 2PC prepare, decremented with a 2PC
commit). So to reduce the backpatch impact I would just do the fsync
if (serialized_xact > 0 || RecoveryInProgress()) and call it a day.
-- 
Michael




Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Fabien COELHO


Please Pavel, could you avoid citing a whole long mail just for commenting 
one point?



* Why is it so necessary for plpgsql variables to be implemented as
persistent entities that are in the catalogs in order for you to
achieve the static checking you want to? Is this due to limitations of
your approach in plpgsql_check, or more fundamental issues? Explain.


There are few reasons:

1. plpgsql_check cannot know the order of calls of functions.


Indeed.

So any dynamically created object and related operations are not checkable 
by plpgsql_check (or any tool).


NO.  Your first sentence does not imply this very general statement.

Some things that I think can be statically proved within a session, that 
would cover some security concerns:


(1) For statically named private dynamic variables declared/used at 
different points it can be checked without relying on the function order 
that all declarations are consistent, i.e. the same type, same default 
value if any.


(2) Then the value of the variable is either the default value (eg NULL) 
or the assigned value at points of assignement, which must be a valid 
value for the type, otherwise the assignement would have failed.


(3) If there is only one assignment in the code, then you know that the
variable can only have been assigned a non default value from this point.
Probably nice to have in a security context, but it requires you to be 
sure that you have access to all the code...


(4) For a session boolean, then for code "IF @var IS NOT NULL AND NOT @var 
THEN RAISE 'cannot access'; END", in a sequence, then one can prove that 
for any pl code after this point in the sequence @var has been previously 
assigned to true, otherwise the exception would have been raised.



AFAICS, for "static" session variables the only difference is that the 
declaration consistency (1) is slightly more built-in, although you still 
have to check that no function DROP/re-CREATE the session variable with a 
different type, which is quite close to checking (1) for a dynamic session 
variable.


Other properties (2), (3), (4) are exactly the same.


2. [...] 3.


Please could you put your pros & cons in the wiki as well?

Or don't you want to use it?

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics (v19)

2016-12-30 Thread Petr Jelinek
On 12/12/16 22:50, Tomas Vondra wrote:
> On 12/12/2016 12:26 PM, Amit Langote wrote:
>>
>> Hi Tomas,
>>
>> On 2016/10/30 4:23, Tomas Vondra wrote:
>>> Hi,
>>>
>>> Attached is v20 of the multivariate statistics patch series, doing
>>> mostly
>>> the changes outlined in the preceding e-mail from October 11.
>>>
>>> The patch series currently has these parts:
>>>
>>> * 0001 : (FIX) teach pull_varno about RestrictInfo
>>> * 0002 : (PATCH) shared infrastructure and ndistinct coefficients

Hi,

I went over these two (IMHO they could easily be considered a minimal
committable set, even if the user-visible functionality they provide is
rather limited).

> dropping statistics
> ---
> 
> The statistics may be dropped automatically using DROP STATISTICS.
> 
> After ALTER TABLE ... DROP COLUMN, statistics referencing the dropped column are:
> 
>   (a) dropped, if the statistics would reference only one column
> 
>   (b) retained, but modified on the next ANALYZE

This should be documented in user visible form if you plan to keep it
(it does make sense to me).

> +   therefore perfectly correlated. Providing additional information about
> +   correlation between columns is the purpose of multivariate statistics,
> +   and the rest of this section thoroughly explains how the planner
> +   leverages them to improve estimates.
> +  
> +
> +  
> +   For additional details about multivariate statistics, see
> +   src/backend/utils/mvstats/README.stats. There are additional
> +   READMEs for each type of statistics, mentioned in the 
> following
> +   sections.
> +  
> +
> + 

I don't think this qualifies as "thoroughly explains" ;)

> +
> +Oid
> +get_statistics_oid(List *names, bool missing_ok)

No comment?

> + case OBJECT_STATISTICS:
> + msg = gettext_noop("statistics \"%s\" does not exist, 
> skipping");
> + name = NameListToString(objname);
> + break;

This sounds somewhat weird (plural vs singular).

> + * XXX Maybe this should check for duplicate stats. Although it's not clear
> + * what "duplicate" would mean here (wheter to compare only keys or also
> + * options). Moreover, we don't do such checks for indexes, although those
> + * store tuples and recreating a new index may be a way to fix bloat (which
> + * is a problem statistics don't have).
> + */
> +ObjectAddress
> +CreateStatistics(CreateStatsStmt *stmt)

I don't think we should check duplicates TBH, so I would remove the XXX
(also "wheter" is a typo, but if you remove that paragraph it does not matter).

> + if (true)
> + {

Huh?

> +
> +List *
> +RelationGetMVStatList(Relation relation)
> +{
...
> +
> +void
> +update_mv_stats(Oid mvoid, MVNDistinct ndistinct,
> + int2vector *attrs, VacAttrStats **stats)
...
> +static double
> +ndistinct_for_combination(double totalrows, int numrows, HeapTuple *rows,
> +int2vector *attrs, VacAttrStats **stats,
> +int k, int *combination)
> +{


Again, these deserve comment.

I'll try to look at other patches in the series as time permits.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] multivariate statistics (v19)

2016-12-30 Thread Petr Jelinek
On 12/12/16 22:50, Tomas Vondra wrote:
>> +
>> +SELECT pg_mv_stats_dependencies_show(stadeps)
>> +  FROM pg_mv_statistic WHERE staname = 's1';
>> +
>> + pg_mv_stats_dependencies_show
>> +---
>> + (1) => 2, (2) => 1
>> +(1 row)
>> +
>>
>> Couldn't this somehow show actual column names, instead of attribute
>> numbers?
>>
> 
> Yeah, I was thinking about that too. The trouble is that's table-level
> metadata, so we don't have that kind of info serialized within the data
> type (e.g. because it would not handle column renames etc.).
> 
> It might be possible to explicitly pass the table OID as a parameter of
> the function, but it seemed a bit ugly to me.

I think it makes sense to have such a function; this is not the type's output
function, so I think it's OK for it to take the OID as input, especially since
in the use-case shown above you can use starelid easily.

> 
> FWIW, as I wrote in this thread, the place where this patch series needs
> feedback most desperately is integration into the optimizer. Currently
> all the magic happens in clausesel.c and does not leave it. I think it
> would be good to move some of that (particularly the choice of
> statistics to apply) to an earlier stage, and store the information
> within the plan tree itself, so that it's available outside clausesel.c
> (e.g. for EXPLAIN - showing which stats were picked seems useful).
> 
> I was thinking it might work similarly to the foreign key estimation
> patch (100340e2). It might even be more efficient, as the current code
> may end up repeating the selection of statistics multiple times. But
> enriching the plan tree turned out to be way more invasive than I'm
> comfortable with (but maybe that'd be OK).
>

In theory it seems like a possibly reasonable approach to me, mainly
because mv statistics are user-defined objects. I guess we'd have to see
at least some PoC to see how invasive it is. But I ultimately think that
feedback from a committer who is more familiar with the planner is needed here.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Pavel Stehule
2016-12-30 11:03 GMT+01:00 Craig Ringer :

> On 30 December 2016 at 17:29, Craig Ringer  wrote:
>
> > So  lets take a step back or eight and ask "why?"
>
> Oh, and speaking of, I see Pavel's approach as looking for a
> PostgreSQL-adapted way to do something like Oracle's PL/SQL package
> variables. Right Pavel?
>

That was the main motivation - the question was how to share some information
securely between function calls within one session.

PostgreSQL is specific in its multi-language support - but the purpose is the
same.


>
> If so, their properties are, as far as I as a non-Oracle-person can tell:
>
> * Can be package-private or public. If public, can be both got and set
> by anyone. If private, can be got and set directly only by code in
> package. (Our equivalent is "by the owner"). As far as I can tell
> outside access to package-private variables still uses the variable
> get/set syntax, but is automatically proxied via getter/setter methods
> defined in the package, if defined, otherwise inaccessible.
>
> * Value not visible across sessions. Ever.
>
> * Can have an initialiser / DEFAULT value.
>
> * Non-persistent, value lost at session end.
>
> A typical example, where package variables are init'd from a table:
>
> http://www.dba-oracle.com/plsql/t_plsql_global_data.htm
>
> which relies on package initializers, something we don't have (but can
> work around easily enough with a little verbosity).
>
> This shows both public vars and package-private ones.
>
> See also https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/
> constantvar_declaration.htm
>
>
> I believe these package variable properties are the properties Pavel
> seeks to model/emulate. Declared statically, value persistent only
> within the same session, non-transactional, can be private.
>
> Certainly there's nothing here that requires us to allow GRANTs.
> Simple ownership tests would supply us with similar functionality to
> what Oracle users have, allowing for our lack of packages and
> inability to hide the _existence_ of an object, only its contents.
>

Packages have their own scope - so any access from within the package is
allowed. I cannot do that in Postgres without explicitly written setter/getter
functions. So GRANT removes the requirement to write security definer
envelope functions.

Sure - the owner doesn't need it. If your application runs as one user, or if
you are the owner, then you don't need to use GRANT.


>
>
> My views:
>
> I am personally overwhelmingly opposed to variables that automagically
> create themselves when dereferenced, a-la Perl. Write $serialised
> (english spelling) not $serialized (US spelling) and you get a silent
> null. Fun! Hell. No. This is why failure to "use strict" in Perl is a
> near-criminal offense.
>
> I'd also strongly prefer to require vars to be declared before first
> use. Again, like "use strict", and consistent with how Pg behaves
> elsewhere. Otherwise we need some kind of magic syntax to say "this is
> a variable", plus vars that get created on first assignment suck
> almost as badly as ones that're null on undefined dereference. Spend
> half an hour debugging and figure out that you typo'd an assignment.
> Again, "use strict".
>
> I fail to see any real utility to cross-session vars, persistent or
> otherwise, at this point. Use a normal or unlogged relation.
>
> I don't see the point of untyped variables with no ownership or rights
> controls. (ab)use a GUC. Note that you can achieve both xact-scoped
> and session-scoped that way, with xact-scoped vars assigned using SET
> LOCAL being unwound on xact end.
>
> Unless we also propose to add ON CONNECT triggers, I think some kind
> of persistency of declaration is useful but not critical. We'll land
> up with apps sending preambles of declarations on session start
> otherwise. But the most compelling use cases are for things where
> there'll be a procedure invoked by the user or app on connect anyway,
> so it can declare stuff there. I'm utterly unconvinced that it's
> necessary to have them in the catalogs to achieve static checking.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
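For reference, the GUC (ab)use mentioned above needs no new features at all; a sketch (the `myapp.` prefix is an arbitrary custom-parameter namespace):

```sql
-- Session-scoped: survives until the connection ends.
SELECT set_config('myapp.role_level', '42', false);

BEGIN;
-- Transaction-scoped: unwound automatically at COMMIT/ROLLBACK.
SET LOCAL myapp.tx_flag = 'on';
SELECT current_setting('myapp.tx_flag');
COMMIT;

-- Outside the transaction the LOCAL setting is gone,
-- but the session-level one is still readable:
SELECT current_setting('myapp.role_level');  -- '42'
```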


Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Fabien COELHO


Hello Craig,

A long mail with many questions that I tried to answer clearly; the 
result is too long...



[...] I have no opinion here, as I've not seen plpgsql_check nor do I 
understand the issues Pavel perceives with having dynamic definitions of 
variables.


I understand that Pavel assumes that a static analysis tool cannot take 
into account a dynamic variable, hence his reserve.




All I'm saying is that you two are talking around in circles by
repeating different requirements to each other, and it's not going to
get anywhere unless you both change your approach. It sounds like
you're already trying to do that.


Yep, that is why I have created the wiki page, so that at least an argument 
need not be repeated cyclically: it should be written down once.



[...] I was initially inclined to favour Pavel's proposal because it 
fits a RLS use case I was somewhat interested in. But so would dynamic 
variables resolved at runtime so long as they were fast.


Fitting the need of use cases is the key point, obviously.

[...] Preferably without resorting to creating SECURITY DEFINER 
accessors, since they're messy and slow.


I'm not sure what you mean by "messy", but if you can make this objective 
it can be an argument. Please feel free to explain it on the wiki.


As for "slow", I have just tested the overheads with pgbench, comparing a 
direct arithmetic operation (as a proxy for a fast session variable 
consultation) to constant-returning plpgsql functions with security 
definer and security invoker, on a direct socket connection, with prepared 
statements:


  select 1 + 0: 0.020 ms
  select one_sd() : 0.024 ms
  select one_si() : 0.024 ms

I do not think that there is a significant "slowness" issue with using 
function calls.
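For context, the benchmarked functions can plausibly be reconstructed as below (an assumption on my part; their bodies were not posted), each SELECT then being timed with something like `pgbench -n -M prepared -f script.sql`:

```sql
-- Constant-returning functions, differing only in the SECURITY clause.
CREATE FUNCTION one_sd() RETURNS int LANGUAGE plpgsql SECURITY DEFINER
AS $$ BEGIN RETURN 1; END $$;

CREATE FUNCTION one_si() RETURNS int LANGUAGE plpgsql SECURITY INVOKER
AS $$ BEGIN RETURN 1; END $$;
```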




Exploring the other areas discussed:

Personally I think MVCC, persistent variables are a totally unnecessary 
[...] But maybe I don't understand your use cases.


I've done a survey of the schemas of projects based on databases, MySQL 
or PostgreSQL. A significant number of them use a common pattern based on a 
one-row table, essentially to hold some scalar information about the 
application version and to facilitate upgrades.


However the one-row property is just hoped for, and in principle a 
database is about declaring constraints that are enforced afterwards.


I see two clean solutions to this use case: declaring tables as one row, 
or having scalar objects.




Now, I can see shared variables whose state is visible across backends
but is neither MVCC nor persistent being a fun toy, albeit not one I
find likely to be particularly useful personally.


Yep, I'm skeptical as well. I would like to see a convincing use case.



Pavel:

* Why is it so necessary for plpgsql variables to be implemented as
persistent entities that are in the catalogs in order for you to
achieve the static checking you want to? Is this due to limitations of
your approach in plpgsql_check, or more fundamental issues? Explain.


A note about this question, although it is not addressed to me: currently 
"plpgsql_check" cannot analyze any session variables, as no such concept 
exists, whether with or without persistent declarations.




Fabien:

* What use do you have for persistent-data variables? Please set out
some use cases where they solve problems that are currently hard to to
solve or greatly improve on the existing solutions.


It is about the often-seen one-row pattern, which I think should be 
enforceable either with some singleton table declaration or with scalar objects.




* On what basis do you _oppose_ persistently defining variables in the
catalogs as their own entities?


I understand that you are speaking of "persistent session variables".

For me a database is about persistence (metadata & data) with safety 
(transactions) and security (permissions)... and maybe performance:-)


Pavel's proposal creates a new object with 2 (secure metadata-persistence) 
out of these 4 properties... I'm not at ease with introducing a new 
half-database concept into a database. I could accept it for a convincing 
use case that absolutely requires it for deep reasons.


On the other hand there are dynamic session variables (mysql, mssql and 
oracle have some variants) which are useful on their own without 
pretending to be database objects (no CREATE/ALTER/DROP, GRANT/REVOKE). If 
these can handle the special case and a new special half-database concept 
can be avoided, I would prefer that.


The key point about all that is to discuss, understand and evaluate the 
involved use cases.



(My own objection is that "temporary variables" would make our existing 
catalog bloat issues for temp objects even worse).


I do agree that inefficient temporary variables are worthless, but ISTM 
that Pavel's proposal is not exactly about temporary variables, it is 
about temporary-valued permanent variables. So there is no temporary 
(on-the-fly) variable as such, and if it is extended for this purpose then 
indeed the catalog costs look expensive.



* Do 

Re: [HACKERS] pg_dump versus rules, once again

2016-12-30 Thread Benedikt Grundmann
On 30 December 2016 at 11:58, Benedikt Grundmann 
wrote:

>
> On 17 November 2016 at 03:45, Robert Haas  wrote:
>
>> On Wed, Nov 16, 2016 at 10:14 PM, Tom Lane  wrote:
>> > Robert Haas  writes:
>> >> On Wed, Nov 16, 2016 at 10:00 PM, Tom Lane  wrote:
>> >>> The changes in pg_backup_archiver.c would have to be back-patched
>> >>> into all versions supporting --if-exists, so that they don't fail
>> >>> on dump archives produced by patched versions.
>> >
>> >> Even if you patch future minor releases, past minor releases are still
>> >> going to exist out there in the wild for a long, long time.
>> >
>> > Yeah, but it would only matter if you try to use pg_restore --clean
>> --if-exists
>> > with an archive file that happens to contain a view that has this issue.
>> > Such cases would previously have failed anyway, because of precisely
>> > the bug at issue ... and there aren't very many of them, or we'd have
>> > noticed the problem before.  So I don't feel *too* bad about this,
>> > I just want to make sure we have a solution available.
>>
>> Right, OK.
>>
>
> For what it is worth we just run into this problem on our postgres 9.2.17
> installation on a hunch we (after reading Tom's initial email replaced the
> view that caused this by this
>
> create view ... as select * from (...original view definition...)
> hack_around_pg_dump_versus_rules_bug;
>
> Which caused pg_dump to change its behavior and instead emit create view
>  which is what we wanted (because we take filtered down and dependency
> ordered outputs of pg_dump as the starting point for new patches to the
> db).  But it surprised me mildly that the hack "worked" so I thought I
> would mention it here.   It might just mean that I'm misunderstanding the
> bug but if there was really a dependency in the original that dependency
> still exists now.
>
>
N/m, it turns out that using pg_dump -t  isn't a good way to test whether
the hack works, because then it always does the good thing.


>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>>
>>
>
>


Re: [HACKERS] pg_dump versus rules, once again

2016-12-30 Thread Benedikt Grundmann
On 17 November 2016 at 03:45, Robert Haas  wrote:

> On Wed, Nov 16, 2016 at 10:14 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Wed, Nov 16, 2016 at 10:00 PM, Tom Lane  wrote:
> >>> The changes in pg_backup_archiver.c would have to be back-patched
> >>> into all versions supporting --if-exists, so that they don't fail
> >>> on dump archives produced by patched versions.
> >
> >> Even if you patch future minor releases, past minor releases are still
> >> going to exist out there in the wild for a long, long time.
> >
> > Yeah, but it would only matter if you try to use pg_restore --clean
> --if-exists
> > with an archive file that happens to contain a view that has this issue.
> > Such cases would previously have failed anyway, because of precisely
> > the bug at issue ... and there aren't very many of them, or we'd have
> > noticed the problem before.  So I don't feel *too* bad about this,
> > I just want to make sure we have a solution available.
>
> Right, OK.
>

For what it is worth, we just ran into this problem on our postgres 9.2.17
installation. On a hunch (after reading Tom's initial email) we replaced the
view that caused this by this

create view ... as select * from (...original view definition...)
hack_around_pg_dump_versus_rules_bug;

which caused pg_dump to change its behavior and instead emit create view
 which is what we wanted (because we take filtered-down and dependency-
ordered outputs of pg_dump as the starting point for new patches to the
db).  But it surprised me mildly that the hack "worked", so I thought I
would mention it here.  It might just mean that I'm misunderstanding the
bug, but if there was really a dependency in the original, that dependency
still exists now.


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 8:08 PM, Vladimir Rusinov  wrote:
> Now, I'm not sure whether it is worth maintaining function aliases. Assuming
> these are indeed trivial (can somebody point me to example?) I see roughly
> the same amount of downsides both ways.
> Having aliases raises additional questions:
> - do we keep them documented (probably not?)
> - do we keep them forever or kill in some future version?

The idea here is to keep only the new function names documented, but
mention in the release notes that aliases are kept, and that those
will be dropped in a couple of years (see for example 5d58c07a for
initdb). This will give monitoring script maintainers plenty of time
to adapt to the new world order.
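Such an alias can indeed be trivial - a sketch, assuming the renamed pg_switch_wal() exists with the old function's signature:

```sql
-- Deprecated alias kept for backward compatibility; mentioned only in the
-- release notes, to be removed after a deprecation period.
CREATE FUNCTION pg_switch_xlog() RETURNS pg_lsn
LANGUAGE sql AS 'SELECT pg_switch_wal()';
```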
-- 
Michael




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2016-12-30 Thread Michael Paquier
On Fri, Dec 30, 2016 at 2:59 AM, Stephen Frost  wrote:
> All backwards incompatible changes are judgement calls and people are
> certainly welcome to have different opinions.  I have a pretty strong
> feeling about this particular change being worthwhile and also pretty
> long overdue.

The renaming is definitely worth it, to be consistent with the new
pg_wal/. Now, as in the case of those functions, we have a simple way
to avoid useless pain for users, and we should really take it. So
introducing a set of aliases and deprecating them after a couple of
years is a fine plan IMO.
-- 
Michael




Re: [HACKERS] Broken atomics code on PPC with FreeBSD 10.3

2016-12-30 Thread Andres Freund
Hi,

On 2016-12-30 00:44:33 -0500, Tom Lane wrote:
> *** /usr/home/tgl/pgsql/src/test/regress/expected/lock.out  Thu Dec 29 
> 19:37:50 2016
> --- /usr/home/tgl/pgsql/src/test/regress/results/lock.out   Fri Dec 30 
> 00:31:01 2016
> ***
> *** 63,70 
>   -- atomic ops tests
>   RESET search_path;
>   SELECT test_atomic_ops();
> !  test_atomic_ops
> ! -
> !  t
> ! (1 row)
> !
> --- 63,66 
>   -- atomic ops tests
>   RESET search_path;
>   SELECT test_atomic_ops();
> ! ERROR:  flag: set spuriously #2
>
> ==

Hm, not good.


> After some digging I found out that the atomics code appears to believe
> that the PPC platform has byte-wide atomics, which of course is nonsense.
> That causes it to define pg_atomic_flag with the "char" variant, and
> what we end up with is word-wide operations (lwarx and friends) operating
> on a byte-wide struct.  Failure is not exactly astonishing; what's
> surprising is that we don't get stack-clobber core dumps or such.
> Undef'ing HAVE_GCC__SYNC_CHAR_TAS makes it work.
>
> Perhaps it could be argued that there's a FreeBSD compiler bug here,
> but what I'm wondering is why configure believes that this:
>
>   [char lock = 0;
>__sync_lock_test_and_set(&lock, 1);
>__sync_lock_release(&lock);])],
>
> is going to draw a hard error on platforms without byte-wide atomics.
> My sense of C semantics is that the best you could hope for is a
> warning

Well, in theory these aren't actual functions but intrinsics with
special behaviour ("by being overloaded so that they work on multiple
types"). The compiler is supposed to warn and emit a call to an external
function if a certain operation is not available:

Not all operations are supported by all target processors. If a
particular operation cannot be implemented on the target processor, a
warning is generated and a call to an external function is
generated. The external function carries the same name as the built-in
version, with an additional suffix ‘_n’ where n is the size of the data
type.

So that assumption made in that configure test doesn't seem that
unreasonable.  What assembler do byte-wide atomics generate on that
platform?  Which compiler/version is this?

Regards,

Andres




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2016-12-30 Thread Vladimir Rusinov
On Thu, Dec 29, 2016 at 5:59 PM, Stephen Frost  wrote:

> I have a pretty strong
> feeling about this particular change being worthwhile and also pretty
> long overdue.
>

Yeah, sorry for that. I should be able to make some progress early January.

-- 
Vladimir Rusinov
Storage SRE, Google Ireland

Google Ireland Ltd.,Gordon House, Barrow Street, Dublin 4, Ireland
Registered in Dublin, Ireland
Registration Number: 368047




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2016-12-30 Thread Vladimir Rusinov
On Thu, Dec 29, 2016 at 5:48 PM, Cynthia Shang <
cynthia.sh...@crunchydata.com> wrote:

> I have never heard of coding standards where naming conventions required a
> function/variable name match a directory or file name. It seems that would
> be restrictive.
>

This is not about a coding standard, this is about terminology and common
sense.


>
> I'm not trying to pick a fight, I just think the pros should outweigh the
> cons when choosing a path forward. In this case I see lots of cons and one
> partial pro; partial because renaming only user facing functions and
> documentation will create inconsistency within the Postgres code for all of
> the following. It sounds as if your minds are already made up, which
> saddens me but I will say nothing further on the matter.
>

Ultimately, from a user/DBA perspective the code does not matter. Consistency
does. I found out firsthand that the fact that pg_switch_xlog() does something
in the pg_wal directory is rather confusing.

The way I see it, the decision has been made to rename 'pg_xlog' to 'pg_wal'.
Since this directory name is effectively part of the API (e.g. for setups with
the xlog on a different disk, backup scripts, etc.), the rest of the API either
needs to follow, or this decision needs to be revised (but that ship has
probably sailed).
I think a small inconvenience for a small group of people (PostgreSQL
developers) is worth the improvement for a much larger group of people
(PostgreSQL users).

Now, I'm not sure whether it is worth maintaining function aliases.
Assuming these are indeed trivial (can somebody point me to an example?), I
see roughly the same amount of downsides both ways.
Having aliases raises additional questions:
- do we keep them documented (probably not?)
- do we keep them forever, or kill them in some future version?

-- 
Vladimir Rusinov
Storage SRE, Google Ireland

Google Ireland Ltd.,Gordon House, Barrow Street, Dublin 4, Ireland
Registered in Dublin, Ireland
Registration Number: 368047




Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Pavel Stehule
2016-12-30 10:29 GMT+01:00 Craig Ringer :

> On 30 December 2016 at 16:46, Fabien COELHO  wrote:
> >
> >> Pavel's personal requirements include that it be well suited for
> >> static analysis of plpgsql using his plpgsql_check tool. So he wants
> >> persistent definitions.
> >
> >
> > I've been in static analysis for the last 25 years, and the logic of this
> > statement fails me.
>
> I have no opinion here, as I've not seen plpgsql_check nor do I
> understand the issues Pavel perceives with having dynamic definitions
> of variables.
>
> All I'm saying is that you two are talking around in circles by
> repeating different requirements to each other, and it's not going to
> get anywhere unless you both change your approach. It sounds like
> you're already trying to do that.
>
> > I do not think that a feature should be designed around the current
> > limitations of a particular external tool, esp. if said tool can be
> improved
> > at a reasonable cost.
>
> Not arguing there.
>
> I was initially inclined to favour Pavel's proposal because it fits a
> RLS use case I was somewhat interested in. But so would dynamic
> variables resolved at runtime so long as they were fast.
>
> Personally I don't much care what the result is, so long as it can
> satisfy some kind of reasonable security isolation, such that role A
> can set it, B can read it but not set it, and role C can do neither.
> Preferably without resorting to creating SECURITY DEFINER accessors,
> since they're messy and slow. Support for data typing would also be
> nice too.
>
> If it doesn't deliver security controls then IMO there's not much
> advantage over (ab)using GUCs with current_setting(...).
>
> Exploring the other areas discussed:
>
> Personally I think MVCC, persistent variables are a totally
> unnecessary idea that solves a problem we don't have. But maybe I
> don't understand your use cases. I expect anything like that would
> land up using a pg_catalog relation as a k/v-like store with different
> columns for different types or something of the like, which is really
> something the user can do well enough for themselves. I don't see the
> point at all.
>
> Non-MVCC persistent variables would probably be prohibitively
> expensive to make crash-safe, and seem even more pointless.
>
> Now, I can see shared variables whose state is visible across backends
> but is neither MVCC nor persistent being a fun toy, albeit not one I
> find likely to be particularly useful personally. But we can probably
> already do that in extensions, we've got most if not all of the needed
> infrastructure. Because we're a shared-nothing-by-default system, such
> variables will probably need shared memory segments that need to be
> allocated and, if new vars are added or their values grow too much,
> re-allocated. Plus locks to control access. All of which we can
> already do. Most of the uses I can think of for such things are met
> reasonably well by advisory locking already, and I expect most of the
> rest would be met by autonomous commit, so it feels a bit like a
> feature looking for a use-case.
>
> So  lets take a step back or eight and ask "why?"
>
>
> Pavel:
>
> * Why is it so necessary for plpgsql variables to be implemented as
> persistent entities that are in the catalogs in order for you to
> achieve the static checking you want to? Is this due to limitations of
> your approach in plpgsql_check, or more fundamental issues? Explain.
>

There are a few reasons:

1. plpgsql_check cannot know the order in which functions are called. So any
dynamically created object, and the operations related to it, are not
checkable by plpgsql_check (or any other tool). If you create a variable in
one function, that information is not available in another function.

2. You can write some hints - like Fabien's proposal - so that it is not
vulnerable to typos. But it is much safer to have one "created" variable than
several "declared" variables and to trust that all the declarations are
valid. The created variable lives in only one place - you can do a simple
check of whether a reference to a variable is valid or not. With a query on
the system catalogue you can get the list of all variables - so you can see
very simply whether some variables are redundant, obsolete or wrong.

3. Catalogue objects allow a fine granularity of access rights. Without
them, you can only introduce "PRIVATE" variables - and then you have to GRANT
rights via security definer functions. There is no simple answer to the
question of who can work with a variable or who has access to it ... you have
to check the rights on the getter functions. When variables are catalogue
objects, the answer is simple. Catalogue objects allow "declared" security;
without catalogue objects we have to construct the security. For me,
declared security is stronger.
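For illustration, with variables as catalogue objects the audit described in point 2 becomes a plain query; the pg_variable catalog and its column names here are hypothetical:

```sql
-- Hypothetical catalog: list all declared session variables with their
-- types and owners, to spot redundant or obsolete ones.
SELECT varname, vartype::regtype, varowner::regrole
  FROM pg_variable
 ORDER BY varname;
```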

Regards

Pavel


>
> Fabien:
>
> * What use do you have for persistent-data variables? Please set out
> some use cases where they solve problems that are currently hard 

Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Craig Ringer
On 30 December 2016 at 17:29, Craig Ringer  wrote:

> So  lets take a step back or eight and ask "why?"

Oh, and speaking of, I see Pavel's approach as looking for a
PostgreSQL-adapted way to do something like Oracle's PL/SQL package
variables. Right Pavel?

If so, their properties are, as far as I as a non-Oracle-person can tell:

* Can be package-private or public. If public, can be both got and set
by anyone. If private, can be got and set directly only by code in
package. (Our equivalent is "by the owner"). As far as I can tell
outside access to package-private variables still uses the variable
get/set syntax, but is automatically proxied via getter/setter methods
defined in the package, if defined, otherwise inaccessible.

* Value not visible across sessions. Ever.

* Can have an initialiser / DEFAULT value.

* Non-persistent, value lost at session end.

A typical example, where package variables are init'd from a table:

http://www.dba-oracle.com/plsql/t_plsql_global_data.htm

which relies on package initializers, something we don't have (but can
work around easily enough with a little verbosity).
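A hedged sketch of that workaround (all names — `pkg.counter`,
`pkg_get_counter`, `pkg_set_counter` — are illustrative, not from any
existing schema): a pair of plpgsql accessors over a custom GUC can stand in
for a package variable with a DEFAULT value, session-scoped and lost at
session end.

```sql
-- Sketch: emulating an Oracle package variable with a custom GUC.
CREATE OR REPLACE FUNCTION pkg_get_counter() RETURNS int AS $$
DECLARE
  v text;
BEGIN
  -- current_setting(..., true) returns NULL instead of erroring
  -- when the setting has never been set in this session (9.6+).
  v := current_setting('pkg.counter', true);
  IF v IS NULL OR v = '' THEN
    RETURN 0;  -- the "initialiser" / DEFAULT value
  END IF;
  RETURN v::int;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION pkg_set_counter(val int) RETURNS void AS $$
BEGIN
  -- set_config(..., false): session scope, value lost at session end.
  PERFORM set_config('pkg.counter', val::text, false);
END;
$$ LANGUAGE plpgsql;
```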

This shows both public vars and package-private ones.

See also 
https://docs.oracle.com/cd/B19306_01/appdev.102/b14261/constantvar_declaration.htm


I believe these package variable properties are the properties Pavel
seeks to model/emulate. Declared statically, value persistent only
within the same session, non-transactional, can be private.

Certainly there's nothing here that requires us to allow GRANTs.
Simple ownership tests would supply us with similar functionality to
what Oracle users have, allowing for our lack of packages and
inability to hide the _existence_ of an object, only its contents.


My views:

I am personally overwhelmingly opposed to variables that automagically
create themselves when dereferenced, a-la Perl. Write $serialised
(English spelling) not $serialized (US spelling) and you get a silent
null. Fun! Hell. No. This is why failure to "use strict" in Perl is a
near-criminal offense.

I'd also strongly prefer to require vars to be declared before first
use. Again, like "use strict", and consistent with how Pg behaves
elsewhere. Otherwise we need some kind of magic syntax to say "this is
a variable", plus vars that get created on first assignment suck
almost as badly as ones that're null on undefined dereference. Spend
half an hour debugging and figure out that you typo'd an assignment.
Again, "use strict".

I fail to see any real utility to cross-session vars, persistent or
otherwise, at this point. Use a normal or unlogged relation.

I don't see the point of untyped variables with no ownership or rights
controls. (ab)use a GUC. Note that you can achieve both xact-scoped
and session-scoped that way, with xact-scoped vars assigned using SET
LOCAL being unwound on xact end.
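For reference, both scopes of the GUC-based approach work out of the box
today (the setting name `myapp.current_user_id` is just an example):

```sql
-- Session-scoped "variable" via a custom GUC:
SET myapp.current_user_id = '42';
SELECT current_setting('myapp.current_user_id');   -- '42'

-- Transaction-scoped: SET LOCAL is unwound at COMMIT or ROLLBACK.
BEGIN;
SET LOCAL myapp.current_user_id = '99';
SELECT current_setting('myapp.current_user_id');   -- '99'
ROLLBACK;
SELECT current_setting('myapp.current_user_id');   -- back to '42'
```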

Unless we also propose to add ON CONNECT triggers, I think some kind
of persistency of declaration is useful but not critical. We'll land
up with apps sending preambles of declarations on session start
otherwise. But the most compelling use cases are for things where
there'll be a procedure invoked by the user or app on connect anyway,
so it can declare stuff there. I'm utterly unconvinced that it's
necessary to have them in the catalogs to achieve static checking.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Craig Ringer
On 30 December 2016 at 16:46, Fabien COELHO  wrote:
>
>> Pavel's personal requirements include that it be well suited for
>> static analysis of plpgsql using his plpgsql_check tool. So he wants
>> persistent definitions.
>
>
> I've been in static analysis for the last 25 years, and the logic of this
> statement fails me.

I have no opinion here, as I've not seen plpgsql_check nor do I
understand the issues Pavel perceives with having dynamic definitions
of variables.

All I'm saying is that you two are talking around in circles by
repeating different requirements to each other, and it's not going to
get anywhere unless you both change your approach. It sounds like
you're already trying to do that.

> I do not think that a feature should be designed around the current
> limitations of a particular external tool, esp. if said tool can be improved
> at a reasonable cost.

Not arguing there.

I was initially inclined to favour Pavel's proposal because it fits a
RLS use case I was somewhat interested in. But so would dynamic
variables resolved at runtime so long as they were fast.

Personally I don't much care what the result is, so long as it can
satisfy some kind of reasonable security isolation, such that role A
can set it, B can read it but not set it, and role C can do neither.
Preferably without resorting to creating SECURITY DEFINER accessors,
since they're messy and slow. Support for data typing would also be
nice too.

If it doesn't deliver security controls then IMO there's not much
advantage over (ab)using GUCs with current_setting(...).
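For the record, the SECURITY DEFINER accessor workaround mentioned above
looks roughly like this — a sketch only, with illustrative names, backed by
a table because a raw GUC could be SET directly by anyone:

```sql
-- Backing storage: a table only its owner can touch directly.
CREATE TABLE app_var (value text);
INSERT INTO app_var VALUES (NULL);
REVOKE ALL ON app_var FROM PUBLIC;

-- role_a may set; role_b may only read; role_c can do neither.
-- (In production you would also SET search_path on these functions.)
CREATE FUNCTION set_app_var(val text) RETURNS void
LANGUAGE sql SECURITY DEFINER AS $$
  UPDATE app_var SET value = val;
$$;
REVOKE ALL ON FUNCTION set_app_var(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION set_app_var(text) TO role_a;

CREATE FUNCTION get_app_var() RETURNS text
LANGUAGE sql SECURITY DEFINER AS $$
  SELECT value FROM app_var;
$$;
REVOKE ALL ON FUNCTION get_app_var() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION get_app_var() TO role_a, role_b;
```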

Exploring the other areas discussed:

Personally I think MVCC, persistent variables are a totally
unnecessary idea that solves a problem we don't have. But maybe I
don't understand your use cases. I expect anything like that would
land up using a pg_catalog relation as a k/v-like store with different
columns for different types or something of the like, which is really
something the user can do well enough for themselves. I don't see the
point at all.

Non-MVCC persistent variables would probably be prohibitively
expensive to make crash-safe, and seem even more pointless.

Now, I can see shared variables whose state is visible across backends
but is neither MVCC nor persistent being a fun toy, albeit not one I
find likely to be particularly useful personally. But we can probably
already do that in extensions, we've got most if not all of the needed
infrastructure. Because we're a shared-nothing-by-default system, such
variables will probably need shared memory segments that need to be
allocated and, if new vars are added or their values grow too much,
re-allocated. Plus locks to control access. All of which we can
already do. Most of the uses I can think of for such things are met
reasonably well by advisory locking already, and I expect most of the
rest would be met by autonomous commit, so it feels a bit like a
feature looking for a use-case.

So let's take a step back or eight and ask "why?"


Pavel:

* Why is it so necessary for plpgsql variables to be implemented as
persistent entities that are in the catalogs in order for you to
achieve the static checking you want to? Is this due to limitations of
your approach in plpgsql_check, or more fundamental issues? Explain.


Fabien:

* What use do you have for persistent-data variables? Please set out
some use cases where they solve problems that are currently hard to
solve or greatly improve on the existing solutions.

* On what basis do you _oppose_ persistently defining variables in the
catalogs as their own entities? (My own objection is that "temporary
variables" would make our existing catalog bloat issues for temp
objects even worse).

* Do you expect temporary/transient session variables to come into
existence when first set, or to require some form of explicit
definition?


Everyone:

* Does anyone care about or want variables whose value is shared
across sessions? If so, why? Set out use cases.

* Does anyone care about or want variables whose value becomes visible
as soon as set, i.e. non-MVCC? If so, why? Set out use cases.

* Does anyone care about or want variables whose value is persistent
on-disk across restarts and/or crashes, maybe recorded in WAL for
replication, etc? If so, justify how this is better than a relation in
real-world practical terms.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add doc advice about systemd RemoveIPC

2016-12-30 Thread Magnus Hagander
On Wed, Dec 28, 2016 at 4:34 AM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> Here is a patch to add some information about the systemd RemoveIPC
> issue to the documentation, sort of in the spirit of the OOM discussion
> nearby.
>

I wonder if I missed part of the discussions around this, so maybe my
understanding of the cases where this occurs is wrong, but isn't it the
case that pretty much all (or actually all) of the packaged versions of
PostgreSQL out there (Debian, Red Hat, etc.) do the right thing, in that
they create "postgres" as a system user?

I like the text in general, but if the above is true, then I think we
should put a note at the beginning of it, along the lines (not using those
words) of "if you have installed PostgreSQL using packages, the packager
should have taken care of this already"? So as not to scare people
unnecessarily?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Fabien COELHO



Pavel's personal requirements include that it be well suited for
static analysis of plpgsql using his plpgsql_check tool. So he wants
persistent definitions.


I've been in static analysis for the last 25 years, and the logic of this 
statement fails me.


Pavel wants that the plpgsql_check tool can statically analyse session 
variables, which is fine with me.


It does not follow that the definition must be "persistent" as such; it
follows that it should be available in the PL/pgSQL script so that a tool
which reads it can check it by looking at the code. IMO this is a lesser
requirement.


I do not think that a feature should be designed around the current 
limitations of a particular external tool, esp. if said tool can be 
improved at a reasonable cost.



I think progress can only be achieved by setting out requirements, a
feature matrix, and which proposed implementations can deliver which
desired features. Then go from there.


Yep. I've bootstrapped a wiki page; feel free to improve it as you see fit:

https://wiki.postgresql.org/wiki/Variable_Design

--
Fabien.




Re: [HACKERS] Potential data loss of 2PC files

2016-12-30 Thread Ashutosh Bapat
On Fri, Dec 30, 2016 at 11:22 AM, Michael Paquier
 wrote:
> On Thu, Dec 29, 2016 at 6:41 PM, Ashutosh Bapat
>  wrote:
>> I agree with this.
>> If no prepared transactions were required to be fsynced
>> CheckPointTwoPhase(), do we want to still fsync the directory?
>> Probably not.
>>
>> May be you want to call fsync_fname(TWOPHASE_DIR, true); if
>> serialized_xacts > 0.
>
> Definitely true for the non-recovery code path. But not for restart
> points as there is no GXACT entry created by the redo routine of 2PC
> prepare. We could have a static counter tracking how many 2PC files
> have been flushed since last restart point or not but I am not
> convinced if that's worth the facility.

As per the prologue of the function, it doesn't expect any 2PC files
to be written out in the function i.e. between two checkpoints. Most
of those are created and deleted between two checkpoints. Same would
be true for recovery as well. Thus, in most cases we shouldn't need to
flush the two-phase directory in this function, whether during normal
operation or during recovery. So we should avoid flushing repeatedly when
it's not needed. I agree that serialized_xacts > 0 is
not the right condition during recovery on standby to flush the two
phase directory.

During crash recovery, 2PC files are present on the disk, which means
the two phase directory has correct record of it. This record can not
change. So, we shouldn't need to flush it again. If that's true
serialized_xacts will be 0 during recovery thus serialized_xacts > 0
condition will still hold.

On a standby however we will have to flush the two phase directory as
part of checkpoint if there were any files left behind in that
directory. We need a different condition there.

>
> That's not true for recovery. So I could go for something like that:
> "If any 2PC files have been flushed, do the same for the parent
> directory to make this information durable on disk. On recovery, issue
> the fsync() anyway, individual 2PC files have already been flushed when
> replaying their respective XLOG_XACT_PREPARE record.
>

We need to specify recovery (log replay) on standby specifically, I guess.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] proposal: session server side variables

2016-12-30 Thread Craig Ringer
On 30 December 2016 at 14:50, Fabien COELHO  wrote:
>
>>> I know this one. It can be empty, which a singleton cannot be. For a
>>> singleton table, you should have one and only one row, you cannot insert or
>>> delete, so this is only part of the real thing.
>>
>>
>> Surely we can do a bit better than that, if that's what you really want.
>> Create the table with an initial row and add a trigger preventing anything
>> except update.
>
>
> Yes.
>
> I just meant that this is not available easily "out of the box", but it is
> obviously doable with some effort... which people would seldom do.
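For reference, the "some effort" amounts to something like the following
(table, function, and trigger names are illustrative), combining the
unique-expression-index hack with a guard trigger so that only UPDATE is
possible:

```sql
-- A one-row "variable" table.
CREATE TABLE app_state (current_batch int NOT NULL);
INSERT INTO app_state VALUES (0);

-- The unique expression index hack: at most one row can ever exist.
CREATE UNIQUE INDEX onerow ON app_state ((1));

-- Guard trigger: forbid INSERT, DELETE, and TRUNCATE entirely,
-- so the table can never become empty either.
CREATE FUNCTION app_state_guard() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  RAISE EXCEPTION 'app_state must always contain exactly one row';
END;
$$;

CREATE TRIGGER app_state_update_only
  BEFORE INSERT OR DELETE OR TRUNCATE ON app_state
  FOR EACH STATEMENT EXECUTE PROCEDURE app_state_guard();

-- Only UPDATE remains possible:
UPDATE app_state SET current_batch = current_batch + 1;
```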


Sure... but every feature has a cost too, in maintenance and reliability.

This needs to have compelling use cases, and it's not free to add
multiple similar-ish/overlapping features. So some agreement on what
we should actually have is needed, along with a compelling case for
what it delivers that we can't already do.

Pavel's personal requirements include that it be well suited for
static analysis of plpgsql using his plpgsql_check tool. So he wants
persistent definitions.

You expect different things from variables, to the point where it's a
different feature.

It's unlikely two unrelated variable implementations will be accepted.

I think progress can only be achieved by setting out requirements, a
feature matrix, and which proposed implementations can deliver which
desired features. Then go from there.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

