Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-07 Thread Amit Langote
On 2017/09/08 14:47, Amit Langote wrote:
> When I tried the attached patch, it doesn't seem to expand partitioning
> inheritance in a step-wise manner as the patch's title says.

Oops.  By "attached patch", I had meant to say Robert's patch, that
is, expand-stepwise-rmh.patch.  Not expand-stepwise-rmh-2.patch, which is
the updated version of the patch attached with the quoted message.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-07 Thread Amit Langote
On 2017/09/08 4:04, Robert Haas wrote:
> On Tue, Sep 5, 2017 at 7:01 AM, Ashutosh Bapat
>  wrote:
>> accumulate_append_subpath() is executed for every path instead of
>> every relation, so changing it would collect the same list multiple
>> times. Instead, I found the old way of associating all intermediate
>> partitions with the root partitioned relation work better. Here's the
>> updated patch set.
> 
> When I tried out patch 0001, it crashed repeatedly during 'make check'
> because of an assertion failure in get_partitioned_child_rels.  It
> seemed to me that the way the patch was refactoring
> expand_inherited_rtentry involved more code rearrangement than
> necessary, so I reverted all the code rearrangement and just kept the
> functional changes, and all the crashes went away.  (That refactoring
> also failed to initialize has_child properly.)

When I tried the attached patch, it doesn't seem to expand partitioning
inheritance in a step-wise manner as the patch's title says.  I think the
rewritten patch forgot to include Ashutosh's changes to
expand_single_inheritance_child() whereby the AppendRelInfo of the child
will be marked with the direct parent instead of always the root parent.

I updated the patch to include just those changes.  I'm not sure about
one of Ashutosh's changes whereby the child PlanRowMark is also passed
to expand_partitioned_rtentry() to use as the parent PlanRowMark.  I think
the child RTE, child RT index and child Relation are fine, because they
are necessary for creating AppendRelInfos in a desired way for later
planning steps.  But PlanRowMarks are not processed within the planner
afterwards and do not need to be marked with the immediate parent-child
association in the same way that AppendRelInfos need to be.

I also included the changes to add_paths_to_append_rel() from my patch on
the "path toward faster partition pruning" thread.  We'd need that change,
because while add_paths_to_append_rel() is called for all partitioned
table RTEs in a given partition tree, expand_inherited_rtentry() would
have set up a PartitionedChildRelInfo only for the root parent, so
get_partitioned_child_rels() would not find the same for non-root
partitioned table rels and crash failing the Assert.  The changes I made
are such that we call get_partitioned_child_rels() only for the parent
rels that are known to correspond to root partitioned tables (or, as you
pointed out on the thread, "the table named in the query" as opposed to
those added to the query as a result of inheritance expansion).  In addition to
the relkind check on the input RTE, it would seem that checking that the
reloptkind is RELOPT_BASEREL would be enough.  But actually not, because
if a partitioned table is accessed in a UNION ALL query, reloptkind even
for the root partitioned table (the table named in the query) would be
RELOPT_OTHER_MEMBER_REL.  The only way to confirm that the input rel is
actually the root partitioned table is to check that its parent rel's RTE
is not an RTE_RELATION, because in the UNION ALL Append case the parent is
an RTE_SUBQUERY RT entry.

> One thing I notice is that if I rip out the changes to initsplan.c,
> the new regression test still passes.  If it's possible to write a
> test that fails without those changes, I think it would be a good idea
> to include one in the patch.  That's certainly one of the subtler
> parts of this patch, IMHO.

Back when this (step-wise expansion of partition inheritance) used to be a
patch in the original declarative partitioning patch series, Ashutosh had
reported a test query [1] that would fail to get a plan, for which we
came up with the initsplan.c changes in this patch as the solution:

ERROR:  could not devise a query plan for the given query

I tried that query again without the initsplan.c changes and somehow the
same error does not occur anymore.  It's strange because without the
initsplan.c changes, there is no way for partitions lower in the tree than
the first level to get the direct_lateral_relids and lateral_relids from
the root parent rel.  Maybe Ashutosh has a way to devise the failing
query again.


I also confirmed that the partition-pruning patch set works fine with this
patch instead of the patch on that thread with the same functionality,
which I will now drop from that patch set.  Sorry about the wasted time.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFjFpReZF34MDbY95xoATi0xVj2mAry4-LHBWVBayOc8gj%3Diqg%40mail.gmail.com

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 2d7e1d84d0..71b5bdf95e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -24,6 +24,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_proc.h"
 #include "foreign/fdwapi.h"
+#include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #ifdef OPTIMIZER_DEBUG
@@ -867,6 +868,9 @@ set_append_rel_size(PlannerInfo *root, 

Re: [HACKERS] psql: new help related to variables are not too readable

2017-09-07 Thread Fabien COELHO


Hello,


>>   PSQL_HISTORY       alternative location for the command history file
>>
>> I would prefer to revert to that more compact 9.6-formatting.
>
> There was a problem with line width... it's hard to respect 80 chars.

Yes.

Scrolling in two dimensions because it does not fit either way is not too
practical, so the idea was that it should at least fit a reasonable
terminal in the horizontal dimension, the vertical dimension having been
too large to fit for a long time anyway.


Once you want strict 80 columns, most descriptions do not fit and need to
be split onto two lines, one way or another.  It seemed that

  XXX
    xxx xx xxx xxx

is a good way to do that systematically, giving each description more
space and a better chance of fitting on its line.  ISTM that it was already
done like this for environment variables, so it is also for homogeneity.


It also simplifies work for translators, who can just follow the same
formatting and know what to do if a one-line English explanation does
not fit once translated.

Finally, as vertical scrolling is mandatory anyway, I would be fine with
blank lines between entries for readability, but that is just a matter of
taste, and I expect there would be half a dozen different opinions on the
matter of formatting.


--
Fabien.




Re: [HACKERS] Parallel worker error

2017-09-07 Thread Amit Kapila
On Fri, Sep 8, 2017 at 3:18 AM, Robert Haas  wrote:
> On Mon, Sep 4, 2017 at 5:46 AM, Amit Kapila  wrote:
>> It seems like the consensus is to move forward with this approach.  I
>> have written a patch implementing the above idea.  Note, that to use
>> SetCurrentRoleId, we need the value of guc "is_superuser" for the
>> current user and we don't pass this value to parallel workers as this
>> is PGC_INTERNAL guc variable.  So, I have passed this value via
>> FixedParallelState.
>
>
> With your patch, the order is wrong.  SetCurrentRoleId() is called
> AFTER SetUserIdAndSecContext().  Furthermore, the role ID passed to
> SetCurrentRoleId() is always the same one passed to
> SetUserIdAndSecContext().  But those roles might be different.
> fps->current_user_id is set by a call to GetUserIdAndSecContext(),
> which reads CurrentUserId, not OuterUserId.  I think you need to pass
> the value returned by GetCurrentRoleId() as an additional element of
> FixedParallelState.
>

You are right.  I have changed the ordering and passed OuterUserId via
FixedParallelState.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_role_handling_parallel_worker_v2.patch
Description: Binary data



Re: [HACKERS] psql: new help related to variables are not too readable

2017-09-07 Thread Pavel Stehule
2017-09-08 6:36 GMT+02:00 Erik Rijkers :

> On 2017-09-08 06:09, Pavel Stehule wrote:
>
>> Hi
>>
>> Now the output looks like:
>>
>>   AUTOCOMMIT
>> if set, successful SQL commands are automatically committed
>>   COMP_KEYWORD_CASE
>> determines the case used to complete SQL key words
>> [lower, upper, preserve-lower, preserve-upper]
>>   DBNAME
>> the currently connected database name
>>
> [...]
>
>> What do you think about using new line between entries in this format?
>>
>>   AUTOCOMMIT
>> if set, successful SQL commands are automatically committed
>>
>>   COMP_KEYWORD_CASE
>> determines the case used to complete SQL key words
>> [lower, upper, preserve-lower, preserve-upper]
>>
>>   DBNAME
>> the currently connected database name
>>
>>
> I dislike it; it takes more screen space and leads to unnecessary
> scrolling.
>
> The 9.6.5 formatting is/was:
>
>   AUTOCOMMIT         if set, successful SQL commands are automatically committed
>   COMP_KEYWORD_CASE  determines the case used to complete SQL key words
>                      [lower, upper, preserve-lower, preserve-upper]
>   DBNAME             the currently connected database name
> [...]
>   PGPASSWORD         connection password (not recommended)
>   PGPASSFILE         password file name
>   PSQL_EDITOR, EDITOR, VISUAL
>                      editor used by the \e, \ef, and \ev commands
>   PSQL_EDITOR_LINENUMBER_ARG
>                      how to specify a line number when invoking the editor
>   PSQL_HISTORY       alternative location for the command history file
>
> I would prefer to revert to that more compact 9.6-formatting.


There was a problem with line width... it's hard to respect 80 chars.


>
>
>
> Erik Rijkers
>


Re: [HACKERS] psql: new help related to variables are not too readable

2017-09-07 Thread Erik Rijkers

On 2017-09-08 06:09, Pavel Stehule wrote:

> Hi
>
> Now the output looks like:
>
>   AUTOCOMMIT
>     if set, successful SQL commands are automatically committed
>   COMP_KEYWORD_CASE
>     determines the case used to complete SQL key words
>     [lower, upper, preserve-lower, preserve-upper]
>   DBNAME
>     the currently connected database name
>
> [...]
>
> What do you think about using a new line between entries in this format?
>
>   AUTOCOMMIT
>     if set, successful SQL commands are automatically committed
>
>   COMP_KEYWORD_CASE
>     determines the case used to complete SQL key words
>     [lower, upper, preserve-lower, preserve-upper]
>
>   DBNAME
>     the currently connected database name


I dislike it; it takes more screen space and leads to unnecessary
scrolling.


The 9.6.5 formatting is/was:

  AUTOCOMMIT         if set, successful SQL commands are automatically committed
  COMP_KEYWORD_CASE  determines the case used to complete SQL key words
                     [lower, upper, preserve-lower, preserve-upper]
  DBNAME             the currently connected database name
[...]
  PGPASSWORD         connection password (not recommended)
  PGPASSFILE         password file name
  PSQL_EDITOR, EDITOR, VISUAL
                     editor used by the \e, \ef, and \ev commands
  PSQL_EDITOR_LINENUMBER_ARG
                     how to specify a line number when invoking the editor
  PSQL_HISTORY       alternative location for the command history file

I would prefer to revert to that more compact 9.6-formatting.


Erik Rijkers




Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-09-07 Thread Pavel Stehule
2017-08-16 14:06 GMT+02:00 Pavel Stehule :

> Hi
>
> 2017-08-15 4:37 GMT+02:00 Peter Eisentraut :
>
>> On 3/11/17 07:06, Pavel Stehule wrote:
>> > I am sending a updated version with separated sort direction in special
>> > variable
>>
>> This patch also needs a rebase.
>>
>
> I am sending rebased patch
>

rebased again + fix obsolete help

Regards

Pavel



>
> Regards
>
> Pavel
>
>
>>
>> --
>> Peter Eisentraut  http://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 79468a5663..d51c4bf900 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3796,6 +3796,27 @@ bar
   
 
   
+VERBOSE_SORT_COLUMNS
+
+
+This variable can be set to the values schema_name,
+name_schema or size to control the
+order of the output of describe commands.
+
+
+  
+
+  
+VERBOSE_SORT_DIRECTION
+
+
+This variable can be set to the values asc,
+or desc to control the order of the output of describe commands.
+
+
+  
+
+  
 VERBOSITY
 
 
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 6fb9bdd063..3ead55856d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -200,6 +200,8 @@ describeAccessMethods(const char *pattern, bool verbose)
 	return true;
 }
 
+#define SORT_DIRECTION_STR(v)		((v) == PSQL_SORT_ASC ? "ASC" : "DESC")
+
 /*
  * \db
  * Takes an optional regexp to select particular tablespaces
@@ -268,7 +270,18 @@ describeTablespaces(const char *pattern, bool verbose)
 		  NULL, "spcname", NULL,
 		  NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+
+	if (verbose && pset.sversion >= 90200)
+	{
+		if (pset.verbose_sort_columns == PSQL_SORT_SIZE)
+			appendPQExpBuffer(&buf,
+			  "ORDER BY pg_catalog.pg_tablespace_size(oid) %s, 1;",
+			  SORT_DIRECTION_STR(pset.verbose_sort_direction));
+		else
+			appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	}
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
@@ -830,7 +843,19 @@ listAllDbs(const char *pattern, bool verbose)
 		processSQLNamePattern(pset.db, &buf, pattern, false, false,
 			  NULL, "d.datname", NULL, NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	if (verbose && pset.sversion >= 80200)
+	{
+		if (pset.verbose_sort_columns == PSQL_SORT_SIZE)
+			appendPQExpBuffer(&buf,
+		  "ORDER BY CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')\n"
+		  "  THEN pg_catalog.pg_database_size(d.datname) END %s, 1;\n",
+		  SORT_DIRECTION_STR(pset.verbose_sort_direction));
+		else
+			appendPQExpBufferStr(&buf, "ORDER BY 1");
+	}
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1");
+
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
 	if (!res)
@@ -3424,7 +3449,26 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 		  "n.nspname", "c.relname", NULL,
 		  "pg_catalog.pg_table_is_visible(c.oid)");
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
+	if (verbose && pset.sversion >= 80100)
+	{
+		if (pset.verbose_sort_columns == PSQL_SORT_SCHEMA_NAME)
+			appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
+		else if (pset.verbose_sort_columns == PSQL_SORT_NAME_SCHEMA)
+			appendPQExpBufferStr(&buf, "ORDER BY 2,1;");
+		else
+		{
+			if (pset.sversion >= 90000)
+				appendPQExpBuffer(&buf,
+					"ORDER BY pg_catalog.pg_table_size(c.oid) %s, 1,2",
+					SORT_DIRECTION_STR(pset.verbose_sort_direction));
+			else
+				appendPQExpBuffer(&buf,
+					"ORDER BY pg_catalog.pg_relation_size(c.oid) %s, 1,2",
+					SORT_DIRECTION_STR(pset.verbose_sort_direction));
+		}
+	}
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
 
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 4d1c0ec3c6..a28fe07aa2 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -337,7 +337,7 @@ helpVariables(unsigned short int pager)
 	 * Windows builds currently print one more line than non-Windows builds.
 	 * Using the larger number is fine.
 	 */
-	output = PageOutput(147, pager ? &(pset.popt.topt) : NULL);
+	output = PageOutput(151, pager ? &(pset.popt.topt) : NULL);
 
 	fprintf(output, _("List of specially treated variables\n\n"));
 
@@ -401,6 +401,10 @@ helpVariables(unsigned short int pager)
 	  "the currently connected database user\n"));
 	fprintf(output, _("  VERBOSITY\n"
 	  "controls verbosity of error reports [default, verbose, terse]\n"));
+	fprintf(output, _("  VERBOSE_SORT_COLUMNS\n"
+	  "controls sort of result in verbose mode [schema_name, name_schema, size]\n"));
+	fprintf(output, _("  VERBOSE_SORT_DIRECTION\n"
+	  "controls direction of order of result in verbose mode [asc, desc]\n"));

[HACKERS] psql: new help related to variables are not too readable

2017-09-07 Thread Pavel Stehule
Hi

Now the output looks like:

  AUTOCOMMIT
if set, successful SQL commands are automatically committed
  COMP_KEYWORD_CASE
determines the case used to complete SQL key words
[lower, upper, preserve-lower, preserve-upper]
  DBNAME
the currently connected database name
  ECHO
controls what input is written to standard output
[all, errors, none, queries]
  ECHO_HIDDEN
if set, display internal queries executed by backslash commands;
if set to "noexec", just show them without execution
  ENCODING
current client character set encoding
  FETCH_COUNT
the number of result rows to fetch and display at a time (0 = unlimited)
  HISTCONTROL
controls command history [ignorespace, ignoredups, ignoreboth]
  HISTFILE
file name used to store the command history
  HISTSIZE
max number of commands to store in the command history
  HOST

What do you think about using a new line between entries in this format?

  AUTOCOMMIT
if set, successful SQL commands are automatically committed

  COMP_KEYWORD_CASE
determines the case used to complete SQL key words
[lower, upper, preserve-lower, preserve-upper]

  DBNAME
the currently connected database name

  ECHO
controls what input is written to standard output
[all, errors, none, queries]

  ECHO_HIDDEN
if set, display internal queries executed by backslash commands;
if set to "noexec", just show them without execution

  ENCODING
current client character set encoding

  FETCH_COUNT
the number of result rows to fetch and display at a time (0 = unlimited)

  HISTCONTROL
controls command history [ignorespace, ignoredups, ignoreboth]

  HISTFILE
file name used to store the command history

  HISTSIZE
max number of commands to store in the command history

Regards

Pavel


Re: [HACKERS] Tuple-routing for certain partitioned tables not working as expected

2017-09-07 Thread Etsuro Fujita

On 2017/09/08 0:21, Robert Haas wrote:

> On Thu, Sep 7, 2017 at 5:13 AM, Etsuro Fujita
>  wrote:
>> On 2017/08/30 17:20, Etsuro Fujita wrote:
>>> Here is a patch to skip the CheckValidResultRel checks for a target table
>>> if it's a foreign partition to perform tuple-routing for, as proposed by
>>> Robert.
>>
>> In the patch, to skip the checks, I passed to CheckValidResultRel a new flag
>> indicating whether the target relation is a partition to do tuple-routing
>> for, but while updating another patch for tuple-routing for foreign
>> partitions in PG11, I noticed that it would be better to pass ResultRelInfo
>> than that flag.  The reason is: (1) we can see whether the relation is such
>> a partition by looking at the ResultRelInfo's ri_PartitionRoot, instead of
>> that flag, and (2) this is for tuple-routing for foreign partitions, but we
>> could extend that function with that argument to do the checks without
>> throwing an error and save the result in a new member of ResultRelInfo (eg,
>> ri_PartitionIsValid) so that we can use that info to determine in ExecInsert
>> whether a foreign partition chosen by ExecFindPartition is valid for
>> tuple-routing.  So, I updated the patch that way.  Attached is an updated
>> version of the patch.
>
> OK, committed to master and v10.  I am less convinced than you that
> this was a must-fix issue, but it's not a very invasive change the way
> you did it here.


Thank you!

I'll update the tuple-routing-for-foreign-partitions patch that way.

Best regards,
Etsuro Fujita





Re: [HACKERS] GnuTLS support

2017-09-07 Thread Tom Lane
Andreas Karlsson  writes:
> On 09/07/2017 11:34 PM, Tomas Vondra wrote:
>> Well, people won't be able to set the inactive options, just like you
>> can't set ssl=on when you build without OpenSSL support. But perhaps we
>> could simply not include the inactive options into the config file, no?

> Yeah, I have been thinking about how bad it would be to dynamically 
> generate the config file. I think I will try this.

I'm not exactly convinced that dynamically inserting some parameters
and not others is a great idea.  Remember that people tend to copy
postgresql.conf files forward from one installation to another.  Or
they might decide to rebuild the postmaster for an existing installation
with a different SSL library.  In any scenario like that, you've not
done them any real favor if the config file they have contains no trace
of the SSL parameters they need.

I think we might be best off just playing it straight and providing
a config file that contains a section along these lines:

# Parameters for OpenSSL.  Leave these commented out if not using OpenSSL.
#
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
#ssl_prefer_server_ciphers = on
#ssl_ecdh_curve = 'prime256v1'
#ssl_dh_params_file = ''
#ssl_cert_file = 'server.crt'
#ssl_key_file = 'server.key'
#ssl_ca_file = ''
#ssl_crl_file = ''
#
# Parameters for GnuTLS.  Leave these commented out if not using GnuTLS.
#
#gnufoo=...
#gnubar=...
#
# Parameters for macOS TLS.  ... you get the idea.

As previously noted, it'd be a good idea to rename the existing
ssl_xxx parameters to openssl_xxx, except maybe ones that we think
will be universal.  (But even if we do think that, it might be
simpler in the long run to just have three or four totally independent
sections of the config file, instead of some common and some library-
specific parameters.)

regards, tom lane




Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2017-09-07 Thread Masahiko Sawada
On Fri, Sep 8, 2017 at 7:24 AM, Thomas Munro
 wrote:
> On Wed, Aug 16, 2017 at 2:13 PM, Masahiko Sawada  
> wrote:
>> The previous patch conflicts with current HEAD, I rebased the patch to
>> current HEAD.
>
> Hi Masahiko-san,
>
> FYI this doesn't build anymore.  I think it's just because the wait
> event enumerators were re-alphabetised in pgstat.h:
>
> ../../../../src/include/pgstat.h:820:2: error: redeclaration of
> enumerator ‘WAIT_EVENT_LOGICAL_SYNC_DATA’
>   WAIT_EVENT_LOGICAL_SYNC_DATA,
>   ^
> ../../../../src/include/pgstat.h:806:2: note: previous definition of
> ‘WAIT_EVENT_LOGICAL_SYNC_DATA’ was here
>   WAIT_EVENT_LOGICAL_SYNC_DATA,
>   ^
> ../../../../src/include/pgstat.h:821:2: error: redeclaration of
> enumerator ‘WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE’
>   WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE,
>   ^
> ../../../../src/include/pgstat.h:807:2: note: previous definition of
> ‘WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE’ was here
>   WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE,
>   ^
>

Thank you for the information! Attached rebased patch.

-- 
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Moving_extension_lock_out_of_heavyweight_lock_v4.patch
Description: Binary data



Re: [HACKERS] GnuTLS support

2017-09-07 Thread Andreas Karlsson

On 09/07/2017 11:34 PM, Tomas Vondra wrote:

>> I am worried about having 3x version of TLS controls in
>> postgresql.conf, and only one set being active. Perhaps we need to
>> break out the TLS config to separate files or something. Anyway, this
>> needs more thought.
>
> Well, people won't be able to set the inactive options, just like you
> can't set ssl=on when you build without OpenSSL support. But perhaps we
> could simply not include the inactive options into the config file, no?


Yeah, I have been thinking about how bad it would be to dynamically 
generate the config file. I think I will try this.


Daniel: What options does Secure Transport need for configuring ciphers, 
ECDH, and cipher preference? Does it need any extra options (I think I 
saw something about the keychain)?


Andreas




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Tom Lane
Simon Riggs  writes:
> On 7 September 2017 at 11:31, Tom Lane  wrote:
>> Haas' idea of some kind of syntactic extension, like "LET guc1 = x,
>> guc2 = y FOR statement" seems more feasible to me.  I'm not necessarily
>> wedded to that particular syntax, but I think it has to look like
>> a single-statement construct of some kind.

> Always happy to use a good idea... (any better way to re-locate that
> discussion?)

https://www.postgresql.org/message-id/ca+tgmobgd_uzrs44couty1odnbr0c_hjsxvx_dmrevz-cwu...@mail.gmail.com

> Requires a new GUC mode for "statement local" rather than "transaction local"

Yeah, something along that line.

regards, tom lane




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Tom Lane
Simon Riggs  writes:
> On 7 September 2017 at 11:24, Tom Lane  wrote:
>> Not hearing anything, I already pushed my patch an hour or three ago.

> Yes, I saw. Are you saying that doc commit is all we need? ISTM we
> still had an actual bug.

The originally reported bug is fixed.  Not making any claims about
other bugs ...

regards, tom lane




Re: [HACKERS] Adding support for Default partition in partitioning

2017-09-07 Thread Robert Haas
On Thu, Sep 7, 2017 at 8:13 AM, Jeevan Ladhe
 wrote:
> The fix would be much easier if the refactoring patch 0001 by Amul in hash
> partitioning thread[2] is committed.

Done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [POC] hash partitioning

2017-09-07 Thread Robert Haas
On Mon, Sep 4, 2017 at 6:38 AM, amul sul  wrote:
> I've updated patch to use an extended hash function (Commit #
> 81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

Committed 0001 after noticing that Jeevan Ladhe also found that change
convenient for default partitioning.  I made a few minor cleanups;
hopefully I didn't break anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2017-09-07 Thread Masahiko Sawada
On Fri, Sep 8, 2017 at 8:25 AM, Thomas Munro
 wrote:
> On Fri, Sep 8, 2017 at 10:24 AM, Thomas Munro
>  wrote:
>> On Wed, Aug 16, 2017 at 2:13 PM, Masahiko Sawada  
>> wrote:
>>> The previous patch conflicts with current HEAD, I rebased the patch to
>>> current HEAD.
>>
>> Hi Masahiko-san,
>
> Hi Sawada-san,
>
> I have just learned from a colleague who is knowledgeable about
> Japanese customs and kind enough to correct me that the appropriate
> term of address for our colleagues in Japan on this mailing list is
> -san.  I was confused about that -- apologies for my
> clumsiness.

Don't worry about it, either is ok. In Japan there is a custom of
writing -san but -san is also not incorrect :-)
(also I think it's hard to distinguish between last name and first
name of Japanese name).

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index

2017-09-07 Thread Thomas Munro
Hi Shubham,

On Tue, Jun 27, 2017 at 9:21 PM, Shubham Barai  wrote:
> If these two hash keys (78988658 and 546789888) mapped to the same bucket,
> this will result in a false serialization failure.
> Please correct me if this assumption about false positives is wrong.

I wonder if there is an opportunity to use computed hash values
directly in predicate lock tags, rather than doing it on the basis of
buckets.  Perhaps I'm missing something important about the way that
locks need to escalate that would prevent that from working.

> 3) tested my patch on the current head

This no longer applies, but it's in "Needs review" status in the
Commitfest.  Could you please post a rebased version?

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-09-07 Thread Masahiko Sawada
On Thu, Sep 7, 2017 at 4:15 PM, Fabien COELHO  wrote:
>
>>> Very very minor comments that I should have noticed before, sorry for
>>> this
>>> additional round trip.
>>
>>
>> Thank you for the dedicated review!
>
>
> I'm someone at times pigheaded, I think in the good sense if it is possible,
> and I like to finish what I start:-)
>
> Patch applies, compiles, works, everything is fine from my point of view.
>
> I switched it to "Ready for Committer".

Thanks.

>
> Again, if the pgbench tap test patch get through, it should be tap tested.
>

Thank you for the reminder; I'll add TAP tests once the patch gets committed.

-- 
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Simon Riggs
On 7 September 2017 at 11:31, Tom Lane  wrote:
> Simon Riggs  writes:
>> I would like to relax the restriction to allow this specific use case...
>>   SET work_mem = X; SET max_parallel_workers = 4; SELECT ...
>> so we still have only one command (the last select), yet we have
>> multiple GUC settings beforehand.
>
> On what basis do you claim that's only one command?  It would return
> multiple CommandCompletes, for starters, so that it breaks the protocol
> just as effectively as any other loosening.
>
> Moreover, I imagine the semantics you really want is that the SETs only
> apply for the duration of the command.  This wouldn't provide that
> result either.

> Haas' idea of some kind of syntactic extension, like "LET guc1 = x,
> guc2 = y FOR statement" seems more feasible to me.  I'm not necessarily
> wedded to that particular syntax, but I think it has to look like
> a single-statement construct of some kind.

Always happy to use a good idea... (any better way to re-locate that
discussion?)

1. Allow SET to set multiple parameters...
SET guc1 = x, guc2 = y
This looks fairly straightforward

2. Allow SET to work only for a single command...
SET guc1 = x, guc2 = y FOR query
Don't see anything too bad about that...
Requires a new GUC mode for "statement local" rather than "transaction local"

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Simon Riggs
On 7 September 2017 at 11:24, Tom Lane  wrote:
> Simon Riggs  writes:
>> On 5 September 2017 at 10:22, Tom Lane  wrote:
>>> Does anyone want to do further review on this patch?  If so, I'll
>>> set the CF entry back to "Needs Review".
>
>> OK, I'll review Michael's patch (and confirm my patch is dead)
>
> Not hearing anything, I already pushed my patch an hour or three ago.

Yes, I saw. Are you saying that doc commit is all we need? ISTM we
still had an actual bug.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall

2017-09-07 Thread Thomas Munro
On Mon, Aug 21, 2017 at 4:35 PM, Haribabu Kommi
 wrote:
> On Tue, Aug 15, 2017 at 7:29 AM, Peter Eisentraut
>  wrote:
>> On 4/4/17 01:06, Haribabu Kommi wrote:
>> > Both pg_dump and pg_upgrade tests are passed. Updated patch attached
>> > I will add this patch to the next commitfest.
>>
>> This patch needs to be rebased for the upcoming commit fest.
>
> Thanks for checking. Rebased patch is attached.

Hi Haribabu,

This patch breaks the documentation build, possibly because of these empty tags:

+--create option to dump <>ALTER ROLE IN
DATABASE ... SET

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] [PATCH] Pattern based listeners for asynchronous messaging (LISTEN/NOTIFY)

2017-09-07 Thread Marko Tiikkaja
Hi Markus,

On Sun, Aug 20, 2017 at 9:56 PM, Markus Sintonen 
wrote:

> I also encountered this when I built it with different configuration. I
> attached updated patch with the correct number of arguments to
> 'similar_escape'. I also added preliminary documentation to the patch.
> (Unfortunately unable to currently compile the documentation for testing
> purpose on Windows probably because of commit
> https://github.com/postgres/postgres/commit/510074f9f0131a04322d6a3d2a51c87e6db243f9.
> I followed
> https://www.postgresql.org/docs/devel/static/install-windows-full.html#idm45412738673840.)
>
> What do you think about the syntax? There was a suggestion to specify type
> of the pattern (eg ltree extension) but to me this feels like a overkill.
> One option here would be eg:
> LISTEN PATTERN 'foo%' TYPE 'similar'
> LISTEN PATTERN 'foo.*' TYPE 'ltree'
>

Not that my opinion matters, but I think we should pick one pattern style
and stick to it.  SIMILAR TO doesn't seem like the worst choice.  ltree
seems useless.

As for the rest of the interface..

First, I think mixing patterns and non-patterns is weird.  This is apparent
in at least two cases:

marko=# listen "foo%";
LISTEN
marko=# listen similar to 'foo%';
LISTEN
marko=# select * from pg_listening_channels();
 pg_listening_channels
-----------------------
 foo%
(1 row)

-- Not actually listening on the pattern.  Confusion.

The second case being the way UNLISTEN can be used to unlisten patterns,
too.  It kind of makes sense given that you can't really end up with both a
channel name and a pattern with the same source string, but it's still
weird.  I think it would be much better to keep these completely separate
so that you could be listening on both the channel "foo%" and the pattern
'foo%' at the same time, and you'd use a bare UNLISTEN to unsubscribe from
the former, and UNLISTEN SIMILAR TO for the latter.  As you said in the
original email:

> Allow *UNLISTEN SIMILAR TO 'xxx'* which would unregister matching
listeners. To me this feels confusing therefore it is not in the patch.

I agree, starting to match the patterns themselves would be confusing.  So
I think we should use that syntax for unsubscribing from patterns.  But
others should feel free to express their opinions on this.

Secondly -- and this is a kind of continuation to my previous point of
conflating patterns and non-patterns -- I don't think you can get away with
not changing the interface for pg_listening_channels().  Not knowing which
ones are compared byte-for-byte and which ones are patterns just seems
weird.


As for the patch itself, I have a couple of comments.  I'm writing this
based on the latest commit in your git branch, commit
fded070f2a56024f931b9a0f174320eebc362458.

In queue_listen(), the new variables would be better declared at the
innermost scope possible.  The datum is only used if isSimilarToPattern is
true, errormsg only if compile_regex didn't return REG_OKAY, etc..

I found this comment confusing at first: "If compiled RE was not applied as
a listener then it is freed at transaction commit."  The past tense makes
it seem like something that has already happened when that code runs, when
in reality it happens later in the transaction.

I'm not a fan of the dance you're doing with pcompreg.  I think it would be
better to optimistically allocate the ListenAction struct and compile
directly into actrec->compiledRegex.

The changed DEBUG1 line in Async_Listen should include whether it's a
pattern or not.

I don't understand why the return value of Exec_UnlistenAllCommit() was
changed at all.  Why do we need to do something different based on whether
listenChannels was empty or not?  The same goes for Exec_UnlistenCommit.

This looks wrong in isolationtester.c:

@@ -487,7 +488,7 @@ run_permutation(TestSpec *testspec, int nsteps, Step
**steps)
res = PQexec(conns[0], testspec->setupsqls[i]);
if (PQresultStatus(res) == PGRES_TUPLES_OK)
{
-   printResultSet(res);
+   printResultSet(res, conns[i + 1]);

(conns[0] vs. conns[i + 1]).

Moving to Waiting on Author.


.m


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-09-07 Thread Tom Lane
I wrote:
> Dmitriy Sarafannikov  writes:
>> [ snapshot_non_vacuumable_v3.patch ]

> In short I think we should just set up the threshold as RecentGlobalXmin.

Pushed with that adjustment and some fooling with the comments.

regards, tom lane




Re: [HACKERS] log_destination=file

2017-09-07 Thread Magnus Hagander
On Thu, Sep 7, 2017 at 7:02 AM, Tom Lane  wrote:

> Dmitry Dolgov <9erthali...@gmail.com> writes:
> >> On 31 August 2017 at 14:49, Tom Lane  wrote:
> >> pgbench with log_statement = all would be a pretty easy test case.
>
> > It seems that for this particular workload it was about 20-25% slower.
>
> Ouch.  That seems like rather a large hit :-(.  I actually expected
> it to be close to negligible, but I don't think 20% qualifies.
>

Agreed, that's a lot more than I expected. A few questions though:

1. where was stderr actually sent? To the console, to /dev/null or to a
file?

2. Could you run the same thing (on the same machine) with the logging
collector on and logging to file, without the patch? I'd assume that gives
performance similar to running with the patch, but it would be good to
confirm that.

And thanks for running the benchmark, saved me some time!


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2017-09-07 Thread Thomas Munro
On Fri, Sep 8, 2017 at 10:24 AM, Thomas Munro
 wrote:
> On Wed, Aug 16, 2017 at 2:13 PM, Masahiko Sawada  
> wrote:
>> The previous patch conflicts with current HEAD, I rebased the patch to
>> current HEAD.
>
> Hi Masahiko-san,

Hi Sawada-san,

I have just learned from a colleague who is knowledgeable about
Japanese customs and kind enough to correct me that the appropriate
term of address for our colleagues in Japan on this mailing list is
-san.  I was confused about that -- apologies for my
clumsiness.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2017-09-07 Thread Thomas Munro
On Wed, Aug 16, 2017 at 2:13 PM, Masahiko Sawada  wrote:
> The previous patch conflicts with current HEAD, I rebased the patch to
> current HEAD.

Hi Masahiko-san,

FYI this doesn't build anymore.  I think it's just because the wait
event enumerators were re-alphabetised in pgstat.h:

../../../../src/include/pgstat.h:820:2: error: redeclaration of
enumerator ‘WAIT_EVENT_LOGICAL_SYNC_DATA’
  WAIT_EVENT_LOGICAL_SYNC_DATA,
  ^
../../../../src/include/pgstat.h:806:2: note: previous definition of
‘WAIT_EVENT_LOGICAL_SYNC_DATA’ was here
  WAIT_EVENT_LOGICAL_SYNC_DATA,
  ^
../../../../src/include/pgstat.h:821:2: error: redeclaration of
enumerator ‘WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE’
  WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE,
  ^
../../../../src/include/pgstat.h:807:2: note: previous definition of
‘WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE’ was here
  WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE,
  ^

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel

2017-09-07 Thread Tomas Vondra
Hi,

On 07/21/2017 03:40 PM, Alexander Korotkov wrote:
> Hi, Alexey!
> 
> On Fri, Jul 21, 2017 at 3:05 PM, Alexey Chernyshov
> > wrote:
> 
> the following patch transfers functionality from gevel module
> (http://www.sai.msu.su/~megera/wiki/Gevel
> ) which provides functions for
> analyzing GIN and GiST indexes to pageinspect. Gevel was originally
> designed by Oleg Bartunov, and Teodor Sigaev for developers of GiST and
> GIN indexes.
> 
> 
> It's very good that you've picked up this work!  pageinspect is lacking
> of this functionality.  
> 
> Functions added:
>  - gist_stat(text) - shows statistics on GiST Tree
>  - gist_tree(text) - shows GiST tree
>  - gist_tree(text, int4) - shows GiST tree up to MAXLEVEL
>  - gist_print(text) - prints objects stored in GiST tree
>  - spgist_stat(text) - shows statistics on SP-GiST
>  - spgist_print(text) - prints objects stored in index
>  - gin_value_count() - originally gin_stat(text) - prints estimated
> counts
> for index values
>  - gin_stats() - originally gin_statpage(text) - shows statistics
>  - gin_count_estimate(text, tsquery) - shows number of indexed rows
> matched
> query
> 
> Tests also transferred, docs for new functions are added. I run pgindent
> over the code, but the result is different from those I expected, so
> I leave
> pgindented one.
> The patch is applicable to the commit
> 866f4a7c210857aa342bf901558d170325094dde.
> 
> 
> As we can see, gevel contains functions which analyze the whole index.
>  pageinspect is written in another manner: it gives you functionality to
> analyze individual pages, tuples and so on.
> Thus, we probably shouldn't try to move gevel functions to pageinspect
> "as is".  They should be rewritten in more granular manner as well as
> other pageinspact functions are.  Any other opinions?
>  

I agree with Alexander that pageinspect is written in a very different
way - as the extension name suggests, it's used to inspect pages. The
proposed patch uses a very different approach, reading the whole index,
not individual pages. Why should it be part of pageinspect?

For example we have pgstattuple extension, which seems like a better
match. Or just create a new extension - if the code is valuable, surely
we can live with one more extension instead of smuggling it inside
pageinspect.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] GnuTLS support

2017-09-07 Thread Magnus Hagander
On Thu, Sep 7, 2017 at 2:34 PM, Tomas Vondra 
wrote:

> Hi,
>
> On 09/04/2017 04:24 PM, Bruce Momjian wrote:
> > On Fri, Sep  1, 2017 at 12:09:35PM -0400, Robert Haas wrote:
> >> I think that what this shows is that the current set of GUCs is overly
> >> OpenSSL-centric.  We created a set of GUCs that are actually specific
> >> to one particular implementation but named them as if they were
> >> generic.  My idea about this would be to actually rename the existing
> >> GUCs to start with "openssl" rather than "ssl", and then add new GUCs
> >> as needed for other SSL implementations.
> >>
> >> Depending on what we think is best, GUCs for an SSL implementation
> >> other than the one against which we compiled can either not exist at
> >> all, or can exist but be limited to a single value (e.g. "none", as we
> >> currently do when the compile has no SSL support at all).  Also, we
> >> could add a read-only GUC with a name like ssl_library that exposes
> >> the name of the underlying SSL implementation - none, openssl, gnutls,
> >> or whatever.
> >>
> >> I think if we go the route of insisting that every SSL implementation
> >> has to use the existing GUCs, we're just trying to shove a square peg
> >> into a round hole, and there's no real benefit for users.  If the
> >> string that has to be stuffed into ssl_ciphers differs based on which
> >> library was chosen at compile time, then you can't have a uniform
> >> default configuration for all libraries anyway.  I think it'll be
> >> easier to explain and document this if there's separate documentation
> >> for openssl_ciphers, gnutls_ciphers, etc. rather than one giant
> >> documentation section that tries to explain every implementation
> >> separately.
> >
> > I am worried about having 3x version of TLS controls in
> > postgresql.conf, and only one set being active. Perhaps we need to
> > break out the TLS config to separate files or something. Anyway, this
> > needs more thought.
> >
>
> Well, people won't be able to set the inactive options, just like you
> can't set ssl=on when you build without OpenSSL support. But perhaps we
> could simply not include the inactive options into the config file, no?
>

We build the pg_hba.conf file dynamically depending on if we have ipv6
support, IIRC. Maybe we need to implement that type of support into
postgresql.conf as well?

It will still be a mess though -- documentation, tutorials, and whatnot
will depend on which library was used. But I'm not sure we can really get
around that.

Do we have some examples of how other products that support multiple
libraries do to handle this?


>
> I don't see how breaking the TLS config into separate files improves the
> situation - instead of dealing with GUCs in a single file you'll be
> dealing with the same GUCs in multiple files. No?
>

+1. I don't think splitting them up into different files makes it in any
way better -- if anything, it makes it worse.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] The case for removing replacement selection sort

2017-09-07 Thread Robert Haas
On Thu, Sep 7, 2017 at 5:38 PM, Tomas Vondra
 wrote:
> Do we need/want to repeat some of that benchmarking on these patches? I
> don't recall how much this code changed since those benchmarks were done
> in the 9.6 cycle.

+1 for some new benchmarking.  I'm all for removing this code if we
don't need it any more, but it'd be a lot better to have more numbers
before deciding that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel worker error

2017-09-07 Thread Robert Haas
On Mon, Sep 4, 2017 at 5:46 AM, Amit Kapila  wrote:
> It seems like the consensus is to move forward with this approach.  I
> have written a patch implementing the above idea.  Note, that to use
> SetCurrentRoleId, we need the value of guc "is_superuser" for the
> current user and we don't pass this value to parallel workers as this
> is PGC_INTERNAL guc variable.  So, I have passed this value via
> FixedParallelState.

If I apply this patch and run 'make check', the tests all pass.  If I
then revert all the changes to parallel.c and keep only the changes to
guc.c, the tests still pass.  IOW, the part of this patch that makes
the tests pass is the changes to guc.c, that just skip saving and
restoring the role altogether.

It looks to me like we need to get all of the following variables from
miscinit.c set to the correct values:

static Oid  AuthenticatedUserId = InvalidOid;
static Oid  SessionUserId = InvalidOid;
static Oid  OuterUserId = InvalidOid;
static Oid  CurrentUserId = InvalidOid;
static bool AuthenticatedUserIsSuperuser = false;
static bool SessionUserIsSuperuser = false;
static int  SecurityRestrictionContext = 0;
static bool SetRoleIsActive = false;

To do that, I think we first need to call InitializeSessionUserId(),
which will set AuthenticatedUserId and AuthenticatedUserIsSuperuser.
Then, we need to call SetSessionUserId(), which will set SessionUserId
and SessionUserIsSuperuser.  Then, we need to call SetCurrentRoleId(),
which will set SetRoleIsActive and OuterUserId.  Then finally we need
to call SetUserIdAndSecContext, which sets CurrentUserId and
SecurityRestrictionContext.  The order matters, because the earlier
functions in this series overwrite the values set by later ones as a
side-effect.  Furthermore, it matters in each case that the values
passed to those functions are taken from the corresponding variables
in the leader.

In the unpatched code, looking at ParallelWorkerMain(), we call
BackgroundWorkerInitializeConnectionByOid() first; that calls
InitializeSessionUserId().  Next we call RestoreGUCState(), which
because of the restore-the-role-last hack calls SetSessionUserId (when
session_authorization is restored) followed by SetCurrentRoleId()
(when role is restored).  Finally it calls SetUserIdAndSecContext().

With your patch, the order is wrong.  SetCurrentRoleId() is called
AFTER SetUserIdAndSecContext().  Furthermore, the role ID passed to
SetCurrentRoleId() is always the same one passed to
SetUserIdAndSecContext().  But those roles might be different.
fps->current_user_id is set by a call to GetUserIdAndSecContext(),
which reads CurrentUserId, not OuterUserId.  I think you need to pass
the value returned by GetCurrentRoleId() as an additional element of
FixedParallelState.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] The case for removing replacement selection sort

2017-09-07 Thread Tomas Vondra
Hi,

On 08/31/2017 02:56 AM, Peter Geoghegan wrote:
> On Wed, Aug 30, 2017 at 5:38 PM, Robert Haas  wrote:
>> Wow.  Just to be clear, I am looking for the BEST case for replacement
>> selection, not the worst case.  But I would have expected that case to
>> be a win for replacement selection, and it clearly isn't.  I can
>> reproduce your results here.
> 
> But I *was* trying to get a best case. That's why it isn't even worse.
> That's what the docs say the best case is, after all.
> 
> This is the kind of thing that replacement selection actually did do
> better with on 9.6. I clearly remember Tomas Vondra doing lots of
> benchmarking, showing some benefit with RS with a work_mem of 8MB or
> less. As I said in my introduction on this thread, we weren't wrong to
> add replacement_sort_tuples to 9.6, given where things were with
> merging at the time. But, it does very much appear to create less than
> zero benefit these days. The picture changed.
> 

Do we need/want to repeat some of that benchmarking on these patches? I
don't recall how much this code changed since those benchmarks were done
in the 9.6 cycle.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] GnuTLS support

2017-09-07 Thread Tomas Vondra
Hi,

On 09/04/2017 04:24 PM, Bruce Momjian wrote:
> On Fri, Sep  1, 2017 at 12:09:35PM -0400, Robert Haas wrote:
>> I think that what this shows is that the current set of GUCs is overly
>> OpenSSL-centric.  We created a set of GUCs that are actually specific
>> to one particular implementation but named them as if they were
>> generic.  My idea about this would be to actually rename the existing
>> GUCs to start with "openssl" rather than "ssl", and then add new GUCs
>> as needed for other SSL implementations.
>>
>> Depending on what we think is best, GUCs for an SSL implementation
>> other than the one against which we compiled can either not exist at
>> all, or can exist but be limited to a single value (e.g. "none", as we
>> currently do when the compile has no SSL support at all).  Also, we
>> could add a read-only GUC with a name like ssl_library that exposes
>> the name of the underlying SSL implementation - none, openssl, gnutls,
>> or whatever.
>>
>> I think if we go the route of insisting that every SSL implementation
>> has to use the existing GUCs, we're just trying to shove a square peg
>> into a round hole, and there's no real benefit for users.  If the
>> string that has to be stuffed into ssl_ciphers differs based on which
>> library was chosen at compile time, then you can't have a uniform
>> default configuration for all libraries anyway.  I think it'll be
>> easier to explain and document this if there's separate documentation
>> for openssl_ciphers, gnutls_ciphers, etc. rather than one giant
>> documentation section that tries to explain every implementation
>> separately.
> 
> I am worried about having 3x version of TLS controls in
> postgresql.conf, and only one set being active. Perhaps we need to
> break out the TLS config to separate files or something. Anyway, this
> needs more thought.
> 

Well, people won't be able to set the inactive options, just like you
can't set ssl=on when you build without OpenSSL support. But perhaps we
could simply not include the inactive options into the config file, no?

I don't see how breaking the TLS config into separate files improves the
situation - instead of dealing with GUCs in a single file you'll be
dealing with the same GUCs in multiple files. No?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-09-07 Thread Tom Lane
Dmitriy Sarafannikov  writes:
> [ snapshot_non_vacuumable_v3.patch ]

Starting to look at this.  I think that the business with choosing
RecentGlobalXmin vs. RecentGlobalDataXmin is just wrong.  What we
want to do is accept any tuple that would not be considered killable
by heap_hot_search_buffer, and that looks only at RecentGlobalXmin.

It's possible that it'd be sensible for heap_hot_search_buffer to
use RecentGlobalDataXmin when dealing with a non-catalog table,
allowing it to consider more tuples killable.  But I'm not sure;
it looks like the decision which to use costs some effort, which
we probably don't want to be spending in heap_hot_search_buffer.
In any case that would be a different patch topic.

With the patch as it stands, if we have a user-data tuple that is dead
more recently than RecentGlobalXmin but not more recently than
RecentGlobalDataXmin, the snapshot will reject it so the planner's
indexscan will keep searching.  But heap_hot_search_buffer won't
tell the index AM to kill the tuple, so the index entry stays marked
as live, which means the next get_actual_variable_range call will have
to scan past it again.  So we don't get the hoped-for benefit with
tuple deaths in that range.  I do not know whether this effect might
explain your finding that the patch didn't entirely fix your performance
problem.

In short I think we should just set up the threshold as RecentGlobalXmin.
However, this seems like a decision that might be pretty specific to
get_actual_variable_range's use-case, so I'm inclined to have the
threshold be chosen by get_actual_variable_range itself not hard-wired
into InitNonVacuumableSnapshot.

Another point is that I think it might be possible for RecentGlobalXmin
to advance after get_actual_variable_range looks at it (for example,
if we have to take a new catalog snapshot during index scan setup).
This creates the possibility that the snapshot accepts tuples that
heap_hot_search_buffer would consider dead.  That seems fairly harmless
though, so I'm inclined not to fuss about it.  We could arrange for
HeapTupleSatisfiesNonVacuumable to use RecentGlobalXmin directly,
but that seems like it makes it too special-purpose.

regards, tom lane




Re: [HACKERS] Remove 1MB size limit in tsvector

2017-09-07 Thread Tomas Vondra
Hi,

On 08/17/2017 12:23 PM, Ildus Kurbangaliev wrote:
> In my benchmarks when database fits into buffers (so it's measurement of
> the time required for the tsvectors conversion) it gives me these
> results:
> 
> Without conversion:
> 
> $ ./tsbench2 -database test1 -bench_time 300
> 2017/08/17 12:04:44 Number of connections:  4
> 2017/08/17 12:04:44 Database:  test1
> 2017/08/17 12:09:44 Processed: 51419
> 
> With conversion:
> 
> $ ./tsbench2 -database test1 -bench_time 300
> 2017/08/17 12:14:31 Number of connections:  4
> 2017/08/17 12:14:31 Database:  test1
> 2017/08/17 12:19:31 Processed: 43607
> 
> I ran a bunch of these tests, and these results are stable on my
> machine. So in these specific tests performance regression about 15%.
> 
> Same time I think this could be the worst case, because usually data
> is on disk and conversion will not affect so much to performance.
> 

That seems like a fairly significant regression, TBH. I don't quite
agree we can simply assume in-memory workloads don't matter, plenty of
databases have 99% cache hit ratio (particularly when considering not
just shared buffers, but also page cache).

Can you share the benchmarks, so that others can retry running them?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Hooks to track changed pages for backup purposes

2017-09-07 Thread Tomas Vondra
Hi,

On 09/01/2017 08:13 AM, Andrey Borodin wrote:
> Thank you for your reply, Michael! Your comments are valuable,
> especially in the world of backups.
>
>> On 31 Aug 2017, at 19:44, Michael Paquier  wrote:
>> Such things are not Postgres-C like.
> Will be fixed.
>

A few more comments:

* The patch defines wal_switch_hook, but it's never called.

* I see there are conditions like this:

if(xlogreader->blocks[nblock].forknum == MAIN_FORKNUM)

Why is it enough to restrict the block-tracking code to main fork?
Aren't we interested in all relation forks?

>> I don't understand what xlog_begin_insert_hook() is good for.
> memset control structures and array of blocknos and relfilenodes of
> the size XLR_MAX_BLOCK_ID.
>

Why can't that happen in the other hooks? I guess you'll have to explain
what the implementation of the hooks is supposed to do, and why these
locations for hook calls are the right ones. It's damn impossible to
validate the patch without that information.

Assuming you still plan to use the hook approach ...

>> There
>> are no arguments fed to this hook, so modules would not be able to
>> analyze things in this context, except shared memory and process
>> state?
>
>>
>> Those hooks are put in hot code paths, and could impact performance of
>> WAL insertion itself.
> I do not think sending few bytes to cached array is comparable to disk
> write of XLog record. Checking the func ptr is even cheaper with correct
> branch prediction.
>

That seems somewhat suspicious, for two reasons. Firstly, I believe we
only insert the XLOG records into WAL buffer here, so why should there
be any disk write related? Or do you mean the final commit?

But more importantly, doesn't this kind of information require some
durability guarantees? I mean, if it gets lost during server crashes or
restarts, doesn't that mean the incremental backups might miss some
buffers? I'd guess the hooks will have to do some sort of I/O, to
achieve that, no?

In any case, claims like this probably deserve some benchmarking.

>> So you basically move the cost of scanning WAL
>> segments for those blocks from any backup solution to the WAL
>> insertion itself. Really, wouldn't it be more simple to let for
>> example the archiver process to create this meta-data if you just want
>> to take faster backups with a set of segments? Even better, you could
>> do a scan after archiving N segments, and then use M jobs to do this
>> work more quickly. (A set of background workers could do this job as
>> well).
> I like the idea of doing this during archiving. It is different
> trade-off between performance of OLTP and performance of backuping.
> Essentially, it is parsing WAL some time before doing backup. The best
> thing about it is usage of CPUs that are usually spinning in idle loop
> on backup machines.
>
>> In the backup/restore world, backups can be allowed to be taken at a
>> slow pace, what matters is to be able to restore them quickly.
> Backups are taken much more often than restored.
>

I agree. The ability to take backups quickly (and often) is just as
important as fast recovery.

>> In short, anything moving performance from an external backup code path
>> to a critical backend code path looks like a bad design to begin with.
>> So I am dubious that what you are proposing here is a good idea.
> I will think about it more. This proposal takes vanishingly small part
> of backend performance, but, indeed, nonzero part.
>

But I think this is not necessarily just about code paths - moving it to
the archiver or a bunch of background workers may easily have just as
negative effect (or event worse), as it's using the same CPUs, has to
actually re-read the WAL segments from disk (which may be a lot of I/O,
particularly when done from multiple processes).

From this POV, the idea to collect this information on the backup system
(WAL archive) by pre-processing the arriving WAL segments seems like the
most promising. It moves the work to another system, the backup system
can make it as durable as the WAL segments, etc.
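As a rough illustration of that pre-processing idea (this is a hedged sketch, not part of any posted patch), the archive side could scan pg_waldump-style output for block references and accumulate, per relation, the set of blocks each WAL segment touched - the metadata an incremental backup could later consume. The line format assumed here is modeled on pg_waldump's "blkref #N: rel spc/db/rel blk N" output.

```python
import re
from collections import defaultdict

# Block references as pg_waldump prints them, e.g.
#   blkref #0: rel 1663/13090/16396 blk 42
#   blkref #1: rel 1663/13090/16396 fork fsm blk 0
BLKREF = re.compile(r"blkref #\d+: rel (\d+/\d+/\d+)(?: fork \w+)? blk (\d+)")

def collect_modified_blocks(waldump_lines):
    """Accumulate, per relation, the blocks mentioned by any WAL record.
    This is the per-segment metadata an incremental backup could use."""
    modified = defaultdict(set)
    for line in waldump_lines:
        for rel, blk in BLKREF.findall(line):
            modified[rel].add(int(blk))
    return modified

sample = [
    "rmgr: Heap len (rec/tot): 54/54, blkref #0: rel 1663/13090/16396 blk 42",
    "rmgr: Btree len (rec/tot): 72/72, blkref #0: rel 1663/13090/16402 blk 1",
    "rmgr: Heap len (rec/tot): 54/54, blkref #0: rel 1663/13090/16396 blk 42",
]
print(collect_modified_blocks(sample))
```

Multiple such scanners could run in parallel over different segments, and the resulting per-segment sets merged, which matches the "M jobs" idea upthread.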


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-07 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Sep 7, 2017 at 2:40 PM, Amit Langote
>  wrote:
>> On 2017/09/07 13:09, Michael Paquier wrote:
>>> pd_lower should remain at 0 on pre-10 servers.

>> Doesn't PageInit(), which is where any page gets initialized, always
>> set pd_lower to SizeOfPageHeaderData?

> Yes, sorry. I had a mental slip here..

Indeed, which raises the question: how did any of this code work at all?
Up to now, these pages have been initialized with pd_lower =
SizeOfPageHeaderData, *not* zero.  Which means that the FPI code would
have considered the metadata to be part of a valid "hole" and not stored
it, if we'd ever told the FPI code that the metadata page was of standard
format.  I've just looked through the code for these three index types and
can't find any place where they do that --- for instance, they don't pass
REGBUF_STANDARD to XLogRegisterBuffer when writing their metapages, and
they pass page_std = false to log_newpage when using that for metapages.
Good thing, because they'd be completely broken otherwise.

This means that the premise of this patch is wrong.  Adjusting pd_lower,
by itself, would accomplish precisely zip for WAL compression, because
we'd still not be telling the WAL code to compress out the hole.

To get any benefit, I think we'd need to do all of the following:

1. Initialize pd_lower correctly in the metapage init functions, as here.

2. Any place we are about to write the metapage, set its pd_lower to the
correct value, in case it's an old index that doesn't have that set
correctly.  Fortunately this is cheap enough that we might as well just
do it unconditionally.

3. Adjust all the xlog calls to tell them the page is of standard format.
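To make the failure mode concrete, here is a toy sketch (plain Python with stand-in constants, not PostgreSQL code; SizeOfPageHeaderData is approximated and the metadata size is hypothetical) of why telling the FPI code a metapage is standard format, without first fixing pd_lower, would discard the metadata:

```python
PAGE_SIZE = 8192
HEADER_SIZE = 24          # stand-in for SizeOfPageHeaderData
META_SIZE = 40            # hypothetical metapage payload size

def compress_hole(page, pd_lower, pd_upper):
    """What the FPI code effectively does for a standard-format page:
    omit the bytes between pd_lower and pd_upper (the "hole")."""
    return page[:pd_lower] + page[pd_upper:]

# Header, then metadata, then unused space up to the end of the page.
meta = b"M" * META_SIZE
page = b"H" * HEADER_SIZE + meta + b"\x00" * (PAGE_SIZE - HEADER_SIZE - META_SIZE)

# pd_lower left at the header boundary (the historical state): the
# metadata sits inside the hole and would be compressed away.
assert meta not in compress_hole(page, HEADER_SIZE, PAGE_SIZE)

# pd_lower advanced past the metadata (steps 1 and 2) before telling the
# WAL code the page is standard format (step 3): the metadata survives.
assert meta in compress_hole(page, HEADER_SIZE + META_SIZE, PAGE_SIZE)
```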

Now, one advantage we'd get from this is that we could say confidently
that any index metapage appearing in a WAL stream generated by v11 or
later has got the right pd_lower; therefore we could dispense with
checking for wrong pd_lower in the mask functions (unless they're used
in some way I don't know about, which is surely possible).

BTW, while nbtree correctly initializes pd_lower, it looks to me like it
is not exploiting that, because it seems never to pass REGBUF_STANDARD for
the metapage anyway.  I think this doesn't matter performance-wise for
nbtree, because it seems to always pass REGBUF_WILL_INIT instead.
But if we do likewise in other index types then they aren't really going
to win anyway.  GIN at least looks like it might do that; I have not
gone through any of the index types in detail.

In short, this patch needs a significant rewrite, and more analysis than
you've done so far on whether there's actually any benefit to be gained.
It might not be worth messing with.

I'll set the CF entry back to Waiting on Author.

regards, tom lane




Re: [HACKERS] psql - add special variable to reflect the last query status

2017-09-07 Thread Fabien COELHO


Hello Pavel,


> I checked performance - the fastest queries are executions of a simple
> prepared statement:
>
> prepare x as select 1;
> -- 100 x
> execute x;
> execute x;
> execute x;
> execute x;
>
> ## patched
> [pavel@nemesis postgresql]$ time psql -At -1 postgres -f ~/xxx.sql >
> /dev/null
>
> real 0m44,887s
> user 0m11,703s
> sys 0m6,942s
>
> This is probably the worst case possible, and it does show some slowdown -
> in this case about 10% -
>
> but it is pretty atypical - each query took less than 50 microseconds.
> When there is any IO activity or network usage, this patch doesn't
> create any significant overhead.


Interesting. Thanks for the test.

I tried to replicate with a variant without any output: "SELECT;"

  SELECT NOW() AS start \gset
  BEGIN;
  SELECT; -- 2^19 times
  END;
  SELECT NOW() - :'start';

The run time is about 19 µs per SELECT on my laptop. Over 33 runs each,
alternating between master with and without the patch, I got the following
stats on the measured time in seconds (average ± stddev [min Q1 median Q3 max]):


 - with   : 9.898 ± 0.158 [9.564, 9.762, 9.936, 10.037, 10.108]
 - without: 9.419 ± 0.294 [8.670, 9.226, 9.533, 9.625, 9.845]

This seems consistent and significant. It suggests a 0.40-0.50 s
difference, that is about 5%, i.e. just under 1 µs of overhead per
statement in pretty unfavorable circumstances.
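For reference, the summary line above can be reproduced with a few lines of Python (a convenience sketch for the methodology, not part of the patch):

```python
import statistics

def summarize(times):
    """Return (mean, stddev, min, Q1, median, Q3, max) for a list of
    measured run times, matching the format of the figures above."""
    q1, med, q3 = statistics.quantiles(times, n=4, method="inclusive")
    return (statistics.mean(times), statistics.stdev(times),
            min(times), q1, med, q3, max(times))

print(summarize([1.0, 2.0, 3.0, 4.0, 5.0]))
```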


--
Fabien.


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2017-09-07 Thread Tomas Vondra
Hi,

Attached is an updated version of the patch, fixing the issues reported
by Adrien Nayrat, and also a bunch of issues pointed out by valgrind.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


0001-MCV-lists.patch.gz
Description: application/gzip


0002-multivariate-histograms.patch.gz
Description: application/gzip



Re: [HACKERS] path toward faster partition pruning

2017-09-07 Thread Robert Haas
On Thu, Sep 7, 2017 at 7:16 AM, Amit Langote
 wrote:
> There is a patch in the Ashutosh's posted series of patches, which does
> more or less the same thing that this patch does.  He included it in his
> series of patches, because, IIUC, the main partitionwise-join planning
> logic that one of the later patch implements depends on being able to
> consider applying that new planning technique individually for every
> partition sub-tree, instead of just at the whole tree root.
>
> One notable difference from his patch is that while his patch will expand
> in non-flattened manner even in the case where the parent is the result
> relation of a query, my patch doesn't in that case, because the new
> partition-pruning technique cannot be applied to inheritance parent that
> is a result relation, for example,
>
> update partitioned_table set ...
>
> And AFAICS, partitionwise-join cannot be applied to such a parent either.
>
> Note however that if there are other instances of the same partitioned
> table (in the FROM list of an update statement) or other partitioned
> tables in the query, they will be expanded in a non-flattened manner,
> because they are themselves not the result relations of the query.  So,
> the new partition-pruning and (supposedly) partitionwise-join can be
> applied for those other partitioned tables.

It seems to me that it would be better not to write new patches for
things that already have patches without a really clear explanation
of what's wrong with the already-existing patch; I don't see any
such explanation here.  Instead of writing your own patch for this to
duel with his, why not review his and help him correct any
deficiencies which you can spot?  Then we have one patch with more
review instead of two patches with less review both of which I have to
read and try to decide which is better.

In this case, I think Ashutosh has the right idea.  I think that
handling the result-relation and non-result-relation differently
creates an unpleasant asymmetry.  With your patch, we have to deal
with three cases: (a) partitioned tables that were expanded
level-by-level because they are not result relations, (b) partitioned
tables that were expanded "flattened" because they are result
relations, and (c) non-partitioned tables that were expanded
"flattened".  With Ashutosh's approach, we only have two cases to
worry about in the future rather than three, and I like that better.
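To spell out the difference between the two expansion schemes (a toy model in Python, not planner code; the partition names are illustrative): with flattened expansion every descendant's AppendRelInfo points at the table named in the query, while step-wise expansion preserves each child's immediate parent, keeping partition sub-trees visible.

```python
# Toy partition hierarchy: root has partitions p1 and p2; p1 is itself
# partitioned into p11 and p12.
TREE = {"root": ["p1", "p2"], "p1": ["p11", "p12"]}

def expand_flattened(tree, root):
    """Old behavior: each descendant gets (parent=root, child) -- the
    intermediate structure is lost."""
    pairs = []
    def walk(rel):
        for child in tree.get(rel, []):
            pairs.append((root, child))
            walk(child)
    walk(root)
    return pairs

def expand_stepwise(tree, root):
    """Step-wise behavior: each child points at its immediate parent, so
    partition sub-trees remain usable for partition-wise join/pruning."""
    pairs = []
    def walk(rel):
        for child in tree.get(rel, []):
            pairs.append((rel, child))
            walk(child)
    walk(root)
    return pairs

print(expand_flattened(TREE, "root"))
print(expand_stepwise(TREE, "root"))
```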

Your patch also appears to change things so that the table actually
referenced in the query ends up with an AppendRelInfo for the parent,
which seems pointless.  And it has no tests.

There are a couple of hunks from your patch that we might want or need
to incorporate into Ashutosh's patch.  The change to
relation_excluded_by_constraints() looks like it might be useful,
although it needs a better comment and some tests.  Also, Ashutosh's
patch has no equivalent of your change to add_paths_to_append_rel().
I'm not clear what the code you've introduced there is supposed to be
doing, and I'm suspicious that it is confusing "partition root" with
"table named in the query", which will often be the same but not
always; the user could have named an intermediate partition.  Can you
expand on what this is doing?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-09-07 Thread Pavel Stehule
> >
> > Hmm.  I think the core problem here is that we're trying to control
> > the plancache, which is a pretty much behind-the-scenes mechanism.
> > Except in the case of an explicit PREPARE, you can't even see from
> > SQL that the cache is being used, or when it's used.  So part of what
> > needs to be thought about, if we use the GUC approach, is when the
> > GUC's value is consulted.  If we don't do anything special then
> > the GUC(s) would be consulted when retrieving plans from the cache,
> > and changes in their values from one retrieval to the next might
> > cause funny behavior.  Maybe the relevant settings need to be captured
> > when the plancache entry is made ... not sure.
>
> What sort of funny behavior are you concerned about?  It seems likely
> to me that in most cases the GUC will have the same value every time
> through, but if it doesn't, I'm not sure why we'd need to use the old
> value rather than the current one.  Indeed, if the user changes the
> GUC from "force custom" to "force generic" and reruns the function, we
> want the new value to take effect, lest a POLA violation occur.
>

Good note - so changing this GUC at the session level requires resetting the
plan cache.

I am not against a GUC, and I am not against a PL/pgSQL #option. It's just
that, and I am repeating myself (I am sorry), these tools are not practical
for use in PL/pgSQL. There should be some block-level way to do this setting.
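The timing question Tom raises can be sketched with a toy model (not the real plancache; the GUC value names are hypothetical): if the setting is consulted on every fetch from the cache, a session-level change takes effect on the next execution, at the cost of implicitly abandoning a cached generic plan.

```python
class CachedPlanSource:
    """Toy model of a plan cache entry that consults the *current* GUC
    value on every fetch, rather than capturing it at entry creation."""
    def __init__(self, query):
        self.query = query
        self.generic_plan = None

    def get_plan(self, guc):
        if guc == "force_custom_plan":
            return f"custom plan for {self.query}"   # replanned each call
        if self.generic_plan is None:                # built and cached once
            self.generic_plan = f"generic plan for {self.query}"
        return self.generic_plan

src = CachedPlanSource("SELECT * FROM t WHERE x = $1")
p1 = src.get_plan("force_generic_plan")
p2 = src.get_plan("force_custom_plan")   # GUC changed mid-session
```

Capturing the GUC at entry-creation time instead would make `p2` still come from the generic plan, which is the "funny behavior" being debated.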

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] psql - add special variable to reflect the last query status

2017-09-07 Thread Pavel Stehule
Hi

2017-09-06 11:14 GMT+02:00 Fabien COELHO :

>
>> Here is a version 6.
>
> Small v7 update, sorry for the noise.
>
> Add testing the initial state of all variables.
>
> Fix typos in a comment in tests.
>
> Fix the documentation wrt the current implementation behavior.


I rechecked the last patch.

It applies without any issue, and all regression tests passed.

I checked performance - the fastest queries are executions of a simple
prepared statement:

prepare x as select 1;
-- 100 x
execute x;
execute x;
execute x;
execute x;

## patched
[pavel@nemesis postgresql]$ time psql -At -1 postgres -f ~/xxx.sql >
/dev/null

real 0m44,887s
user 0m11,703s
sys 0m6,942s

This is probably the worst case possible, and it does show some slowdown -
in this case about 10% -

but it is pretty atypical - each query took less than 50 microseconds.
When there is any IO activity or network usage, this patch doesn't
create any significant overhead.

I'll mark this patch as ready for committer

Regards

Pavel


>
>
> --
> Fabien.


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-07 Thread Robert Haas
On Tue, Sep 5, 2017 at 7:01 AM, Ashutosh Bapat
 wrote:
> accumulate_append_subpath() is executed for every path instead of
> every relation, so changing it would collect the same list multiple
> times. Instead, I found the old way of associating all intermediate
> partitions with the root partitioned relation work better. Here's the
> updated patch set.

When I tried out patch 0001, it crashed repeatedly during 'make check'
because of an assertion failure in get_partitioned_child_rels.  It
seemed to me that the way the patch was refactoring
expand_inherited_rtentry involved more code rearrangement than
necessary, so I reverted all the code rearrangement and just kept the
functional changes, and all the crashes went away.  (That refactoring
also failed to initialize has_child properly.)  In so doing, I
reintroduced the problem that the PartitionedChildRelInfo lists
weren't getting set up correctly, but after some thought I realized
that was just because expand_single_inheritance_child() was choosing
between adding an RTE and adding the OID to partitioned_child_rels,
whereas for an intermediate partitioned table it needs to do both.  So
I inserted a trivial fix for that problem (replacing "else" with a new
"if"-test), basically:

-else
+
+if (childrte->relkind == RELKIND_PARTITIONED_TABLE)

Please check out the attached version of the patch.  In addition to
the above simplifications, I did some adjustments to the comments in
various places - some just grammar and others a bit more substantive.
And I think I broke a long line in one place, too.

One thing I notice is that if I rip out the changes to initsplan.c,
the new regression test still passes.  If it's possible to write a
test that fails without those changes, I think it would be a good idea
to include one in the patch.  That's certainly one of the subtler
parts of this patch, IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


expand-stepwise-rmh.patch
Description: Binary data



Re: [HACKERS] assorted code cleanup

2017-09-07 Thread Tom Lane
Peter Eisentraut  writes:
> On 9/5/17 15:32, Tom Lane wrote:
>> I do agree with the idea that we should use the * notation in cases where
>> the reader might otherwise think that a plain function was being invoked,
>> ie I don't like
>> some_function_pointer(args);
>> Even if the compiler isn't confused, readers might be.  But in the case of
>> structname->pointerfield(args);
>> it's impossible to read that as a plain function call, so I'm okay with
>> dropping the extra punctuation there.

> Committed that way.  Thanks.

Is it worth memorializing this in the docs somewhere, perhaps
"53.4. Miscellaneous Coding Conventions" ?

regards, tom lane




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Tom Lane
Simon Riggs  writes:
> I would like to relax the restriction to allow this specific use case...
>   SET work_mem = X; SET max_parallel_workers = 4; SELECT ...
> so we still have only one command (the last select), yet we have
> multiple GUC settings beforehand.

On what basis do you claim that's only one command?  It would return
multiple CommandCompletes, for starters, so that it breaks the protocol
just as effectively as any other loosening.

Moreover, I imagine the semantics you really want is that the SETs only
apply for the duration of the command.  This wouldn't provide that
result either.

Haas' idea of some kind of syntactic extension, like "LET guc1 = x,
guc2 = y FOR statement" seems more feasible to me.  I'm not necessarily
wedded to that particular syntax, but I think it has to look like
a single-statement construct of some kind.

regards, tom lane




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Tom Lane
Simon Riggs  writes:
> On 5 September 2017 at 10:22, Tom Lane  wrote:
>> Does anyone want to do further review on this patch?  If so, I'll
>> set the CF entry back to "Needs Review".

> OK, I'll review Michael's patch (and confirm my patch is dead)

Not hearing anything, I already pushed my patch an hour or three ago.

regards, tom lane




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Simon Riggs
On 7 September 2017 at 11:07, Tom Lane  wrote:
> I wrote:
>> Yeah, it seems like we have now made this behavior official enough that
>> it's time to document it better.  My thought is to create a new subsection
>> in the FE/BE Protocol chapter that explains how multi-statement Query
>> messages are handled, and then to link to that from appropriate places
>> elsewhere.  If anyone thinks the reference section would be better put
>> somewhere else than Protocol, please say where.
>
> I've pushed up an attempt at this:
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b976499480bdbab6d69a11e47991febe53865adc
>
> Feel free to suggest improvements.

Not so much an improvement as a follow-on thought:

All of this applies to simple queries.

At present we restrict using multi-statement requests in extended
protocol, saying that we don't allow it because of a protocol
restriction. The precise restriction is that we can't return more than
one reply. The restriction is implemented via this test
    if (list_length(parsetree_list) > 1)
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("cannot insert multiple commands into a prepared statement")));
at line 1277 of exec_parse_message()
which is actually more restrictive than it needs to be.

I would like to relax the restriction to allow this specific use case...
  SET work_mem = X; SET max_parallel_workers = 4; SELECT ...
so we still have only one command (the last select), yet we have
multiple GUC settings beforehand.

Any reason to disallow that?
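A sketch of what the relaxed check might look like (simplified: statements are modeled as tag strings rather than parse nodes, so this is only illustrative of the rule, not of the real exec_parse_message() code):

```python
def parse_message_allowed(parsetree_list):
    """Relaxation of the test quoted above: permit a multi-statement
    string iff everything before the final statement is a SET, so the
    request still produces exactly one command reply."""
    if len(parsetree_list) <= 1:
        return True
    return all(stmt.upper().startswith("SET ")
               for stmt in parsetree_list[:-1])

assert parse_message_allowed(
    ["SET work_mem = 'X'", "SET max_parallel_workers = 4", "SELECT 1"])
assert not parse_message_allowed(["SELECT 1", "SELECT 2"])
```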

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Simon Riggs
On 5 September 2017 at 10:22, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Mon, Sep 4, 2017 at 11:15 PM, Tom Lane  wrote:
>>> I don't want to go there, and was thinking we should expand the new
>>> comment in DefineSavepoint to explain why not.
>
>> Okay.
>
> Does anyone want to do further review on this patch?  If so, I'll
> set the CF entry back to "Needs Review".

OK, I'll review Michael's patch (and confirm my patch is dead)

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Tom Lane
I wrote:
> Yeah, it seems like we have now made this behavior official enough that
> it's time to document it better.  My thought is to create a new subsection
> in the FE/BE Protocol chapter that explains how multi-statement Query
> messages are handled, and then to link to that from appropriate places
> elsewhere.  If anyone thinks the reference section would be better put
> somewhere else than Protocol, please say where.

I've pushed up an attempt at this:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b976499480bdbab6d69a11e47991febe53865adc

Feel free to suggest improvements.

regards, tom lane




Re: [HACKERS] assorted code cleanup

2017-09-07 Thread Peter Eisentraut
On 9/5/17 15:32, Tom Lane wrote:
> At one time there were C compilers that only accepted the former syntax.

Correct.  Explanation here: http://c-faq.com/ptrs/funccall.html

> I do agree with the idea that we should use the * notation in cases where
> the reader might otherwise think that a plain function was being invoked,
> ie I don't like
> 
>   some_function_pointer(args);
> 
> Even if the compiler isn't confused, readers might be.  But in the case of
> 
>   structname->pointerfield(args);
> 
> it's impossible to read that as a plain function call, so I'm okay with
> dropping the extra punctuation there.

Committed that way.  Thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list

2017-09-07 Thread Pavel Stehule
2017-09-07 19:48 GMT+02:00 Pavel Stehule :

>
>
> 2017-09-07 8:08 GMT+02:00 Anthony Bykov :
>
>> The following review has been posted through the commitfest application:
>> make installcheck-world:  not tested
>> Implements feature:   not tested
>> Spec compliant:   not tested
>> Documentation:not tested
>>
>> I'm afraid this patch conflicts with current master branch.
>>
>> The new status of this patch is: Waiting on Author
>>
>
> rebased
>

I am sorry - wrong patch

I am sending the correct patch

Regards

Pavel

>
>
>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>
>


plpgsql-into-multitarget-2.sql
Description: application/sql



Re: [HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list

2017-09-07 Thread Pavel Stehule
2017-09-07 8:08 GMT+02:00 Anthony Bykov :

> The following review has been posted through the commitfest application:
> make installcheck-world:  not tested
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
>
> I'm afraid this patch conflicts with current master branch.
>
> The new status of this patch is: Waiting on Author
>

rebased



>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index a637551..49327b2 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -409,6 +409,7 @@ do_compile(FunctionCallInfo fcinfo,
 			for (i = 0; i < numargs; i++)
 			{
 char		buf[32];
+char	   *argname = NULL;
 Oid			argtypeid = argtypes[i];
 char		argmode = argmodes ? argmodes[i] : PROARGMODE_IN;
 PLpgSQL_type *argdtype;
@@ -433,9 +434,12 @@ do_compile(FunctionCallInfo fcinfo,
 		   errmsg("PL/pgSQL functions cannot accept type %s",
   format_type_be(argtypeid;
 
+argname = (argnames && argnames[i][0] != 0) ? argnames[i] : NULL;
+
 /* Build variable and add to datum list */
-argvariable = plpgsql_build_variable(buf, 0,
-	 argdtype, false);
+argvariable = plpgsql_build_variable(argname != NULL ?
+		   argname : buf,
+		   0, argdtype, false);
 
 if (argvariable->dtype == PLPGSQL_DTYPE_VAR)
 {
@@ -461,9 +465,9 @@ do_compile(FunctionCallInfo fcinfo,
 add_parameter_name(argitemtype, argvariable->dno, buf);
 
 /* If there's a name for the argument, make an alias */
-if (argnames && argnames[i][0] != '\0')
+if (argname != '\0')
 	add_parameter_name(argitemtype, argvariable->dno,
-	   argnames[i]);
+	   argname);
 			}
 
 			/*
diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out
index 7ebbde6..a9fe736 100644
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -5979,3 +5979,15 @@ LINE 1: SELECT (SELECT string_agg(id || '=' || name, ',') FROM d)
^
 QUERY:  SELECT (SELECT string_agg(id || '=' || name, ',') FROM d)
 CONTEXT:  PL/pgSQL function alter_table_under_transition_tables_upd_func() line 3 at RAISE
+-- should fail -- message should to contain argument name
+CREATE TYPE ct AS (a int, b int);
+CREATE OR REPLACE FUNCTION fx(x ct)
+RETURNS void AS $$
+BEGIN
+  GET DIAGNOSTICS x = ROW_COUNT;
+end;
+$$ language plpgsql;
+ERROR:  "x" is not a scalar variable
+LINE 4:   GET DIAGNOSTICS x = ROW_COUNT;
+  ^
+DROP TYPE ct;
diff --git a/src/test/regress/sql/plpgsql.sql b/src/test/regress/sql/plpgsql.sql
index 60d1d38..4716ac0 100644
--- a/src/test/regress/sql/plpgsql.sql
+++ b/src/test/regress/sql/plpgsql.sql
@@ -4766,3 +4766,15 @@ ALTER TABLE alter_table_under_transition_tables
   DROP column name;
 UPDATE alter_table_under_transition_tables
   SET id = id;
+
+-- should fail -- message should to contain argument name
+CREATE TYPE ct AS (a int, b int);
+
+CREATE OR REPLACE FUNCTION fx(x ct)
+RETURNS void AS $$
+BEGIN
+  GET DIAGNOSTICS x = ROW_COUNT;
+end;
+$$ language plpgsql;
+
+DROP TYPE ct;



Re: [HACKERS] Secondary index access optimizations

2017-09-07 Thread Konstantin Knizhnik



On 07.09.2017 13:00, Thomas Munro wrote:

> On Sun, Sep 3, 2017 at 4:34 AM, Konstantin Knizhnik
>  wrote:
>
>> Thank you for review.
>> I attached new version of the patch with
>> remove_restrictions_implied_by_constraints() function.
>> Concerning failed tests - this is actually result of this optimization:
>> extra filter conditions are removed from query plans.
>> Sorry, I have not included updated version of expected test output files to
>> the patch.
>> Now I did it.
>
> A regression test under contrib/postgres_fdw now fails:
>
> - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T
> 1" WHERE (("C 1" IS NOT NULL))
> + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
>
> (("C 1" IS NOT NULL)) is indeed redundant in that case, because column
> "C 1" was declared to be NOT NULL.  But:
>
> 1.  Do we want to go this far?  Note that this is not involving
> inheritance and constraint exclusion.  I don't immediately see any
> reason why not, but I'm not sure.
>
> 2.  If yes, then this postgres_fdw test should be changed, because I
> think it was trying to demonstrate that IS NOT NULL expressions are
> sent to remote databases -- it would need to be changed so that it
> tries that with a column that is actually nullable.

I do not see any reason why we should disable this optimization in the case
of FDW.

And disabling it would require some extra effort...

So I have updated the test for postgres_fdw, replacing the query
 SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;
with
 SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;

Now it checks two things:
1. That the not-null check is passed to the foreign server for a nullable
column (c3).
2. That the not-null check is excluded from the query execution plan when it
can be omitted because the column is not nullable.
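The idea behind remove_restrictions_implied_by_constraints() can be sketched like this (a toy model with quals as (column, kind) pairs, not the actual implementation):

```python
def remove_implied_restrictions(quals, not_null_columns):
    """Drop "col IS NOT NULL" quals that a NOT NULL table constraint
    already proves, so neither the local executor nor a foreign server
    evaluates a provably-true filter."""
    kept = []
    for col, kind in quals:
        if kind == "IS NOT NULL" and col in not_null_columns:
            continue                     # implied by the constraint
        kept.append((col, kind))
    return kept

quals = [("c1", "IS NOT NULL"), ("c3", "IS NOT NULL")]
print(remove_implied_restrictions(quals, {"c1"}))
```

With c1 declared NOT NULL and c3 nullable, only the c3 check survives - exactly the split the updated postgres_fdw test exercises.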


Updated version of the patch is attached to this mail.
Also, I added support for the date type to operator_predicate_proof() to be
able to imply (logdate <= '2017-03-31') from (logdate < '2017-04-01').
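That proof relies on dates being discrete; a minimal sketch of the implication being checked (illustrative only, not the operator_predicate_proof() code):

```python
from datetime import date, timedelta

def lt_implies_le(lt_bound, le_bound):
    """For a discrete type, (x < lt_bound) means x <= lt_bound - 1 day,
    so it implies (x <= le_bound) whenever le_bound >= lt_bound - 1 day."""
    return le_bound >= lt_bound - timedelta(days=1)

# (logdate < '2017-04-01') implies (logdate <= '2017-03-31') ...
assert lt_implies_le(date(2017, 4, 1), date(2017, 3, 31))
# ... but not (logdate <= '2017-03-30').
assert not lt_implies_le(date(2017, 4, 1), date(2017, 3, 30))
```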


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c19b331..a9cce14 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 5f65d9d..2d816db 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 2d7e1d8..5de67ce 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -344,6 +344,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* 

Re: [HACKERS] logical decoding of two-phase transactions

2017-09-07 Thread Nikhil Sontakke
Hi,

FYI all, wanted to mention that I am working on an updated version of
the latest patch that I plan to submit to a later CF.

Regards,
Nikhils

On 14 May 2017 at 04:02, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> On 13 May 2017 at 22:22, Tom Lane  wrote:
>>
>> Apparently you are not testing against current HEAD.  That's been there
>> since d10c626de (a whole two days now ;-))
>
> Indeed, I was working on a more than two-day old antiquity. Unfortunately,
> it's even more complicated
> to apply this patch against the current HEAD, so I'll wait for a rebased
> version.



-- 
 Nikhil Sontakke   http://www.2ndQuadrant.com/
 PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services




Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-07 Thread Robert Haas
On Wed, Sep 6, 2017 at 9:42 PM, Amit Langote
 wrote:
> I too tend to think that any users who use this masking facility would
> know to expect to get these failures on upgraded clusters with invalid
> pd_lower in meta pages.

I strongly disagree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Tuple-routing for certain partitioned tables not working as expected

2017-09-07 Thread Robert Haas
On Thu, Sep 7, 2017 at 5:13 AM, Etsuro Fujita
 wrote:
> On 2017/08/30 17:20, Etsuro Fujita wrote:
>> Here is a patch to skip the CheckValidResultRel checks for a target table
>> if it's a foreign partition to perform tuple-routing for, as proposed by
>> Robert.
>
> In the patch, to skip the checks, I passed to CheckValidResultRel a new flag
> indicating whether the target relation is a partition to do tuple-routing
> for, but while updating another patch for tuple-routing for foreign
> partitions in PG11, I noticed that it would be better to pass ResultRelInfo
> than that flag.  The reason is: (1) we can see whether the relation is such
> a partition by looking at the ResultRelInfo's ri_PartitionRoot, instead of
> that flag, and (2) this is for tuple-routing for foreign partitions, but we
> could extend that function with that argument to do the checks without
> throwing an error and save the result in a new member of ResultRelInfo (eg,
> ri_PartitionIsValid) so that we can use that info to determine in ExecInsert
> whether a foreign partition chosen by ExecFindPartition is valid for
> tuple-routing.  So, I updated the patch that way.  Attached is an updated
> version of the patch.

OK, committed to master and v10.  I am less convinced than you that
this was a must-fix issue, but it's not a very invasive change the way
you did it here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: pg_rewind to skip config files

2017-09-07 Thread Chris Travers
On Thu, Sep 7, 2017 at 2:47 PM, Alvaro Herrera 
wrote:

> After reading this discussion, I agree that pg_rewind needs to become
> smarter in order to be fully useful in production environments; right
> now there are many warts and corner cases that did not seem to have been
> considered during the initial development (which I think is all right,
> taking into account that its mere concept was complicated enough; so we
> need not put any blame on the original developers, rather the contrary).
>

Agreed with this assessment.  And as a solution to the problem of "base
backups take too long to take and transfer", the approach and its corner
cases make a lot of sense.

>
> I think we need to make the program simple to use (i.e. not have the
> user write shell globs for the common log file naming patterns) while
> remaining powerful (i.e. not forcibly copy any files that do not match
> hardcoded globs).


I would add that well-defined tasks are a key aspect of powerful software
in my view and here the well defined task is to restore data states to a
particular usable timeline point taken from another system.  If that is
handled well, that opens up new uses and makes some problems that are
difficult right now much easier to solve.


>   Is the current dry-run mode enough to give the user
> peace of mind regarding what would be done in terms of testing globs
> etc?  If not, maybe the debug mode needs to be improved (for instance,
> have it report the file size for each file that would be copied;
> otherwise you may not notice it's going to copy the 20GB log file until
> it's too late ...)
>

The dry run facility solves one problem in one circumstance, namely a
manually invoked run of the software along with the question of "will this
actually re-wind?"  I suppose software developers might be able to use it
to backup and restore things that are to be clobbered (but is anyone likely
to on the software development side?).  I don't see anything in that corner
that can be improved without over engineering the solution.

There are two reasons I am skeptical that a dry-run mode will ever be
"enough."

The first is that pg_rewind is often integrated into auto-failover/failback
tools, and the chance of running it in a dry-run mode before it is
automatically triggered is more or less nil.  These are also the cases
where you won't notice it does something bad until much later.

The second is that there are at least some corner cases we may need to
define as outside the responsibility of pg_rewind.  The one that comes to
mind here is if I am rewinding back past the creation of a small table.  I
don't see an easy or safe way to address that from inside pg_rewind without
a lot of complication.  It might be better to have a dedicated tool for
that.


>
> Now, in order for any of this to happen, there will need to be a
> champion to define what the missing pieces are, write all those patches
> and see them through the usual (cumbersome, slow) procedure.  Are you,
> Chris, willing to take the lead on that?
>

 Yeah. I think the first step would be a list of the corner cases and a
proposal for how I think it should work.  Then maybe a roadmap of patches,
and then submitting them as they become complete.


> --
> Álvaro Herrera    https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] 【ECPG】strncpy function does not set the end character '\0'

2017-09-07 Thread Michael Meskes
> Oh.  If there's actually a standard somewhere that says it's not
> null-terminated, then code that is expecting it to be so is just
> wrong.  No need to change anything in ecpg, IMO.

As I said I haven't checked if this detail is actually in there, but I
guess it was because there is a reason why we, and other databases,
define it that way.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL




Re: [HACKERS] 【ECPG】strncpy function does not set the end character '\0'

2017-09-07 Thread Tom Lane
Michael Meskes  writes:
> With "supposed" I was referring to the standard that defines SQLCA.

Oh.  If there's actually a standard somewhere that says it's not
null-terminated, then code that is expecting it to be so is just
wrong.  No need to change anything in ecpg, IMO.

regards, tom lane




Re: [HACKERS] 【ECPG】strncpy function does not set the end character '\0'

2017-09-07 Thread Michael Meskes
> > Why do you think there should be one? My memory might be wrong but
> > I
> > don't think it's supposed to be a null terminated string.
> 
> That field is defined as char[5] in struct sqlca_t, so the intent is
> clearly that it not be null terminated.  However, it looks to me like
> there'd be at least one alignment-padding byte after it, and that
> byte
> is likely to be 0 in a lot of situations (particularly for statically
> allocated sqlca_t's).  So a lot of the time, you could get away with
> using strcmp() or other functions that expect null termination.

With "supposed" I was referring to the standard that defines SQLCA.

> I'm thinking therefore that there's probably code out there that
> tries
> to do strcmp(sqlca->sqlstate, "22000") or suchlike, and it works
> often
> enough that the authors haven't identified their bug.  The question
> is
> do we want to try to make that be valid code.
> 
> Changing the field declaration to char[5+1] would be easy enough, but
> I have no idea how many places in ecpglib would need to change to
> make
> sure that the last byte gets set to 0.

I doubt it'll be a lot. However, it would make us differ, albeit very
slightly, from what others do. I haven't come up with a practical
problem coming from that difference though.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL




Re: [HACKERS] WIP: Aggregation push-down

2017-09-07 Thread Merlin Moncure
On Thu, Aug 17, 2017 at 10:22 AM, Antonin Houska  wrote:
> Antonin Houska  wrote:
> output type. For other aggregates (like avg()) the remote nodes will have to
> return the transient state value in an appropriate form (maybe bytea type),
> which does not depend on PG version.

Hm, that seems like an awful lot of work (new version agnostic
serialization format) for very little benefit (version independent
type serialization for remote aggregate pushdown).  How about forcing
the version to be the same for the feature to be used?

merlin




Re: [HACKERS] advanced partition matching algorithm for partition-wise join

2017-09-07 Thread Antonin Houska
Ashutosh Bapat  wrote:

> I have fixed the issues which were marked as TODOs in the attached
> patches. Also, I have included your test change patch in my series of
> patches.

I've noticed that partition_bounds_merge() is called twice from
make_join_rel():

 * build_join_rel -> build_joinrel_partition_info -> partition_bounds_merge

 * try_partition_wise_join -> partition_bounds_merge

Is this intentional, or just a thinko?

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at




Re: [HACKERS] log_destination=file

2017-09-07 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On 31 August 2017 at 14:49, Tom Lane  wrote:
>> pgbench with log_statement = all would be a pretty easy test case.

> It seems that for this particular workload it was about 20-25% slower.

Ouch.  That seems like rather a large hit :-(.  I actually expected
it to be close to negligible, but I don't think 20% qualifies.

regards, tom lane




Re: [HACKERS] Create replication slot in pg_basebackup if requested and not yet present

2017-09-07 Thread Michael Banck
Hi,

not directly related to the topic, but:

On Tue, Mar 21, 2017 at 03:34:00PM -0400, Robert Haas wrote:
> For example, somebody creates a replica using the new super-easy
> method, and then blows it away without dropping the slot from the
> master, 

Just a thought, but maybe there should be some pg_dropstandby command or
similar, which cleans everything (what exactly?) up? That might go some
way to ensure this does not happen.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-09-07 Thread Tom Lane
Catalin Iacob  writes:
> On Mon, Sep 4, 2017 at 4:15 PM, Tom Lane  wrote:
>> Also, the main thing that we need xact.c's involvement for in the first
>> place is the fact that implicit transaction blocks, unlike regular ones,
>> auto-cancel on an error, leaving you outside a block not inside a failed
>> one.  So I don't exactly see how savepoints would fit into that.

> I think this hits the nail on the head and should have a place in the
> official docs as I now realize I didn't grasp this distinction before
> I read this.

Yeah, it seems like we have now made this behavior official enough that
it's time to document it better.  My thought is to create a new subsection
in the FE/BE Protocol chapter that explains how multi-statement Query
messages are handled, and then to link to that from appropriate places
elsewhere.  If anyone thinks the reference section would be better put
somewhere else than Protocol, please say where.

regards, tom lane




Re: [HACKERS] log_destination=file

2017-09-07 Thread Dmitry Dolgov
Hi

> On 31 August 2017 at 14:49, Tom Lane  wrote:
>> Are you actually asking for a benchmark of if logging gets slower?
>
> Yes.
>
>> If so,
>> could you suggest a workload to make an actual benchmark of it (where
>> logging would be high enough that it could become a bottleneck -- and
>> not writing the log data to disk, but the actual logging). I'm not sure
>> what a good one would be.
>
> pgbench with log_statement = all would be a pretty easy test case.

This part of the discussion caught my attention, and I tried to perform such
easy test. I was doing:

pgbench -S -j2 -c${clients} -T500 test

with `log_statement=all` and `log_destination=stderr`, which I assume should
be enough to get some approximate numbers.

scaling factor: 100
average latency (ms):

clients    patch     master
10         1.827     1.456
20         4.027     3.300
30         6.284     4.921
40         8.409     6.767
50         10.985    8.646

It seems that for this particular workload it was about 20-25% slower.


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-09-07 Thread Alexey Chernyshov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

One thing I have noticed is a trailing whitespace after "bogus one":

123 +* If we don't have to fetch the tuple, just return a
124 +* bogus one 
125 +*/

The new status of this patch is: Ready for Committer



Re: [HACKERS] Restricting maximum keep segments by repslots

2017-09-07 Thread Kyotaro HORIGUCHI
Hello,

At Thu, 07 Sep 2017 14:12:12 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170907.141212.227032666.horiguchi.kyot...@lab.ntt.co.jp>
> > I would like a flag in pg_replication_slots, and possibly also a
> > numerical column that indicates how far away from the critical point
> > each slot is.  That would be great for a monitoring system.
> 
> Great! I'll do that right now.

Done.

In the attached patch on top of the previous patch, I added two
columns in pg_replication_slots, "live" and "distance". The first
indicates whether the slot will "live" past the next checkpoint. The
second shows how many bytes the checkpoint LSN can advance before
the slot will "die", or how many bytes the slot has lost after
"death".


Setting wal_keep_segments = 1 and max_slot_wal_keep_size = 16MB.

=# select slot_name, restart_lsn, pg_current_wal_lsn(), live, distance from 
pg_replication_slots;

 slot_name | restart_lsn | pg_current_wal_lsn | live | distance
-----------+-------------+--------------------+------+-----------
 s1        | 0/162D388   | 0/162D3C0          | t    | 0/29D2CE8

This shows that checkpoint can advance 0x29d2ce8 bytes before the
slot will die even if the connection stalls.

 s1        | 0/4001180   | 0/6FFF2B8          | t    | 0/DB8

Just before the slot loses sync.

 s1        | 0/4001180   | 0/70008A8          | f    | 0/FFEE80

The checkpoint after this removes some required segments.

2017-09-07 19:04:07.677 JST [13720] WARNING:  restart LSN of replication slots 
is ignored by checkpoint
2017-09-07 19:04:07.677 JST [13720] DETAIL:  Some replication slots have lost 
required WAL segments to continue, by up to 1 segment.

If max_slot_wal_keep_size is not set (0), "live" is always true and
"distance" is NULL.

 slot_name | restart_lsn | pg_current_wal_lsn | live | distance
-----------+-------------+--------------------+------+----------
 s1        | 0/4001180   | 0/73117A8          | t    |



- The name (or its content) of the new columns should be arguable.

- pg_replication_slots view takes LWLock on ControlFile and
  spinlock on XLogCtl for every slot. But seems difficult to
  reduce it..

- distance seems mistakenly to become 0/0 under certain conditions..

- The result seems almost right, but a more precise check is needed.
  (Anyway it cannot be perfectly exact.)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From ac47e250cf88b1279556e27c33e9f29806fdc04d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 7 Sep 2017 19:13:22 +0900
Subject: [PATCH 2/2] Add monitoring aid for max_replication_slots.

Adds two columns "live" and "distance" to pg_replication_slots.
With max_slot_wal_keep_size set, long-disconnected slots may lose sync.
The two columns show how long a slot can live on, or how many bytes a
slot has lost, if max_slot_wal_keep_size is set.
---
 src/backend/access/transam/xlog.c| 80 
 src/backend/catalog/system_views.sql |  4 +-
 src/backend/replication/slotfuncs.c  | 16 +++-
 src/include/access/xlog.h|  1 +
 src/include/catalog/pg_proc.h|  2 +-
 5 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ae70d7d..c4c8307 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9324,6 +9324,86 @@ CreateRestartPoint(int flags)
 }
 
 /*
+ * Check if the record on the given lsn will be preserved at the next
+ * checkpoint.
+ *
+ * Returns true if it will be preserved. If distance is given, it is set to
+ * the distance from origin to the beginning of the first segment kept at
+ * the next checkpoint. It means the margin when this function returns true
+ * and the gap of lost records when false.
+ *
+ * This function should return a result consistent with KeepLogSeg.
+ */
+bool
+GetMarginToSlotSegmentLimit(XLogRecPtr restartLSN, uint64 *distance)
+{
+	XLogRecPtr currpos;
+	XLogRecPtr tailpos;
+	uint64 currSeg;
+	uint64 restByteInSeg;
+	uint64 restartSeg;
+	uint64 tailSeg;
+	uint64 keepSegs;
+
+	currpos = GetXLogWriteRecPtr();
+
+	LWLockAcquire(ControlFileLock, LW_SHARED);
+	tailpos = ControlFile->checkPointCopy.redo;
+	LWLockRelease(ControlFileLock);
+
+	/* Move the pointer to the beginning of the segment */
+	XLByteToSeg(currpos, currSeg);
+	XLByteToSeg(restartLSN, restartSeg);
+	XLByteToSeg(tailpos, tailSeg);
+	restByteInSeg = 0;
+
+	Assert(wal_keep_segments >= 0);
+	Assert(max_slot_wal_keep_size_mb >= 0);
+
+	/*
+	 * WAL is removed in units of segments.
+	 */
+	keepSegs = wal_keep_segments + ConvertToXSegs(max_slot_wal_keep_size_mb);
+
+	/*
+	 * If the latest checkpoint's redo point is older than the current head
+	 * minus keep segments, the next checkpoint keeps the redo point's
+	 * segment. Elsewise use current head minus number of segments to keep.
+	 */
+	if (currSeg < tailSeg + keepSegs)
+	{
+		if (currSeg < keepSegs)
+			

Re: [HACKERS] Adding support for Default partition in partitioning

2017-09-07 Thread Ashutosh Bapat
On Wed, Sep 6, 2017 at 5:50 PM, Jeevan Ladhe
 wrote:
>
>>
>> I am wondering whether we could avoid call to get_default_partition_oid()
>> in
>> the above block, thus avoiding a sys cache lookup. The sys cache search
>> shouldn't be expensive since the cache should already have that entry, but
>> still if we can avoid it, we save some CPU cycles. The default partition
>> OID is
>> stored in pg_partition_table catalog, which is looked up in
>> RelationGetPartitionKey(), a function which precedes
>> RelationGetPartitionDesc()
>> everywhere. What if that RelationGetPartitionKey() also returns the
>> default
>> partition OID and the common caller passes it to
>> RelationGetPartitionDesc()?.
>
>
> The purpose here is to cross check the relid with partdefid stored in
> catalog
> pg_partitioned_table, though its going to be the same in the parents cache,
> I
> think its better that we retrieve it from the catalog syscache.
> Further, RelationGetPartitionKey() is a macro and not a function, so
> modifying
> the existing simple macro for this reason does not sound a good idea to me.
> Having said this I am open to ideas here.

Sorry, I meant RelationBuildPartitionKey() and
RelationBuildPartitionDesc() instead of RelationGetPartitionKey() and
RelationGetPartitionDesc() resp.

>
>>
>> +/* A partition cannot be attached if there exists a default partition
>> */
>> +defaultPartOid = get_default_partition_oid(RelationGetRelid(rel));
>> +if (OidIsValid(defaultPartOid))
>> +ereport(ERROR,
>> +(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
>> + errmsg("cannot attach a new partition to table
>> \"%s\" having a default partition",
>> RelationGetRelationName(rel)));
>> get_default_partition_oid() searches the catalogs, which is not needed
>> when we
>> have relation descriptor of the partitioned table (to which a new
>> partition is
>> being attached). You should get the default partition OID from partition
>> descriptor. That will be cheaper.
>
>
> Something like following can be done here:
> /* A partition cannot be attached if there exists a default partition */
> if (partition_bound_has_default(rel->partdesc->boundinfo))
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
>  errmsg("cannot attach a new partition to table \"%s\"
> having a default partition",
> RelationGetRelationName(rel)));
>
> But, partition_bound_has_default() is defined in partition.c and not in
> partition.h. This is done that way because boundinfo is not available in
> partition.h. Further, this piece of code is removed in next patch where we
> extend default partition support to add/attach partition even when default
> partition exists. So, I don't see much of a correctness issue here.

If the code is being removed, I don't think we should sweat too much
about it. Sorry for the noise.

>
> Another way to get around this is, we can define another version of
> get_default_partition_oid() something like
> get_default_partition_oid_from_parent_rel()
> in partition.c, which looks in the relcache instead of the catalog and
> returns the OID of the default partition; or get_default_partition_oid()
> accepts both the parent OID and the parent Relation rel: if rel is not
> NULL, look into the relcache and return, else search the catalog using
> the OID.

I think we should define a function to return default partition OID
from partition descriptor and make it extern. Define a wrapper which
accepts Relation and returns calls this function to get default
partition OID from partition descriptor. The wrapper will be called
only on an open Relation, wherever it's available.


>
>> I haven't gone through the full patch yet, so there may be more
>> comments here. There is some duplication of code in
>> check_default_allows_bound() and ValidatePartitionConstraints() to
>> scan the children of partition being validated. The difference is that
>> the first one scans the relations in the same function and the second
>> adds them to work queue. May be we could use
>> ValidatePartitionConstraints() to scan the relation or add to the
>> queue based on some input flag may be wqueue argument itself. But I
>> haven't thought through this completely. Any thoughts?
>
>
> check_default_allows_bound() is called only from DefineRelation(),
> and not for alter command. I am not really sure how can we use
> work queue for create command.


No, we shouldn't use work queue for CREATE command. We should extract
the common code into a function to be called from
check_default_allows_bound() and ValidatePartitionConstraints(). To
that function we pass a flag (or the work queue argument itself),
which decides whether to add a work queue item or scan the relation
directly.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Proposal: pg_rewind to skip config files

2017-09-07 Thread Alvaro Herrera
After reading this discussion, I agree that pg_rewind needs to become
smarter in order to be fully useful in production environments; right
now there are many warts and corner cases that did not seem to have been
considered during the initial development (which I think is all right,
taking into account that its mere concept was complicated enough; so we
need not put any blame on the original developers, rather the contrary).

I think we need to make the program simple to use (i.e. not have the
user write shell globs for the common log file naming patterns) while
remaining powerful (i.e. not forcibly copy any files that do not match
hardcoded globs).  Is the current dry-run mode enough to give the user
peace of mind regarding what would be done in terms of testing globs
etc?  If not, maybe the debug mode needs to be improved (for instance,
have it report the file size for each file that would be copied;
otherwise you may not notice it's going to copy the 20GB log file until
it's too late ...)

Now, in order for any of this to happen, there will need to be a
champion to define what the missing pieces are, write all those patches
and see them through the usual (cumbersome, slow) procedure.  Are you,
Chris, willing to take the lead on that?

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-07 Thread Ashutosh Bapat
On Thu, Sep 7, 2017 at 4:32 PM, Antonin Houska  wrote:
> Ashutosh Bapat  wrote:
>
>> On Fri, Sep 1, 2017 at 6:05 PM, Antonin Houska  wrote:
>> >
>> >
>> >
>> > * get_partitioned_child_rels_for_join()
>> >
>> > I think the Assert() statement is easier to understand inside the loop, see
>> > the assert.diff attachment.
>
>> The assert at the end of function also checks that we have got
>> child_rels lists for all the parents passed in.
>
> Really? I can imagine that some instances of PartitionedChildRelInfo have the
> child_rels list empty, while other ones have these lists long enough to
> compensate for the empty lists.
>

That isn't true. Each child_rels list will at least have one entry.
Please see get_partitioned_child_rels().

>> >
>> >
>> > * have_partkey_equi_join()
>> >
>> > As the function handles generic join, this comment doesn't seem to me
>> > relevant:
>> >
>> > /*
>> >  * The equi-join between partition keys is strict if equi-join between
>> >  * at least one partition key is using a strict operator. See
>> >  * explanation about outer join reordering identity 3 in
>> >  * optimizer/README
>> >  */
>> > strict_op = op_strict(opexpr->opno);
>>
>> What in that comment is not exactly relevant?
>
> Basically I don't understand why you mention join reordering here. The join
> ordering questions must all have been resolved by the time
> have_partkey_equi_join() is called.

I am referring to a particular section in README which talks about the
relation between strict operator and legal join order.

>
>> >
>> > And I think the function can return true even if strict_op is false for all
>> > the operators evaluated in the loop.
>>
>> I think it does that. Do you have a case where it doesn't?
>
> Here I refer to this part of the comment above:
>
> "... if equi-join between at least one partition key is using a strict
> operator."
>
> My understanding of the code (especially match_expr_to_partition_keys) is that
> no operator actually needs to be strict as long as each operator involved in
> the join matches at least one non-nullable expression on both sides of the
> join.

I don't think so. A strict operator returns NULL when either of the
inputs is NULL. We cannot say that for non-strict operators, which may
deem NULL and non-NULL arguments equal, even though that looks
insane.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] path toward faster partition pruning

2017-09-07 Thread Amit Langote
Forgot to mention a couple of important points about the relation of some
of the patches here to the patches and discussion at the
partitionwise-join thread [1].

On 2017/09/06 19:38, Amit Langote wrote:
> [PATCH 1/5] Expand partitioned inheritance in a non-flattened manner
> 
> This will allow us to perform scan and join planning in a per partition
> sub-tree manner, with each sub-tree's root getting its own RelOptInfo.
> Previously, only the root of the whole partition tree would get a
> RelOptInfo, along with the leaf partitions, with each leaf partition's
> AppendRelInfo pointing to the former as its parent.
> 
> This is essential, because we will be doing partition-pruning for every
> partitioned table in the tree by matching query's scan keys with its
> partition key.  We won't be able to do that if the intermediate
> partitioned tables didn't have a RelOptInfo.

There is a patch in the Ashutosh's posted series of patches, which does
more or less the same thing that this patch does.  He included it in his
series of patches, because, IIUC, the main partitionwise-join planning
logic that one of the later patch implements depends on being able to
consider applying that new planning technique individually for every
partition sub-tree, instead of just at the whole tree root.

One notable difference from his patch is that while his patch will expand
in non-flattened manner even in the case where the parent is the result
relation of a query, my patch doesn't in that case, because the new
partition-pruning technique cannot be applied to inheritance parent that
is a result relation, for example,

update partitioned_table set ...

And AFAICS, partitionwise-join cannot be applied to such a parent either.

Note however that if there are other instances of the same partitioned
table (in the FROM list of an update statement) or other partitioned
tables in the query, they will be expanded in a non-flattened manner,
because they are themselves not the result relations of the query.  So,
the new partition-pruning and (supposedly) partitionwise-join can be
applied for those other partitioned tables.

> [PATCH 2/5] WIP: planner-side changes for partition-pruni[...]
> 
> Also, it adds certain fields to RelOptInfos of the partitioned tables that
> reflect its partitioning properties.

There is something called PartitionScheme, a notion that one of
Ashutosh's patches invented, which this patch incorporates as one of the new
fields in RelOptInfo that I mentioned above (also a list of
PartitionSchemes in the planner-global PlannerInfo).  Although
PartitionScheme is not significant for the task of partition-pruning
itself, it's still useful.  On Ashutosh's suggestion, I adopted the same
in my patch series, so that the partition-wise join patch doesn't end up
conflicting with the partition-pruning patch while trying to implement the
same and can get straight to the task of implementing partition-wise joins.

The same patch in the partition-wise join patch series that introduces
PartitionScheme also introduces a field in RelOptInfo called
partexprs, which records the partition key expressions.  Since
partition-pruning has use for it as well, I incorporated it here, in a
format that Ashutosh's partition-wise join patch can use directly,
instead of the latter having to rework it to make it suitable to store
partition key expressions of joinrels.  Again, that's to minimize
conflicts and let his patch just find the field to use as is, instead of
implementing it first.

Lastly, this patch introduces a PartitionAppendInfo in a partitioned
table's RelOptInfo that stores AppendRelInfos of the partitions (child
relations) that survive partition-pruning, which serves to identify those
partitions' RelOptInfos.  Along with the identities of surviving
partitions, it also stores the partitioning configuration of the
partitioned table after partitions are pruned.  That includes
partdesc->boundinfo (which is simply a pointer into the table's relcache)
and a few other fields that are set by partition-pruning code, such as
min_datum_index, max_datum_index, null_partition_chosen, that describe the
result after pruning.  So, for two partitioned tables being joined, if the
boundinfos match per partition_bounds_equal() and these other fields
match, they can be safely partition-wise joined.
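For example (a minimal sketch, not taken from the patch's regression
tests), two range-partitioned tables whose boundinfos match per
partition_bounds_equal() even after the same partitions are pruned on
both sides remain candidates for a partition-wise join:

```sql
CREATE TABLE t1 (a int) PARTITION BY RANGE (a);
CREATE TABLE t1_p1 PARTITION OF t1 FOR VALUES FROM (0) TO (100);
CREATE TABLE t1_p2 PARTITION OF t1 FOR VALUES FROM (100) TO (200);

CREATE TABLE t2 (a int) PARTITION BY RANGE (a);
CREATE TABLE t2_p1 PARTITION OF t2 FOR VALUES FROM (0) TO (100);
CREATE TABLE t2_p2 PARTITION OF t2 FOR VALUES FROM (100) TO (200);

SET enable_partition_wise_join TO on;

-- pruning removes t1_p2 and t2_p2 on both sides; the surviving
-- bounds still match, so t1_p1 can be joined directly to t2_p1
EXPLAIN SELECT * FROM t1 JOIN t2 USING (a) WHERE a < 50;
```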

[1]
https://www.postgresql.org/message-id/CAFjFpRfRDhWp%3DoguNjyzN%3DNMoOD%2BRCC3wS%2Bb%2BxbGKwKUk0dRKg%40mail.gmail.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Aggregation push-down

2017-09-07 Thread Ashutosh Bapat
On Thu, Aug 17, 2017 at 8:52 PM, Antonin Houska  wrote:
> Antonin Houska  wrote:
>
>> Antonin Houska  wrote:
>>
>> > This is a new version of the patch I presented in [1].
>>
>> Rebased.
>>
>> cat .git/refs/heads/master
>> b9a3ef55b253d885081c2d0e9dc45802cab71c7b
>
> This is another version of the patch.
>
> Besides other changes, it enables the aggregation push-down for postgres_fdw,
> although only for aggregates whose transient aggregation state is equal to the
> output type. For other aggregates (like avg()) the remote nodes will have to
> return the transient state value in an appropriate form (maybe bytea type),
> which does not depend on PG version.

Having the transient aggregation state type same as the output type doesn't
mean that we can feed the output of remote aggregation as is to the
finalization stage locally, or that finalization is not needed at all. I am
not sure why that test is being used to decide whether or not to push
an aggregate (I assume they are partial aggregates) to the foreign
server. postgres_fdw doesn't have a feature to fetch partial aggregate
results from the foreign server. Until we do that, I don't think this
patch can push any partial aggregates down to the foreign server.
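A simple way to see the problem (an illustrative sketch, not code from
the patch; the shard table names are made up): count(*) has a
transition state of the same type as its output (bigint), yet per-server
partial results still cannot be combined by re-applying the aggregate;
they need the proper combine step, which for count is sum:

```sql
-- Re-counting the two shard results would yield 2 (the number of
-- shards), not the total row count; the partial counts have to be
-- summed instead
SELECT sum(shard_cnt) AS total_rows
FROM (SELECT count(*) AS shard_cnt FROM shard1_tab
      UNION ALL
      SELECT count(*) AS shard_cnt FROM shard2_tab) s;
```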

>
> shard.tgz demonstrates the typical postgres_fdw use case. One query shows base
> scans of base relation's partitions being pushed to shard nodes, the other
> pushes down a join and performs aggregation of the join result on the remote
> node. Of course, the query can only reference one particular partition, until
> the "partition-wise join" [1] patch gets committed and merged with this
> patch of mine.

Right. Until then joins will not have children.

>
> One thing I'm not sure about is whether the patch should remove
> GetForeignUpperPaths function from FdwRoutine, which it essentially makes
> obsolete. Or should it only be deprecated so far? I understand that
> deprecation is important for "ordinary SQL users", but FdwRoutine is an
> interface for extension developers, so the policy might be different.

I doubt if that's correct. We can do that only when we know that all
kinds of aggregates/grouping can be pushed down the join tree, which
looks impossible to me. Apart from that, FdwRoutine handles all
kinds of upper relations (windowing, distinct, ordered) which are
not pushed down by this set of patches.

Here's review of the patchset
The patches compile cleanly when applied together.

Testcase "limit" fails in make check.

This is a pretty large patchset and the patches are divided by stages in the
planner rather than by functionality. IOW, no single patch provides a complete
piece of functionality. Reviewing and probably committing such patches is not
easy and may take quite long. Instead, if the patches are divided by
functionality, i.e. each patch provides some functionality completely, it will
be easier to review and commit such patches, with each commit giving something
useful. I had suggested breaking the patches along those lines in [1]. The
patches also refactor existing code. It's better to move such refactoring
into a patch or patches by itself.

The patches need more comments, explaining various decisions e.g.
 if (joinrel)
 {
 /* Create GatherPaths for any useful partial paths for rel */
-generate_gather_paths(root, joinrel);
+generate_gather_paths(root, joinrel, false);

 /* Find and save the cheapest paths for this joinrel */
 set_cheapest(joinrel);
For base relations, the patch calls generate_gather_paths() with grouped as
both true and false, but for join relations it doesn't do that. There is no
comment explaining why.
OR
in the following code
+/*
+ * Do partial aggregation at base relation level if the relation is
+ * eligible for it.
+ */
+if (rel->gpi != NULL)
+create_grouped_path(root, rel, path, false, true, AGG_HASHED);
Why not to consider AGG_SORTED paths?

The patch maintains an extra rel target and two extra pathlists (a partial
and a non-partial grouped pathlist) in RelOptInfo. These two extra
pathlists require a "grouped" argument to be passed to a number of functions.
Instead, it probably makes sense to create a RelOptInfo for aggregation/grouping
and set pathlist and partial_pathlist in that RelOptInfo. That will also make a
place for keeping grouped estimates.

For placeholders we have two function add_placeholders_to_base_rels() and
add_placeholders_to_joinrel(). We have add_grouping_info_to_base_rels(), but do
not have corresponding function for joinrels. How do we push aggregates
involving two or more relations to the corresponding joinrels?

Some minor assorted comments
Avoid useless hunk.
@@ -279,8 +301,8 @@ add_paths_to_joinrel(PlannerInfo *root,
  * joins, because there may be no other alternative.
  */
 if (enable_hashjoin || jointype == JOIN_FULL)
-hash_inner_and_outer(root, joinrel, outerrel, innerrel,
- 

Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-07 Thread Antonin Houska
Ashutosh Bapat  wrote:

> On Fri, Sep 1, 2017 at 6:05 PM, Antonin Houska  wrote:
> >
> >
> >
> > * get_partitioned_child_rels_for_join()
> >
> > I think the Assert() statement is easier to understand inside the loop, see
> > the assert.diff attachment.

> The assert at the end of function also checks that we have got
> child_rels lists for all the parents passed in.

Really? I can imagine that some instances of PartitionedChildRelInfo have the
child_rels list empty, while other ones have these lists long enough to
compensate for the empty lists.

> >
> >
> > * have_partkey_equi_join()
> >
> > As the function handles generic join, this comment doesn't seem to me
> > relevant:
> >
> > /*
> >  * The equi-join between partition keys is strict if equi-join between
> >  * at least one partition key is using a strict operator. See
> >  * explanation about outer join reordering identity 3 in
> >  * optimizer/README
> >  */
> > strict_op = op_strict(opexpr->opno);
> 
> What in that comment is not exactly relevant?

Basically I don't understand why you mention join reordering here. The join
ordering questions must all have been resolved by the time
have_partkey_equi_join() is called.

> >
> > And I think the function can return true even if strict_op is false for all
> > the operators evaluated in the loop.
> 
> I think it does that. Do you have a case where it doesn't?

Here I refer to this part of the comment above:

"... if equi-join between at least one partition key is using a strict
operator."

My understanding of the code (especially match_expr_to_partition_keys) is that
no operator actually needs to be strict as long as each operator involved in
the join matches at least one non-nullable expression on both sides of the
join.
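To illustrate why strictness matters at all (a hand-written sketch, not
from the patch): with a non-strict comparison, a NULL partition key
stored in one partition can match a NULL stored in a differently-bounded
partition on the other side, so joining strictly partition-by-partition
would give wrong answers:

```sql
CREATE TABLE s (i int) PARTITION BY LIST (i);
CREATE TABLE s_1 PARTITION OF s FOR VALUES IN (NULL, 1);
CREATE TABLE t (j int) PARTITION BY LIST (j);
CREATE TABLE t_2 PARTITION OF t FOR VALUES IN (NULL, 2);

-- IS NOT DISTINCT FROM is not strict: (NULL, NULL) matches, even
-- though s_1 and t_2 have non-matching list bounds, so this join
-- must not be performed partition-wise
SELECT * FROM s JOIN t ON i IS NOT DISTINCT FROM j;
```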

> > * match_expr_to_partition_keys()
> >
> > I'm not sure this comment is clear enough:
> >
> > /*
> >  * If it's a strict equi-join a NULL partition key on one side will
> >  * not join a NULL partition key on the other side. So, rows with NULL
> >  * partition key from a partition on one side can not join with those
> >  * from a non-matching partition on the other side. So, search the
> >  * nullable partition keys as well.
> >  */
> > if (!strict_op)
> > continue;
> >
> > My understanding of the problem of NULL values generated by outer join is:
> > these NULL values --- if evaluated by non-strict expression --- can make row
> > of N-th partition on one side of the join match row(s) of *other than* N-th
> > partition(s) on the other side. Thus the nullable input expressions may only
> > be evaluated by strict operators. I think it'd be clearer if you stressed 
> > that
> > (undesired) *match* of partition keys can be a problem, rather than mismatch
> 
> Sorry, I am not able to understand this. To me it looks like my
> wording conveys what you are saying.

I just tried to express the idea in a way that is clearer to me. I think we
both mean the same. Not sure I should spend more effort on another version of
the comment.

> > If you insist on your wording, then I think you should at least move the
> > comment below to the part that only deals with strict operators.
> 
> Done.

o.k.

> >
> > * map_and_merge_partitions()
> >
> > Besides a few changes proposed in map_and_merge_partitions.diff (a few of 
> > them
> > to suppress compiler warnings) I think that this part needs more thought:
> >
> > {
> > Assert(mergemap1[index1] != mergemap2[index2] &&
> >mergemap1[index1] >= 0 && mergemap2[index2] >= 0);
> >
> > /*
> >  * Both the partitions map to different merged partitions. This
> >  * means that multiple partitions from one relation matches to 
> > one
> >  * partition from the other relation. Partition-wise join does 
> > not
> >  * handle this case right now, since it requires ganging 
> > multiple
> >  * partitions together (into one RelOptInfo).
> >  */
> > merged_index = -1;
> > }
> >
> > I could hit this path with the following test:
> >
> > CREATE TABLE a(i int) PARTITION BY LIST(i);
> > CREATE TABLE a_0 PARTITION OF a FOR VALUES IN (0, 2);
> > CREATE TABLE b(j int) PARTITION BY LIST(j);
> > CREATE TABLE b_0 PARTITION OF b FOR VALUES IN (1, 2);
> >
> > SET enable_partition_wise_join TO on;
> >
> > SELECT  *
> > FROMa
> > FULL JOIN
> > b ON i = j;
> >
> > I don't think there's a reason not to join a_0 partition to b_0, is there?
> 
> With the latest patchset I am seeing that partition-wise join is used
> in this case. I have started a new thread [1] for advanced partition
> matching patches.

What plan do you get, with the patches from

https://www.postgresql.org/message-id/cafjfprfdxpusu0pxon3dkcr8wndjkaxlzhuvax_laod0tgc...@mail.gmail.com

I still see the join above Append, not below:

 

Re: [HACKERS] UPDATE of partition key

2017-09-07 Thread Amit Khandekar
On 3 September 2017 at 17:10, Amit Khandekar  wrote:

> After recent commit 30833ba154, now the partitions are expanded in
> depth-first order. It didn't seem worthwhile rebasing my partition
> walker changes onto the latest code. So in the attached patch, I have
> removed all the partition walker changes. But
> RelationGetPartitionDispatchInfo() traverses in breadth-first order,
> which is different than the update result rels order (because
> inheritance expansion order is depth-first). So, in order to make the
> tuple-routing-related leaf partitions in the same order as that of the
> update result rels, we would have to make changes in
> RelationGetPartitionDispatchInfo(), which I am not sure whether it is
> going to be done as part of the thread "expanding inheritance in
> partition bound order" [1]. For now, in the attached patch, I have
> reverted back to the hash table method to find the leaf partitions in
> the update result rels.
>
> [1] 
> https://www.postgresql.org/message-id/CAJ3gD9eyudCNU6V-veMme%2BeyzfX_ey%2BgEzULMzOw26c3f9rzdg%40mail.gmail.com

As mentioned by Amit Langote in the above mail thread, he is going to
make changes so that RelationGetPartitionDispatchInfo() returns the
leaf partitions in depth-first order. Once that is done, I will then
remove the hash table method for finding leaf partitions in update
result rels, and instead use the earlier efficient method that takes
advantage of the fact that update result rels and leaf partitions are
in the same order.
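For reference (a sketch assuming a two-level hierarchy with made-up
names), depth-first expansion emits all leaves under a sub-partitioned
child before moving to its sibling, whereas breadth-first emits all
children of one level before descending:

```sql
create table p (a int, b int) partition by range (a);
create table p1 partition of p for values from (0) to (100)
    partition by range (b);
create table p11 partition of p1 for values from (0) to (50);
create table p12 partition of p1 for values from (50) to (100);
create table p2 partition of p for values from (100) to (200);

-- depth-first leaf order (update result rels): p11, p12, p2
-- breadth-first leaf order (tuple routing):    p2, p11, p12
```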

>
> Thanks
> -Amit Khandekar



-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Secondary index access optimizations

2017-09-07 Thread Thomas Munro
On Sun, Sep 3, 2017 at 4:34 AM, Konstantin Knizhnik
 wrote:
> Thank you for review.
> I attached new version of the patch with
> remove_restrictions_implied_by_constraints() function.
> Concerning the failed tests - this is actually a result of this optimization:
> extra filter conditions are removed from query plans.
> Sorry, I had not included the updated versions of the expected test output
> files in the patch.
> Now I have.

A regression test under contrib/postgres_fdw now fails:

- Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T
1" WHERE (("C 1" IS NOT NULL))
+ Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"

(("C 1" IS NOT NULL)) is indeed redundant in that case, because column
"C 1" was declared to be NOT NULL.  But:

1.  Do we want to go this far?  Note that this is not involving
inheritance and constraint exclusion.  I don't immediately see any
reason why not, but I'm not sure.

2.  If yes, then this postgres_fdw test should be changed, because I
think it was trying to demonstrate that IS NOT NULL expressions are
sent to remote databases -- it would need to be changed so that it
tries that with a column that is actually nullable.
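Something like the following (a sketch of the idea only; the real test
involves a postgres_fdw foreign table, and these names are made up)
would keep demonstrating the pushdown while avoiding the now-redundant
qual:

```sql
-- c1 is NOT NULL, so with the patch its IS NOT NULL qual is implied
-- by the table constraint and can be dropped; c2 is nullable, so its
-- IS NOT NULL qual must still appear in the (remote) plan
CREATE TABLE t (c1 int NOT NULL, c2 int);
EXPLAIN (VERBOSE) SELECT * FROM t WHERE c1 IS NOT NULL;  -- no filter expected
EXPLAIN (VERBOSE) SELECT * FROM t WHERE c2 IS NOT NULL;  -- filter kept
```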

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Adding support for Default partition in partitioning

2017-09-07 Thread Jeevan Ladhe
On Thu, Sep 7, 2017 at 3:15 PM, Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> On Wed, Sep 6, 2017 at 5:25 PM, Jeevan Ladhe <
> jeevan.la...@enterprisedb.com> wrote:
>
>> Hi,
>>
>> Attached is the rebased set of patches.
>> Robert has committed[1] patch 0001 in V26 series, hence the patch
>> numbering
>> in V27 is now decreased by 1 for each patch as compared to V26.
>>
>
> Hi,
>
> I have applied v27 patches and while testing got below observation.
>
> Observation: in the partition table below, the d1 partition constraint
> should not allow NULL to be inserted into column b, but I am able to
> insert it.
>
> steps to reproduce:
> create table d0 (a int, b int) partition by range(a,b);
> create table d1 partition of d0 for values from (0,0) to
> (maxvalue,maxvalue);
>
> postgres=# insert into d0 values (0,null);
> INSERT 0 1
> postgres=# \d+ d1
>                                    Table "public.d1"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
> --------+---------+-----------+----------+---------+---------+--------------+-------------
>  a      | integer |           |          |         | plain   |              |
>  b      | integer |           |          |         | plain   |              |
> Partition of: d0 FOR VALUES FROM (0, 0) TO (MAXVALUE, MAXVALUE)
> Partition constraint: ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 0) OR ((a = 0) AND (b >= 0))))
>
> postgres=# select tableoid::regclass,* from d0;
>  tableoid | a | b
> ----------+---+---
>  d1       | 0 |
> (1 row)
>

Good catch. Thanks Rajkumar.
This seems to be happening because of the following changes made in
get_partition_for_tuple() for default range partition support as part of
patch 0002.

@@ -1971,27 +2204,10 @@ get_partition_for_tuple(PartitionDispatch *pd,
  ecxt->ecxt_scantuple = slot;
  FormPartitionKeyDatum(parent, slot, estate, values, isnull);

- if (key->strategy == PARTITION_STRATEGY_RANGE)
- {
- /*
- * Since we cannot route tuples with NULL partition keys through a
- * range-partitioned table, simply return that no partition exists
- */
- for (i = 0; i < key->partnatts; i++)
- {
- if (isnull[i])
- {
- *failed_at = parent;
- *failed_slot = slot;
- result = -1;
- goto error_exit;
- }
- }
- }

This check should not have been removed entirely: if there isn't a default
partition, then we still do not have any range partition to route a NULL
partition key to, and the routing should fail.
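In other words (a sketch of the expected behavior with the proposed
default-partition patch applied; ddef is a made-up name), the
NULL-routing failure should remain unless a default partition exists to
catch the row:

```sql
create table d0 (a int, b int) partition by range (a, b);
create table d1 partition of d0 for values from (0, 0) to (maxvalue, maxvalue);

insert into d0 values (0, null);  -- should fail: no range partition
                                  -- accepts a NULL partition key

create table ddef partition of d0 default;
insert into d0 values (0, null);  -- should now be routed to ddef
```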

I will work on a fix and send a patch shortly.

Regards,
Jeevan Ladhe


Re: [HACKERS] Adding support for Default partition in partitioning

2017-09-07 Thread Rajkumar Raghuwanshi
On Wed, Sep 6, 2017 at 5:25 PM, Jeevan Ladhe 
wrote:

> Hi,
>
> Attached is the rebased set of patches.
> Robert has committed[1] patch 0001 in V26 series, hence the patch numbering
> in V27 is now decreased by 1 for each patch as compared to V26.
>

Hi,

I have applied v27 patches and while testing got below observation.

Observation: in the partition table below, the d1 partition constraint
should not allow NULL to be inserted into column b, but I am able to
insert it.

steps to reproduce:
create table d0 (a int, b int) partition by range(a,b);
create table d1 partition of d0 for values from (0,0) to
(maxvalue,maxvalue);

postgres=# insert into d0 values (0,null);
INSERT 0 1
postgres=# \d+ d1
                                   Table "public.d1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              |
 b      | integer |           |          |         | plain   |              |
Partition of: d0 FOR VALUES FROM (0, 0) TO (MAXVALUE, MAXVALUE)
Partition constraint: ((a IS NOT NULL) AND (b IS NOT NULL) AND ((a > 0) OR ((a = 0) AND (b >= 0))))

postgres=# select tableoid::regclass,* from d0;
 tableoid | a | b
----------+---+---
 d1       | 0 |
(1 row)


Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-09-07 Thread Alvaro Herrera
Peter Eisentraut wrote:
> On 9/4/17 06:03, Alvaro Herrera wrote:
> > Tom Lane wrote:
> >> Michael Paquier  writes:
> >>> I don't like breaking the abstraction of pg_log() with the existing
> >>> flags with some kind of pg_debug() layer. The set of APIs present now
> >>> in pg_rewind for logging is nice to have, and I think that those debug
> >>> messages should be translated. So what about the attached?
> >>
> >> Your point about INT64_FORMAT not necessarily working with fprintf
> >> is an outstanding reason not to keep it like it is.  I've not reviewed
> >> this patch in detail but I think this is basically the way to fix it.
> > 
> > Actually this code goes through vsnprintf, not fprintf, which should be
> > safe, so I removed that part of the comment, and pushed.
> 
> Is there a reason this was not backpatched to 9.5?

Done.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Tuple-routing for certain partitioned tables not working as expected

2017-09-07 Thread Etsuro Fujita

On 2017/08/30 17:20, Etsuro Fujita wrote:
Here is a patch to skip the CheckValidResultRel checks for a 
target table if it's a foreign partition to perform tuple-routing for, 
as proposed by Robert.


In the patch, to skip the checks, I passed CheckValidResultRel a new
flag indicating whether the target relation is a partition to do
tuple-routing for.  But while updating another patch for tuple-routing
for foreign partitions in PG11, I noticed that it would be better to
pass the ResultRelInfo than that flag.  The reasons are: (1) we can see
whether the relation is such a partition by looking at the
ResultRelInfo's ri_PartitionRoot, instead of that flag, and (2) this is
for tuple-routing for foreign partitions, but with that argument we
could extend the function to do the checks without throwing an error
and save the result in a new member of ResultRelInfo (eg,
ri_PartitionIsValid), so that we can use that info to determine in
ExecInsert whether a foreign partition chosen by ExecFindPartition is
valid for tuple-routing.  So, I updated the patch that way.  Attached is
an updated version of the patch.


(I discussed lazy initialization of partition ResultRelInfos
before, but I changed my mind because I noticed that the change to
EXPLAIN for INSERT into a partitioned table, which I'd like to propose
in the tuple-routing-for-foreign-partitions patch, needs partition
ResultRelInfos.)


Best regards,
Etsuro Fujita
*** /dev/null
--- b/contrib/file_fdw/data/list1.csv
***
*** 0 
--- 1,2 
+ 1,foo
+ 1,bar
*** /dev/null
--- b/contrib/file_fdw/data/list2.bad
***
*** 0 
--- 1,2 
+ 2,baz
+ 1,qux
*** /dev/null
--- b/contrib/file_fdw/data/list2.csv
***
*** 0 
--- 1,2 
+ 2,baz
+ 2,qux
*** a/contrib/file_fdw/input/file_fdw.source
--- b/contrib/file_fdw/input/file_fdw.source
***
*** 162,167  SELECT tableoid::regclass, * FROM agg FOR UPDATE;
--- 162,188 
  ALTER FOREIGN TABLE agg_csv NO INHERIT agg;
  DROP TABLE agg;
  
+ -- declarative partitioning tests
+ SET ROLE regress_file_fdw_superuser;
+ CREATE TABLE pt (a int, b text) partition by list (a);
+ CREATE FOREIGN TABLE p1 partition of pt for values in (1) SERVER file_server
+ OPTIONS (format 'csv', filename '@abs_srcdir@/data/list1.csv', delimiter ',');
+ CREATE TABLE p2 partition of pt for values in (2);
+ SELECT tableoid::regclass, * FROM pt;
+ SELECT tableoid::regclass, * FROM p1;
+ SELECT tableoid::regclass, * FROM p2;
+ COPY pt FROM '@abs_srcdir@/data/list2.bad' with (format 'csv', delimiter 
','); -- ERROR
+ COPY pt FROM '@abs_srcdir@/data/list2.csv' with (format 'csv', delimiter ',');
+ SELECT tableoid::regclass, * FROM pt;
+ SELECT tableoid::regclass, * FROM p1;
+ SELECT tableoid::regclass, * FROM p2;
+ INSERT INTO pt VALUES (1, 'xyzzy'); -- ERROR
+ INSERT INTO pt VALUES (2, 'xyzzy');
+ SELECT tableoid::regclass, * FROM pt;
+ SELECT tableoid::regclass, * FROM p1;
+ SELECT tableoid::regclass, * FROM p2;
+ DROP TABLE pt;
+ 
  -- privilege tests
  SET ROLE regress_file_fdw_superuser;
  SELECT * FROM agg_text ORDER BY a;
*** a/contrib/file_fdw/output/file_fdw.source
--- b/contrib/file_fdw/output/file_fdw.source
***
*** 289,294  SELECT tableoid::regclass, * FROM agg FOR UPDATE;
--- 289,375 
  
  ALTER FOREIGN TABLE agg_csv NO INHERIT agg;
  DROP TABLE agg;
+ -- declarative partitioning tests
+ SET ROLE regress_file_fdw_superuser;
+ CREATE TABLE pt (a int, b text) partition by list (a);
+ CREATE FOREIGN TABLE p1 partition of pt for values in (1) SERVER file_server
+ OPTIONS (format 'csv', filename '@abs_srcdir@/data/list1.csv', delimiter ',');
+ CREATE TABLE p2 partition of pt for values in (2);
+ SELECT tableoid::regclass, * FROM pt;
+  tableoid | a |  b  
+ --+---+-
+  p1   | 1 | foo
+  p1   | 1 | bar
+ (2 rows)
+ 
+ SELECT tableoid::regclass, * FROM p1;
+  tableoid | a |  b  
+ --+---+-
+  p1   | 1 | foo
+  p1   | 1 | bar
+ (2 rows)
+ 
+ SELECT tableoid::regclass, * FROM p2;
+  tableoid | a | b 
+ --+---+---
+ (0 rows)
+ 
+ COPY pt FROM '@abs_srcdir@/data/list2.bad' with (format 'csv', delimiter 
','); -- ERROR
+ ERROR:  cannot route inserted tuples to a foreign table
+ CONTEXT:  COPY pt, line 2: "1,qux"
+ COPY pt FROM '@abs_srcdir@/data/list2.csv' with (format 'csv', delimiter ',');
+ SELECT tableoid::regclass, * FROM pt;
+  tableoid | a |  b  
+ --+---+-
+  p1   | 1 | foo
+  p1   | 1 | bar
+  p2   | 2 | baz
+  p2   | 2 | qux
+ (4 rows)
+ 
+ SELECT tableoid::regclass, * FROM p1;
+  tableoid | a |  b  
+ --+---+-
+  p1   | 1 | foo
+  p1   | 1 | bar
+ (2 rows)
+ 
+ SELECT tableoid::regclass, * FROM p2;
+  tableoid | a |  b  
+ --+---+-
+  p2   | 2 | baz
+  p2   | 2 | qux
+ (2 rows)
+ 
+ INSERT INTO pt VALUES (1, 'xyzzy'); -- ERROR
+ ERROR:  cannot route inserted tuples to a foreign table
+ INSERT INTO pt VALUES (2, 'xyzzy');
+ SELECT 

Re: [HACKERS] WIP: Aggregation push-down

2017-09-07 Thread Jeevan Chalke
Hi Antonin,

To understand the feature you have proposed, I have been studying
your patch. Here are a few comments on it so far:

1.
+ if (aggref->aggvariadic ||
+ aggref->aggdirectargs || aggref->aggorder ||
+ aggref->aggdistinct || aggref->aggfilter)

I did not understand why you are not pushing the aggregate down in the
above cases. Can you please explain?

2. "make check" in contrib/postgres_fdw crashes.

  SELECT COUNT(*) FROM ft1 t1;
! server closed the connection unexpectedly
!   This probably means the server terminated abnormally
!   before or while processing the request.
! connection to server was lost

From your given setup, if I write a query like:
EXPLAIN VERBOSE SELECT count(*) FROM orders_0;
it crashes.

Seems like missing few checks.

3. In the case of pushing partial aggregation to the remote side, you use a
schema named "partial"; I did not get that change. If I have an AVG()
aggregate, then I end up getting an error saying
"ERROR:  schema "partial" does not exist".

4. I am not sure about the code changes related to pushing a partial
aggregate to the remote side. Basically, you are pushing the whole aggregate
to the remote side and the result of that is treated as partial on the
basis of aggtype = transtype. But I am not sure that is correct, unless
I am missing something here. The types may be the same but the result may not be.

5. There are quite a few TODO comments in the patch-set; are you working
on those?

Thanks


On Thu, Aug 17, 2017 at 8:52 PM, Antonin Houska  wrote:

> Antonin Houska  wrote:
>
> > Antonin Houska  wrote:
> >
> > > This is a new version of the patch I presented in [1].
> >
> > Rebased.
> >
> > cat .git/refs/heads/master
> > b9a3ef55b253d885081c2d0e9dc45802cab71c7b
>
> This is another version of the patch.
>
> Besides other changes, it enables the aggregation push-down for
> postgres_fdw,
> although only for aggregates whose transient aggregation state is equal to
> the
> output type. For other aggregates (like avg()) the remote nodes will have
> to
> return the transient state value in an appropriate form (maybe bytea type),
> which does not depend on PG version.
>
> shard.tgz demonstrates the typical postgres_fdw use case. One query shows
> base
> scans of base relation's partitions being pushed to shard nodes, the other
> pushes down a join and performs aggregation of the join result on the
> remote
> node. Of course, the query can only reference one particular partition,
> until
> the "partition-wise join" [1] patch gets committed and merged with this
> patch of mine.
>
> One thing I'm not sure about is whether the patch should remove
> GetForeignUpperPaths function from FdwRoutine, which it essentially makes
> obsolete. Or should it only be deprecated so far? I understand that
> deprecation is important for "ordinary SQL users", but FdwRoutine is an
> interface for extension developers, so the policy might be different.
>
> [1] https://commitfest.postgresql.org/14/994/
>
> Any feedback is appreciated.
>
> --
> Antonin Houska
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de, http://www.cybertec.at
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


-- 
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Parallel Append implementation

2017-09-07 Thread Amit Khandekar
On 7 September 2017 at 13:40, Rafia Sabih  wrote:
> On Wed, Aug 30, 2017 at 5:32 PM, Amit Khandekar  
> wrote:
>> Hi Rafia,
>>
>> On 17 August 2017 at 14:12, Amit Khandekar  wrote:
>>> But for all of the cases here, partial
>>> subplans seem possible, and so even on HEAD it executed Partial
>>> Append. So between a Parallel Append having partial subplans and a
>>> Partial Append having partial subplans , the cost difference would not
>>> be significant. Even if we assume that Parallel Append was chosen
>>> because its cost turned out to be a bit cheaper, the actual
>>> performance gain seems quite large as compared to the expected cost
>>> difference. So it might be even possible that the performance gain
>>> might be due to some other reasons. I will investigate this, and the
>>> other queries.
>>>
>>
>> I ran all the queries that were showing performance benefits in your
>> run. But for me, the ParallelAppend benefits are shown only for plans
>> that use Partition-Wise-Join.
>>
>> For all the queries that use only PA plans but not PWJ plans, I got
>> the exact same plan for HEAD as for PA+PWJ patch, except that for the
>> later, the Append is a ParallelAppend. Whereas, for you, the plans
>> have join-order changed.
>>
>> Regarding actual costs; consequently, for me the actual costs are more
>> or less the same for HEAD and PA+PWJ. Whereas, for your runs, you have
>> quite different costs naturally because the plans themselves are
>> different on head versus PA+PWJ.
>>
>> My PA+PWJ plan outputs (and actual costs) match exactly what you get
>> with PA+PWJ patch. But like I said, I get the same join order and same
>> plans (and actual costs) for HEAD as well (except
>> ParallelAppend=>Append).
>>
>> May be, if you have the latest HEAD code with your setup, you can
>> yourself check some of the queries again to see if they are still
>> seeing higher costs as compared to PA ? I suspect that some changes in
>> latest code might be causing this discrepancy; because when I tested
>> some of the explains with a HEAD-branch server running with your
>> database, I got results matching PA figures.
>>
>> Attached is my explain-analyze outputs.
>>
>
> Strange. Please let me know what commit-id you were
> experimenting on. I think we need to investigate this a little
> further.

Sure. I think the commit was b5c75fec. It was sometime in Aug 30 when
I ran the tests. But you may try on latest head.

> Additionally, I want to point out that I also applied patch [1],
> which I forgot to mention before.

Yes , I also had applied that patch over PA+PWJ.

>
> [1]  
> https://www.postgresql.org/message-id/CAEepm%3D3%3DNHHko3oOzpik%2BggLy17AO%2Bpx3rGYrg3x_x05%2BBr9-A%40mail.gmail.com

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Parallel Append implementation

2017-09-07 Thread Rafia Sabih
On Wed, Aug 30, 2017 at 5:32 PM, Amit Khandekar  wrote:
> Hi Rafia,
>
> On 17 August 2017 at 14:12, Amit Khandekar  wrote:
>> But for all of the cases here, partial
>> subplans seem possible, and so even on HEAD it executed Partial
>> Append. So between a Parallel Append having partial subplans and a
>> Partial Append having partial subplans , the cost difference would not
>> be significant. Even if we assume that Parallel Append was chosen
>> because its cost turned out to be a bit cheaper, the actual
>> performance gain seems quite large as compared to the expected cost
>> difference. So it might be even possible that the performance gain
>> might be due to some other reasons. I will investigate this, and the
>> other queries.
>>
>
> I ran all the queries that were showing performance benefits in your
> run. But for me, the ParallelAppend benefits are shown only for plans
> that use Partition-Wise-Join.
>
> For all the queries that use only PA plans but not PWJ plans, I got
> the exact same plan for HEAD as for the PA+PWJ patch, except that for
> the latter, the Append is a ParallelAppend. Whereas, for you, the
> join order in the plans has changed.
>
> Regarding actual costs: correspondingly, for me the actual costs are
> more or less the same for HEAD and PA+PWJ. Whereas for your runs, you
> naturally have quite different costs, because the plans themselves are
> different on HEAD versus PA+PWJ.
>
> My PA+PWJ plan outputs (and actual costs) match exactly what you get
> with PA+PWJ patch. But like I said, I get the same join order and same
> plans (and actual costs) for HEAD as well (except
> ParallelAppend=>Append).
>
> May be, if you have the latest HEAD code with your setup, you can
> yourself check some of the queries again to see if they are still
> seeing higher costs as compared to PA ? I suspect that some changes in
> latest code might be causing this discrepancy; because when I tested
> some of the explains with a HEAD-branch server running with your
> database, I got results matching PA figures.
>
> Attached is my explain-analyze outputs.
>

Strange. Please let me know which commit-id you were
experimenting on. I think we need to investigate this a little
further. Additionally, I want to point out that I also applied patch [1],
which I forgot to mention before.

[1]  
https://www.postgresql.org/message-id/CAEepm%3D3%3DNHHko3oOzpik%2BggLy17AO%2Bpx3rGYrg3x_x05%2BBr9-A%40mail.gmail.com

> On 16 August 2017 at 18:34, Robert Haas  wrote:
>> Thanks for the benchmarking results!
>>
>> On Tue, Aug 15, 2017 at 11:35 PM, Rafia Sabih
>>  wrote:
>>> Q4 | 244 | 12 | PA and PWJ, time by only PWJ - 41
>>
>> 12 seconds instead of 244?  Whoa.  I find it curious that we picked a
>> Parallel Append with a bunch of non-partial plans when we could've
>> just as easily picked partial plans, or so it seems to me.  To put
>> that another way, why did we end up with a bunch of Bitmap Heap Scans
>> here instead of Parallel Bitmap Heap Scans?
>
> Actually, the cost difference would be quite low for Parallel Append
> with partial plans and Parallel Append with non-partial plans with 2
> workers. But yes, I should take a look at why it is consistently
> picking the non-partial Bitmap Heap Scan.
>
> 
>
>> Q6 | 29 | 12 | PA only
>
> This one needs to be analysed: here the plan cost is the same, but
> the actual cost for PA is almost half the cost for HEAD. I made the
> same observation in my run.
>
> Thanks
> -Amit



-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/




Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-09-07 Thread Alvaro Herrera
Peter Eisentraut wrote:
> On 9/4/17 06:03, Alvaro Herrera wrote:
> > Tom Lane wrote:
> >> Michael Paquier  writes:
> >>> I don't like breaking the abstraction of pg_log() with the existing
> >>> flags with some kind of pg_debug() layer. The set of APIs present now
> >>> in pg_rewind for logging is nice to have, and I think that those debug
> >>> messages should be translated. So what about the attached?
> >>
> >> Your point about INT64_FORMAT not necessarily working with fprintf
> >> is an outstanding reason not to keep it like it is.  I've not reviewed
> >> this patch in detail but I think this is basically the way to fix it.
> > 
> > Actually this code goes through vsnprintf, not fprintf, which should be
> > safe, so I removed that part of the comment, and pushed.
> 
> Is there a reason this was not backpatched to 9.5?

No, I'll backpatch later.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-09-07 Thread Michael Paquier
On Thu, Sep 7, 2017 at 9:46 AM, Bossart, Nathan  wrote:
> I've attached v1 of this patch.  I think we might want to refactor the
> code for retrieving the relation name from a RangeVar, but it would
> probably be better to do that in a separate patch.

Using the patch checking for duplicate columns:
=# create table aa (a int);
CREATE TABLE
=# vacuum ANALYZE aa(z, z);
ERROR:  0A000: column lists cannot have duplicate entries
HINT:  the column list specified for relation "aa" contains duplicates
LOCATION:  check_column_lists, vacuum.c:619
Shouldn't the priority be given to undefined columns instead of
duplicates? You may want to add a test for that as well.
-- 
Michael




Re: [HACKERS] Making clausesel.c Smarter

2017-09-07 Thread Daniel Gustafsson

> On 06 Sep 2017, at 07:13, David Rowley  wrote:
> 
> On 6 September 2017 at 00:43, Daniel Gustafsson  wrote:
>> This patch was moved to the currently open Commitfest.  Given the above
>> comment, is the last patch in this thread still up for review, or are you
>> working on a new version?  Just to avoid anyone reviewing an outdated 
>> version.
> 
> Hi Daniel,
> 
> I've not had time to work on a new version yet and I can't imagine
> that I will for quite some time.
> 
> The idea I had to remove the quadratic search on the list was to add
> to or modify equal() in equalfuncs.c to have it return -1, 0, +1 and
> use that as a comparison function in a binary search tree. The Btree
> would complement List as a way of storing Nodes in no particular
> order, but in a way that a Node can be found quickly. There's likely
> more than a handful of places in the planner that would benefit from
> this already.
> 
> It's not a 5-minute patch and it would probably see some resistance,
> so I won't have time to look at this soon.
> 
> If the possibility of this increasing planning time in corner cases is
> going to be a problem, then it might be best to return this with
> feedback for now and I'll resubmit if I get time later in the cycle.

For now I’ve marked this “Waiting on Author” awaiting a new patch for the
community to consider.  If there is no time to hack on this in the current CF
(which seems likely given your mail above) I’ll move it to the next CF as this
one draws to a close.

cheers ./daniel





Re: [HACKERS] Proposal: pg_rewind to skip config files

2017-09-07 Thread Chris Travers
One more side to this which is relevant to other discussions.

If I rewind back to before a table was created, the current algorithm,
as well as any proposed algorithm, will delete the reference to the
relfilenode in the catalogs but not the file itself.  I don't see how an
undo subsystem would fix this.

Is this a reason to rethink the idea that maybe a pg_fsck utility might be
useful that could be run immediately after a rewind?

-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-09-07 Thread Fabien COELHO



>> Very very minor comments that I should have noticed before, sorry for this
>> additional round trip.
>
> Thank you for the dedicated review!

I'm someone who is at times pigheaded, in the good sense I hope, and I
like to finish what I start:-)


Patch applies, compiles, works, everything is fine from my point of view.

I switched it to "Ready for Committer".

Again, if the pgbench tap test patch get through, it should be tap tested.

--
Fabien.




Re: [HACKERS] CLUSTER command progress monitor

2017-09-07 Thread Tatsuro Yamada

On 2017/09/06 16:11, Michael Paquier wrote:
> On Wed, Sep 6, 2017 at 3:58 PM, Tatsuro Yamada
>  wrote:
>> I revised the patch like this:
>
> You should avoid top-posting.

I see.

>> I didn't change the name of view (pg_stat_progress_cluster) because
>> I'm not sure whether the new name (pg_stat_progress_reorg) is
>> suitable or not.
>
> Here are some ideas: rewrite (incorrect for ALTER TABLE), organize
> (somewhat fine), order, operate (too general?), bundle, reform,
> assemble.


Thanks for sharing your ideas.
I searched for terms like "reform table", "reassemble table" and
"reorganize table" on Google. I realized "reorganize table" is a better
choice than the others, because many DBMSs use this keyword. Therefore,
I'll change the name like this:

before
  pg_stat_progress_cluster

after
  pg_stat_progress_reorg (I abbreviate reorganize to reorg.)

Does anyone have any suggestions?
I'll revise the patch.

Regards,
Tatsuro Yamada






[HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list

2017-09-07 Thread Anthony Bykov
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

I'm afraid this patch conflicts with the current master branch.

The new status of this patch is: Waiting on Author



Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-07 Thread Michael Paquier
On Thu, Sep 7, 2017 at 2:40 PM, Amit Langote
 wrote:
> On 2017/09/07 13:09, Michael Paquier wrote:
>> pd_lower should remain at 0 on pre-10 servers.
>
> Doesn't PageInit(), which is where any page gets initialized, has always
> set pd_lower to SizeOfPageHeaderData?

Yes, sorry. That was a slip of the mind on my part.
-- 
Michael

