PGOPTIONS="-fh" make check gets stuck since Postgres 11

2019-07-07 Thread Michael Paquier
Hi all,

I have begun playing with regressplans.sh which enforces various
combinations of "-f s|i|n|m|h" when running the regression tests, and
I have noticed that -fh can cause the server to become stuck in the
test join_hash.sql with this query (not sure which portion of the SET
LOCAL parameters is involved):
select count(*) from simple r join extremely_skewed s using (id);

This does not happen with REL_10_STABLE, where the test executes
immediately, so we visibly have an issue introduced in v11 here.

Any thoughts?
--
Michael


signature.asc
Description: PGP signature


Re: proposal - patch: psql - sort_by_size

2019-07-07 Thread Pavel Stehule
Hi

On Mon, Jul 8, 2019 at 6:12 AM Thomas Munro 
wrote:

> On Sun, Jun 30, 2019 at 8:48 PM Pavel Stehule 
> wrote:
> > I used this text in today patch
>
> Hi Pavel,
>
> Could you please post a rebased patch?
>

rebased patch attached

Regards

Pavel

>
> Thanks,
>
> --
> Thomas Munro
> https://enterprisedb.com
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7789fc6177..96fc4f489a 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3973,6 +3973,17 @@ bar
 
   </varlistentry>
 
+  <varlistentry>
+    <term><varname>SORT_BY_SIZE</varname></term>
+    <listitem>
+    <para>
+    Setting this variable to <literal>on</literal>, sorts
+    <literal>\d*+</literal>, <literal>\db+</literal>, <literal>\l+</literal>
+    and <literal>\dP*+</literal> outputs by size (when size is displayed).
+    </para>
+    </listitem>
+  </varlistentry>
+
   <varlistentry>
     <term><varname>SQLSTATE</varname></term>

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 8b4cd53631..ae79cbb312 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -223,6 +223,7 @@ describeTablespaces(const char *pattern, bool verbose)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
+	const char *sizefunc = NULL;
 
 	if (pset.sversion < 80000)
 	{
@@ -265,9 +266,12 @@ describeTablespaces(const char *pattern, bool verbose)
 		  gettext_noop("Options"));
 
 	if (verbose && pset.sversion >= 90200)
+	{
 		appendPQExpBuffer(&buf,
 		  ",\n  pg_catalog.pg_size_pretty(pg_catalog.pg_tablespace_size(oid)) AS \"%s\"",
 		  gettext_noop("Size"));
+		sizefunc = "pg_catalog.pg_tablespace_size(oid)";
+	}
 
 	if (verbose && pset.sversion >= 80200)
 		appendPQExpBuffer(&buf,
@@ -281,7 +285,10 @@ describeTablespaces(const char *pattern, bool verbose)
 		  NULL, "spcname", NULL,
 		  NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	if (pset.sort_by_size && sizefunc)
+		appendPQExpBuffer(&buf, "ORDER BY %s DESC;", sizefunc);
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
@@ -863,6 +870,7 @@ listAllDbs(const char *pattern, bool verbose)
 	PGresult   *res;
 	PQExpBufferData buf;
 	printQueryOpt myopt = pset.popt;
+	const char *sizefunc = NULL;
 
 	initPQExpBuffer(&buf);
 
@@ -882,12 +890,15 @@ listAllDbs(const char *pattern, bool verbose)
 	appendPQExpBufferStr(&buf, "   ");
 	printACLColumn(&buf, "d.datacl");
 	if (verbose && pset.sversion >= 80200)
+	{
 		appendPQExpBuffer(&buf,
 		  ",\n   CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')\n"
 		  "THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(d.datname))\n"
 		  "ELSE 'No Access'\n"
 		  "   END as \"%s\"",
 		  gettext_noop("Size"));
+		sizefunc = "pg_catalog.pg_database_size(d.datname)";
+	}
 	if (verbose && pset.sversion >= 80000)
 		appendPQExpBuffer(&buf,
 		  ",\n   t.spcname as \"%s\"",
@@ -906,7 +917,10 @@ listAllDbs(const char *pattern, bool verbose)
 		processSQLNamePattern(pset.db, &buf, pattern, false, false,
 			  NULL, "d.datname", NULL, NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	if (pset.sort_by_size && sizefunc)
+		appendPQExpBuffer(&buf, "ORDER BY %s DESC;", sizefunc);
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1;");
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
 	if (!res)
@@ -3628,6 +3642,7 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	bool		showMatViews = strchr(tabtypes, 'm') != NULL;
 	bool		showSeq = strchr(tabtypes, 's') != NULL;
 	bool		showForeign = strchr(tabtypes, 'E') != NULL;
+	const char *sizefunc = NULL;
 
 	PQExpBufferData buf;
 	PGresult   *res;
@@ -3711,13 +3726,19 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 		 * size of a table, including FSM, VM and TOAST tables.
 		 */
 		if (pset.sversion >= 90000)
+		{
 			appendPQExpBuffer(&buf,
 			  ",\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \"%s\"",
 			  gettext_noop("Size"));
+			sizefunc = "pg_catalog.pg_table_size(c.oid)";
+		}
 		else if (pset.sversion >= 80100)
+		{
 			appendPQExpBuffer(&buf,
 			  ",\n  pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid)) as \"%s\"",
 			  gettext_noop("Size"));
+			sizefunc = "pg_catalog.pg_relation_size(c.oid)";
+		}
 
 		appendPQExpBuffer(&buf,
 		  ",\n  pg_catalog.obj_description(c.oid, 'pg_class') as \"%s\"",
@@ -3770,7 +3791,10 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 		  "n.nspname", "c.relname", NULL,
 		  "pg_catalog.pg_table_is_visible(c.oid)");
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
+	if (pset.sort_by_size && sizefunc)
+		appendPQExpBuffer(&buf, "ORDER BY %s DESC;", sizefunc);
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1,2;");
 
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
@@ -3946,6 +3970,7 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
  "   JOIN d ON i.inhparent = d.oid)\n"
  "SELECT 

Re: Declared but no defined functions

2019-07-07 Thread Masahiko Sawada
On Sun, Jul 7, 2019 at 10:04 AM Michael Paquier  wrote:
>
> On Sun, Jul 07, 2019 at 07:31:12AM +0800, Masahiko Sawada wrote:
> > Attached patch removes these functions.
>
> Thanks, applied.

Thank you!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] [PATCH] Generic type subscripting

2019-07-07 Thread Thomas Munro
On Fri, Jun 7, 2019 at 6:22 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > > >> Rebase after pg_indent. Besides, off the list there was a suggestion 
> > > >> that this
> > > >> could be useful to accept more than one data type as a key for 
> > > >> subscripting.
> > > >> E.g. for jsonb it probably makes sense to understand both a simple key 
> > > >> name and
> > > >> jsonpath:
> >
> > And one more rebase.
>
> Oh, looks like I was just confused and it wasn't necessary - for some reason
> starting from v22 cfbot tries to apply v6 instead of the latest one.

Hi Dmitry,

Sorry about that.  It looks like I broke the cfbot code that picks
which thread to pull patches from when there are several registered in
the CF app, the last time the HTML format changed.  Now it's back to
picking whichever thread has the most recent message on it.  Such are
the joys of web scraping (obviously we need better integration and
that will happen, I just haven't had time yet).

Anyway, I fixed that.  But now you really do need to rebase :-)

-- 
Thomas Munro
https://enterprisedb.com




Re: fix for BUG #3720: wrong results at using ltree

2019-07-07 Thread Thomas Munro
On Sun, Apr 7, 2019 at 3:46 AM Tom Lane  wrote:
> =?UTF-8?Q?Filip_Rembia=C5=82kowski?=  writes:
> > Here is my attempt to fix a 12-years old ltree bug (which is a todo item).
> > I see it's not backward-compatible, but in my understanding that's
> > what is documented. Previous behavior was inconsistent with
> > documentation (where single asterisk should match zero or more
> > labels).
> > http://archives.postgresql.org/pgsql-bugs/2007-11/msg00044.php

[...]

> In short, I'm wondering if we should treat this as a documentation
> bug not a code bug.  But to do that, we'd need a more accurate
> description of what the code is supposed to do, because the statement
> quoted above is certainly not a match to the actual behavior.

This patch doesn't apply.  More importantly, it seems like we don't
have a consensus on whether we want it.

Teodor, Oleg, would you like to offer an opinion here?  If I
understand correctly, the choices are doc change, code/comment change
or WONT_FIX.  This seems to be an entry that we can bring to a
conclusion in this CF with some input from the ltree experts.

-- 
Thomas Munro
https://enterprisedb.com




Re: Fix typos and inconsistencies for HEAD (take 5)

2019-07-07 Thread Michael Paquier
On Sun, Jul 07, 2019 at 08:03:01AM +0300, Alexander Lakhin wrote:
> 5.8. dictlexize -> thesaurus_lexize

There could be other dictionaries.

> 5.9. regression.diffsregression.planregress/inh -> regression.diffs
> planregress/diffs.inh

I am wondering if we should not just nuke that...  For now I have
included your change as the mistake is obvious, but I am starting a
new thread.  The history around this script does not play in favor of
it:
commit: 2fc80e8e8304913c8dd1090bb2976632c0f4a8c3
author: Bruce Momjian 
date: Wed, 12 Feb 2014 17:29:19 -0500
Rename 'gmake' to 'make' in docs and recommended commands

This simplifies the docs and makes it easier to cut/paste command
lines.

commit: c77e2e42fb4cf5c90a7562b9df289165ff164df1
author: Tom Lane 
date: Mon, 18 Dec 2000 02:45:47 +0000
Tweak regressplans.sh to use any already-set PGOPTIONS.

And looking closer it seems that there are other issues linked to
it...

> 5.27. equivalentOpersAfterPromotion -> remove (irrelevant since
> 8536c962, but the whole comments is too old to be informational too)

It seems to me that this could be much more reworked.  So discarded
for now.

> 5.29. ExclusiveRowLock -> RowExclusiveLock

Grammar mistake here.

> 5.31. ExecBitmapHeapNext -> BitmapHeapNext

Well, BitmapHeapRecheck is not listed in the interface routines
either..  

> 5.37. ExecSeqNext -> SeqNext
> 5.40. ExecSubqueryNext -> SubqueryNext
> 5.41. ExecValuesNext -> ValuesNext

Here as well these sets are incomplete.  Instead for those series I'd
like to think that it would be better to do a larger cleanup and just
remove all these in the executor "INTERFACE ROUTINES".  Your proposed
patches don't make things better either, as the interfaces are listed
in alphabetical order.

> 5.39. exec_subplan_get_plan -> remove (not used since 1cc29fe7)

This could be used by extensions.  So let's not remove it.

And committed most of the rest.  Thanks.
--
Michael


signature.asc
Description: PGP signature


Re: proposal - patch: psql - sort_by_size

2019-07-07 Thread Thomas Munro
On Sun, Jun 30, 2019 at 8:48 PM Pavel Stehule  wrote:
> I used this text in today patch

Hi Pavel,

Could you please post a rebased patch?

Thanks,

-- 
Thomas Munro
https://enterprisedb.com




Re: Add test case for sslinfo

2019-07-07 Thread Thomas Munro
On Mon, Jul 8, 2019 at 2:59 PM Hao Wu  wrote:
> I see there is no test case for sslinfo. I have added a test case for it in 
> my project.

Hi Hao Wu,

Thanks!  I see that you created a CF entry
https://commitfest.postgresql.org/24/2203/.  While I was scanning
through the current CF looking for trouble, this one popped in front
of my eyes, so here's some quick feedback even though it's in the next
CF:

+#!/bin/bash

I don't think we can require that script interpreter.

This failed[1] with permissions errors:

+cp: cannot create regular file '/server.crt': Permission denied

It looks like that's because the script assumes that PGDATA is set.

I wonder if we want to include more SSL certificates, or if we want to
use the same set of fixed certificates (currently under
src/test/ssl/ssl) for all tests like this.  I don't have a strong
opinion on that, but I wanted to mention that policy decision.  (There
is also a test somewhere that creates a new one on the fly.)

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/76601

-- 
Thomas Munro
https://enterprisedb.com




RE: Run-time pruning for ModifyTable

2019-07-07 Thread Kato, Sho
On Monday, July 8, 2019 11:34 AM, Amit Langote wrote:
> By the way, let's keep any further discussion on this particular topic
> in the other thread.

Thanks for the details. I got it.

Regards,
Kato sho
> -Original Message-
> From: Amit Langote [mailto:amitlangot...@gmail.com]
> Sent: Monday, July 8, 2019 11:34 AM
> To: Kato, Sho/加藤 翔 
> Cc: David Rowley ; PostgreSQL Hackers
> 
> Subject: Re: Run-time pruning for ModifyTable
> 
> Kato-san,
> 
> On Thu, Jul 4, 2019 at 1:40 PM Kato, Sho  wrote:
> > > If I understand the details of [1] correctly, ModifyTable will no
> > > longer have N subplans for N result relations as there are today.
> > > So, it doesn't make sense for ModifyTable to contain
> > > PartitionedRelPruneInfos and for
> ExecInitModifyTable/ExecModifyTable
> > > to have to perform initial and execution-time pruning, respectively.
> >
> > Does this mean that the generic plan will not have N subplans for N
> result relations?
> > I thought [1] would make creating generic plans faster, but is this
> correct?
> 
> Yeah, making a generic plan for UPDATE of inheritance tables will
> certainly become faster, because we will no longer plan the same query
> N times for N child tables.  There will still be N result relations but
> only one sub-plan to fetch the rows from.  Also, planning will still cost
> O(N), but with a much smaller constant factor.
> 
> By the way, let's keep any further discussion on this particular topic
> in the other thread.
> 
> Thanks,
> Amit


Re: allow_system_table_mods stuff

2019-07-07 Thread Bruce Momjian
On Mon, Jun 24, 2019 at 11:20:51AM -0400, Tom Lane wrote:
> I do see value in two switches not one, but it's what I said above,
> to not need to give people *more* chance-to-break-things than they
> had before when doing manual catalog fixes.  That is, we need a
> setting that corresponds more or less to current default behavior.
> 
> There's an aesthetic argument to be had about whether to have two
> bools or one three-way switch, but I prefer the former; there's
> no backward-compatibility issue here since allow_system_table_mods
> couldn't be set by applications anyway.

I like a single three-way switch since if you are allowing DDL, you
probably don't care if you restrict DML.  log_statement already has a
similar distinction with values of none, ddl, mod, all.  I assume
allow_system_table_mods could have the values false, dml, and true.
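
For illustration, such a three-way switch could be declared much like
log_statement's options table in guc.c.  A rough sketch only -- the enum and
its value names here are made up, not existing code:

	/* hypothetical values; only the shape mirrors existing enum GUCs */
	typedef enum
	{
		ALLOW_SYSTEM_TABLE_MODS_OFF,	/* "false": current default behavior */
		ALLOW_SYSTEM_TABLE_MODS_DML,	/* "dml": allow only INSERT/UPDATE/DELETE */
		ALLOW_SYSTEM_TABLE_MODS_ON		/* "true": allow DDL as well */
	} AllowSystemTableMods;

	static const struct config_enum_entry allow_system_table_mods_options[] = {
		{"false", ALLOW_SYSTEM_TABLE_MODS_OFF, false},
		{"dml", ALLOW_SYSTEM_TABLE_MODS_DML, false},
		{"true", ALLOW_SYSTEM_TABLE_MODS_ON, false},
		{NULL, 0, false}
	};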

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: refactoring - share str2*int64 functions

2019-07-07 Thread Thomas Munro
On Fri, May 24, 2019 at 3:23 AM Fabien COELHO  wrote:
> >> Although I agree it is not worth a lot of trouble, and even if I don't do
> >> Windows, I think it valuable that the behavior is the same on all platform.
> >> The attached match shares pg_str2*int64 functions between frontend and
> >> backend by moving them to "common/", which avoids some code duplication.
> >>
> >> This is more refactoring, and it fixes the behavior change on 32 bit
> >> architectures.
>
> V2 is a rebase.

Hi Fabien,

Here's some semi-automated feedback, noted while going through
failures on cfbot.cputube.org.  You have a stray editor file
src/backend/parser/parse_node.c.~1~.  Something is failing to compile
while doing the temp-install in make check-world, which probably
indicates that some test or contrib module is using the interface you
changed?

-- 
Thomas Munro
https://enterprisedb.com




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2019-07-07 Thread Thomas Munro
On Wed, Jun 19, 2019 at 7:22 PM Paul Guo  wrote:
> I updated the patch to v3. In this version, we skip the error if copydir 
> fails due to missing src/dst directory,
> but to make sure the ignoring is legal, I add a simple log/forget mechanism 
> (Using List) similar to the xlog invalid page
> checking mechanism. Two tap tests are included. One is actually from a 
> previous patch by Kyotaro in this
> email thread and another is added by me. In addition, dbase_desc() is fixed 
> to make the message accurate.

Hello Paul,

FYI t/011_crash_recovery.pl is failing consistently on Travis CI with
this patch applied:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555368907

-- 
Thomas Munro
https://enterprisedb.com




Re: tableam vs. TOAST

2019-07-07 Thread Thomas Munro
On Wed, Jun 12, 2019 at 4:17 AM Robert Haas  wrote:
> On Tue, May 21, 2019 at 2:10 PM Robert Haas  wrote:
> > Updated and rebased patches attached.
>
> And again.

Hi Robert,

Thus spake GCC:

detoast.c: In function ‘toast_fetch_datum’:
detoast.c:308:12: error: variable ‘toasttupDesc’ set but not used
[-Werror=unused-but-set-variable]
TupleDesc toasttupDesc;
^
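
If the variable is only read inside assertions, the usual project idiom is
to mark it PG_USED_FOR_ASSERTS_ONLY, e.g.:

	/* silences -Wunused-but-set-variable when only Asserts read it */
	TupleDesc	toasttupDesc PG_USED_FOR_ASSERTS_ONLY;

Just a guess at the fix, of course -- I haven't looked at why it became
unused.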

-- 
Thomas Munro
https://enterprisedb.com




Re: warning to publication created and wal_level is not set to logical

2019-07-07 Thread Thomas Munro
On Wed, Mar 27, 2019 at 1:36 AM Lucas Viecelli  wrote:
>> Oh, OK, then this seems like it's basically covered already.  I think
>> the original suggestion to add a WARNING during CREATE PUBLICATION
>> isn't unreasonable.  But we don't need to do more than that (and it
>> shouldn't be higher than WARNING).
>
> Okay, I think it will improve understanding of new users.
>
> Since everything is fine, thank you all for the comments

Hi Lucas,

The July Commitfest has started.  This patch is in "Needs review"
status, but it doesn't apply.  If I read the above discussion
correctly, it seems there is agreement that a warning here is a good
idea, so this patch can move toward commit.  Could you please post a
rebased patch?

A note on the message:

WARNING: `PUBLICATION` created but wal_level `is` not set to logical,
you need to change it before creating any SUBSCRIPTION

I wonder if it would be more typical project style to put the clue on
what to do into an "errhint" message, something like this:

WARNING: insufficient wal_level to publish logical changes
HINT:  Set wal_level to logical before creating subscriptions.
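
In backend terms that would presumably come out as something like this (a
sketch only; where exactly it belongs in the CREATE PUBLICATION path is up
to the patch):

	/* wal_level and WAL_LEVEL_LOGICAL are the existing backend symbols */
	if (wal_level < WAL_LEVEL_LOGICAL)
		ereport(WARNING,
				(errmsg("insufficient wal_level to publish logical changes"),
				 errhint("Set wal_level to logical before creating subscriptions.")));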

-- 
Thomas Munro
https://enterprisedb.com




Add test case for sslinfo

2019-07-07 Thread Hao Wu
Hi Hackers,

I see there is no test case for sslinfo. I have added a test case for it in
my project.
Do you mind if I apply this test case to postgresql?

Best regards,
Hao Wu


0001-Add-certificates-keys-and-test-cases-for-contrib-ssl.patch
Description: Binary data


Re: Improve search for missing parent downlinks in amcheck

2019-07-07 Thread Thomas Munro
On Wed, May 1, 2019 at 12:58 PM Peter Geoghegan  wrote:
> On Sun, Apr 28, 2019 at 10:15 AM Alexander Korotkov
>  wrote:
> > I think this definitely not bug fix.  Bloom filter was designed to be
> > lossy, no way blaming it for that :)
>
> I will think about a simple fix, but after the upcoming point release.
> There is no hurry.

A bureaucratic question:  What should the status be for this CF entry?

-- 
Thomas Munro
https://enterprisedb.com




Re: FETCH FIRST clause PERCENT option

2019-07-07 Thread Ryan Lambert
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

The basic functionality works as I expect.  In the following example I would 
have guessed it would return 4 rows instead of 5.  I don't mind that it uses 
ceil here, but think that deserves a mention in the documentation.  

CREATE TABLE r100 (id INT);
INSERT INTO r100 SELECT generate_series(1, 100);

SELECT * FROM r100 FETCH FIRST 4.01 PERCENT ROWS ONLY;
 id

  1
  2
  3
  4
  5
(5 rows)
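
The rounding I observe is consistent with a rule along these lines (purely
illustrative C, not the patch's actual code):

	#include <math.h>

	/* 4.01 percent of 100 rows gives ceil(4.01) = 5 rows, not 4 */
	static long
	percent_limit_rows(long total_rows, double percent)
	{
		return (long) ceil(total_rows * percent / 100.0);
	}

Documenting that ceiling behavior would make the result above unsurprising.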

There's a missing space between the period and following sentence in 
src\backend\executor\nodeLimit.c
"previous time we got a different result.In PERCENTAGE option there are"


There's a missing space and the beginning "w" should be capitalized in 
doc\src\sgml\ref\select.sgml
with PERCENT count specifies the maximum number of rows to return in percentage.ROW

Another missing space after the period.

previous time we got a different result.In PERCENTAGE option there are"


Ryan Lambert

The new status of this patch is: Waiting on Author


Re: Run-time pruning for ModifyTable

2019-07-07 Thread Amit Langote
Kato-san,

On Thu, Jul 4, 2019 at 1:40 PM Kato, Sho  wrote:
> > If I understand the details of [1] correctly, ModifyTable will no longer
> > have N subplans for N result relations as there are today.  So, it doesn't
> > make sense for ModifyTable to contain PartitionedRelPruneInfos and for
> > ExecInitModifyTable/ExecModifyTable
> > to have to perform initial and execution-time pruning, respectively.
>
> Does this mean that the generic plan will not have N subplans for N result 
> relations?
> I thought [1] would make creating generic plans faster, but is this correct?

Yeah, making a generic plan for UPDATE of inheritance tables will
certainly become faster, because we will no longer plan the same query
N times for N child tables.  There will still be N result relations
but only one sub-plan to fetch the rows from.  Also, planning will
still cost O(N), but with a much smaller constant factor.

By the way, let's keep any further discussion on this particular topic
in the other thread.

Thanks,
Amit




Re: Another way to fix inherited UPDATE/DELETE

2019-07-07 Thread Amit Langote
On Wed, Jul 3, 2019 at 10:50 PM Tom Lane  wrote:
>
> Amit Langote  writes:
> > On Wed, Feb 20, 2019 at 6:49 AM Tom Lane  wrote:
> >> Obviously this'd be a major rewrite with no chance of making it into v12,
> >> but it doesn't sound too big to get done during v13.
>
> > Are you planning to work on this?
>
> It's on my list, but so are a lot of other things.  If you'd like to
> work on it, feel free.

Thanks for the reply.  Let me see if I can get something done for the
September CF.

Regards,
Amit




Re: [RFC] [PATCH] Flexible "partition pruning" hook

2019-07-07 Thread Thomas Munro
On Sat, Apr 6, 2019 at 3:06 PM Andres Freund  wrote:
> I've moved this to the next CF, and marked it as targeting v13.

Hi Mike,

Commifest 1 for PostgreSQL 13 is here.  I was just going through the
automated build results for the 'fest and noticed that your patch
causes a segfault in the regression tests (possibly due to other
changes that have been made in master since February).  You can see
the complete backtrace on the second link below, but it looks like
this is happening every time, so hopefully not hard to track down
locally.

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46412
https://travis-ci.org/postgresql-cfbot/postgresql/builds/555380617

Thanks,

-- 
Thomas Munro
https://enterprisedb.com




Re: FETCH FIRST clause PERCENT option

2019-07-07 Thread Thomas Munro
On Thu, Jun 27, 2019 at 9:06 PM Surafel Temesgen  wrote:
> The attached patch include the fix for all the comment given

Hi Surafel,

There's a call to adjust_limit_rows_costs() hiding under
contrib/postgres_fdw, so this fails check-world.

-- 
Thomas Munro
https://enterprisedb.com




Re: Optimze usage of immutable functions as relation

2019-07-07 Thread Thomas Munro
On Thu, Mar 21, 2019 at 5:58 AM Alexander Kuzmenkov
 wrote:
> On 11/16/18 22:03, Tom Lane wrote:
> > A possible fix for this is to do eval_const_expressions() on
> > function RTE expressions at this stage (and then not need to
> > do it later), and then pull up only when we find that the
> > RTE expression has been reduced to a single Const.
>
>
> Attached is a patch that does this, and transforms RTE_FUCTION that was
> reduced to a single Const into an RTE_RESULT.

Hi Alexander,

The July Commitfest is here.  Could we please have a rebase of this patch?

Thanks,

-- 
Thomas Munro
https://enterprisedb.com




Re: "WIP: Data at rest encryption" patch and, PostgreSQL 11-beta3

2019-07-07 Thread Thomas Munro
On Sat, Jul 6, 2019 at 2:42 AM Antonin Houska  wrote:
> I've reworked the way key is passed to postmaster (v04-0003-...) so it's
> easier to call the encryption key command interactively, adjusted pg_upgrade
> so that it preserves database, tablespace and relfilenode OIDs (v04-0014-...),
> reorganized the code a bit and split the code into more diffs.

Hi Antonin,

Some robotic feedback:

1.  On Linux, an assertion failed:

ExceptionalCondition (conditionName=conditionName@entry=0x973891
"!(decrypt_p == p)", errorType=errorType@entry=0x928d7d
"FailedAssertion", fileName=fileName@entry=0x973827 "xlogutils.c",
lineNumber=lineNumber@entry=815) at assert.c:54

See full stack here:
https://travis-ci.org/postgresql-cfbot/postgresql/builds/50833

2.  On Windows the build failed:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46469

-- 
Thomas Munro
https://enterprisedb.com




Re: Duplicated LSN in ReorderBuffer

2019-07-07 Thread Thomas Munro
On Wed, Jun 26, 2019 at 2:46 AM Ildar Musin  wrote:
> Attached is a simple patch that uses subxid instead of top-level xid
> in ReorderBufferAddNewTupleCids() call. It seems to fix the bug, but
> i'm not sure that this is a valid change. Can someone please verify it
> or maybe suggest a better solution for the issue?

Hello Ildar,

I hope someone more familiar with this code than me can comment, but
while going through the Commitfest CI results I saw this segfault with
your patch:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555184304

At a glance, HistoricSnapshotGetTupleCids() returned NULL in
HeapTupleSatisfiesHistoricMVCC(), so ResolveCminCmaxDuringDecoding()
blew up.

-- 
Thomas Munro
https://enterprisedb.com




Re: Built-in connection pooler

2019-07-07 Thread Thomas Munro
On Tue, Jul 2, 2019 at 3:11 AM Konstantin Knizhnik
 wrote:
> On 01.07.2019 12:57, Thomas Munro wrote:
> > Interesting work.  No longer applies -- please rebase.
> >
> Rebased version of the patch is attached.
> Also all this version of built-ni proxy can be found in conn_proxy
> branch of https://github.com/postgrespro/postgresql.builtin_pool.git

Thanks Konstantin.  I haven't looked at the code, but I can't help
noticing that this CF entry and the autoprepare one are both features
that come up again an again on feature request lists I've seen.
That's very cool.  They also both need architectural-level review.
With my Commitfest manager hat on: reviewing other stuff would help
with that; if you're looking for something of similar complexity and
also the same level of
everyone-knows-we-need-to-fix-this-!@#$-we-just-don't-know-exactly-how-yet
factor, I hope you get time to provide some more feedback on Takeshi
Ideriha's work on shared caches, which doesn't seem a million miles
from some of the things you're working on.

Could you please fix these compiler warnings so we can see this
running check-world on CI?

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46324
https://travis-ci.org/postgresql-cfbot/postgresql/builds/555180678

-- 
Thomas Munro
https://enterprisedb.com




Re: Bloom index cost model seems to be wrong

2019-07-07 Thread Thomas Munro
On Fri, Mar 1, 2019 at 7:11 AM Jeff Janes  wrote:
> I'm adding it to the commitfest targeting v13.  I'm more interested in 
> feedback on the conceptual issues rather than stylistic ones, as I would 
> probably merge the two functions together before proposing something to 
> actually be committed.

From the trivialities department, this patch shows up as a CI failure
with -Werror, because there is no declaration for
genericcostestimate2().  I realise that's just a temporary name in a
WIP patch anyway so this isn't useful feedback, but for the benefit of
anyone going through CI failures in bulk looking for things to
complain about: this isn't a real one.

-- 
Thomas Munro
https://enterprisedb.com




Re: [HACKERS] Cached plans and statement generalization

2019-07-07 Thread Thomas Munro
On Tue, Jul 2, 2019 at 3:29 AM Konstantin Knizhnik
 wrote:
> Attached please find rebased version of the patch.
> Also this version can be found in autoprepare branch of this repository
> https://github.com/postgrespro/postgresql.builtin_pool.git
> on github.

Thanks.  I haven't looked at the code but this seems like interesting
work and I hope it will get some review.  I guess this is bound to use
a lot of memory.  I guess we'd eventually want to figure out how to
share the autoprepared plan cache between sessions, which is obviously
a whole can of worms.

A couple of trivial comments with my CF manager hat on:

1.  Can you please fix the documentation?  It doesn't build.
Obviously reviewing the goals, design and implementation are more
important than the documentation at this point, but if that is fixed
then the CF bot will be able to run check-world every day and we might
learn something about the code.
2.  Accidental editor junk included:  src/include/catalog/pg_proc.dat.~1~


--
Thomas Munro
https://enterprisedb.com




Re: Extending PostgreSQL with a Domain-Specific Language (DSL) - Development

2019-07-07 Thread Tom Mercha
On 06/07/2019 00:06, Tomas Vondra wrote:
> First of all, it's pretty difficult to follow the discussion when it's
> not clear what's the original message and what's the response. E-mail
> clients generally indent the original message with '>' or someting like
> that, but your client does not do that (which is pretty silly). And
> copying the message at the top does not really help. Please do something
> about that.

I would like to apologise. I did not realize that my client was doing 
that and now I have changed the client. I hope it's fine now.

> 
> On Fri, Jul 05, 2019 at 09:37:03PM +, Tom Mercha wrote:
>>> I might be missing something, but it seems like you intend to replace
>>> the SQL grammar we have with something else. It's not clear to me what
>>> would be the point of doing that, and it definitely looks like a huge
>>> amount of work - e.g. we don't have any support for switching between
>>> two distinct grammars the way you envision, and just that alone seems
>>> like a multi-year project. And if you don't have that capability, all
>>> external tools kinda stop working. Good luck with running such database.
>>
>> I was considering having two distinct grammars as an option - thanks
>> for indicating the effort involved. At the end of the day I want both
>> my DSL and the PostgreSQL grammars to coexist. Is extending
>> PostgreSQL's grammar with my own through the PostgreSQL extension
>> infrastructure worth consideration or is it also difficult to develop?
>> Could you suggest any reference material on this topic?
>>
> 
> Well, I'm not an expert in that area, but we currently don't have any
> infrastructure to support that. It's a topic that was discussed in the
> past (perhaps you can find some references in the archives) and it
> generally boils down to:
> 
> 1) We're using bison as parser generator.
> 2) Bison does not allow adding rules on the fly.
> 
> So you have to modify the in-core src/backend/parser/gram.y and rebuild
> postgres. See for example this discussion:
> 
> https://www.postgresql.org/message-id/flat/CABSN6VeeEhwb0HrjOCp9kHaWm0Ljbnko5y-0NKsT_%3D5i5C2jog%40mail.gmail.com
>  
> 
> 
> When two of the smartest people on the list say it's a hard problem, it
> probably is. Particularly for someone who does not know the internals.
You are right. Thanks for bringing it to my attention!

I didn't design my language for interaction with triggers and whatnot, 
but I think that it would be very interesting to support those as well, 
so looking at CREATE LANGUAGE functionality is actually exciting and 
appropriate once I make some changes in design. Thanks again for this point!

I hope this is not off topic, but I was wondering if you know what the 
intrinsic differences between the HANDLER and INLINE parameters of 
CREATE LANGUAGE are? I know that they are functions which are invoked at 
different times (e.g. one is for handling anonymous code blocks), but at 
the end of the day they seem to have the same purpose?

>>> What I'd look at first is implementing the grammar as a procedural
>>> language (think PL/pgSQL, pl/perl etc.) implementing whatever you
>>> expect from your DSL. And it's not like you'd have to wrap everything
>>> in functions, because we have anonymous DO blocks.
>>
>> Thanks for pointing out this direction! I think I will indeed adopt
>> this approach especially if directly extending PostgreSQL grammar would
>> be difficult.
> 
> Well, it's the only way to deal with it at the moment.
> 
> 
> regards
> 


Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2019-07-07 Thread Thomas Munro
On Mon, Feb 18, 2019 at 7:39 PM Hubert Zhang  wrote:
> Based on the assumption we use smgr as hook position, hook API option1 or 
> option2 which is better?
> Or we could find some balanced API between option1 and option2?
>
> Again comments on other better hook positions are also appreciated!

Hi Hubert,

The July Commitfest is now running, and this entry is in "needs
review" state.  Could you please post a rebased patch?

I have questions about how disk quotas should work and I think we'll
probably eventually want more hooks than these, but simply adding
these hooks so extensions can do whatever they want doesn't seem very
risky for core.  I think it's highly likely that the hook signatures
will have to change in future releases too, but that seems OK for such
detailed internal hooks.  As for your question, my first reaction was
that I preferred your option 1, because SMgrRelation seems quite
private and there are no existing examples of that object being
exposed to extensions.  But on reflection, other callbacks don't take
such a mollycoddling approach.  So my vote is for option 2 "just pass
all the arguments to the callback", which I understand to be the
approach of patch you have posted.

+if (smgrcreate_hook)
+{
+(*smgrcreate_hook)(reln, forknum, isRedo);
+}

Usually we don't use curlies for single line if branches.
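
In other words, something like this -- the typedef is my guess at a
declaration matching the quoted call site, while smgrcreate()'s signature is
the existing one:

	/* hypothetical hook declaration matching the quoted call site */
	typedef void (*smgrcreate_hook_type) (SMgrRelation reln,
										  ForkNumber forknum, bool isRedo);
	smgrcreate_hook_type smgrcreate_hook = NULL;

	void
	smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
	{
		/* project style: no braces around a single-statement if */
		if (smgrcreate_hook)
			(*smgrcreate_hook) (reln, forknum, isRedo);

		/* ... existing smgrcreate() body ... */
	}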

-- 
Thomas Munro
https://enterprisedb.com




Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

2019-07-07 Thread Thomas Munro
On Mon, Jul 8, 2019 at 9:51 AM Thomas Munro  wrote:
> I noticed that check-world passed several times with this patch
> applied, but the most recent CI run failed in multiple-cic:
>
> +error in steps s2i s1i: ERROR:  cache lookup failed for index 26303
>
> https://travis-ci.org/postgresql-cfbot/postgresql/builds/555472214

And in another run, this time on Windows, create_index failed:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46455

-- 
Thomas Munro
https://enterprisedb.com




Re: dropdb --force

2019-07-07 Thread Thomas Munro
On Thu, Jun 27, 2019 at 7:15 AM Pavel Stehule  wrote:
> fixed

Hi Pavel,

FYI t/050_dropdb.pl fails consistently with this patch applied:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555234838


-- 
Thomas Munro
https://enterprisedb.com




Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?

2019-07-07 Thread Thomas Munro
On Wed, Jun 12, 2019 at 1:21 AM Daniel Gustafsson  wrote:
> The patch is still rough around the edges (TODO’s left to mark some areas), 
> but
> I prefer to get some feedback early rather than working too far in potentially
> the wrong direction, so parking this in the CF for now.

Hi Daniel,

Given the above disclaimers the following may be entirely expected,
but just in case you weren't aware:
t/010_logical_decoding_timelines.pl fails with this patch applied.

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555205042

-- 
Thomas Munro
https://enterprisedb.com




Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

2019-07-07 Thread Thomas Munro
On Sun, Jun 30, 2019 at 7:30 PM Goel, Dhruv  wrote:
> > On Jun 10, 2019, at 1:20 PM, Goel, Dhruv  wrote:
> >> On Jun 9, 2019, at 5:33 PM, Tom Lane  wrote:
> >> Andres Freund  writes:
> >>> On June 9, 2019 8:36:37 AM PDT, Tom Lane  wrote:
>  I think you are mistaken that doing transactional updates in pg_index
>  is OK.  If memory serves, we rely on xmin of the pg_index row for
>  purposes such as detecting whether a concurrently-created index is safe
>  to use yet.
> >
> > I took a deeper look regarding this use case but was unable to find more 
> > evidence. As part of this patch, we essentially make concurrently-created 
> > index safe to use only if transaction started after the xmin of Phase 3. 
> > Even today concurrent indexes can not be used for transactions before this 
> > xmin because of the wait (which I am trying to get rid of in this patch), 
> > is there any other denial of service you are talking about? Both the other 
> > states indislive, indisready can be transactional updates as far as I 
> > understand. Is there anything more I am missing here?
>
> I did some more concurrency testing here through some python scripts which 
> compare the end state of the concurrently created indexes. I also back-ported 
> this patch to PG 9.6 and ran some custom concurrency tests (Inserts, Deletes, 
> and Create Index Concurrently) which seem to succeed. The intermediate states 
> unfortunately are not easy to test in an automated manner, but to be fair 
> concurrent indexes could never be used for older transactions. Do you have 
> more inputs/ideas on this patch?

I noticed that check-world passed several times with this patch
applied, but the most recent CI run failed in multiple-cic:

+error in steps s2i s1i: ERROR:  cache lookup failed for index 26303

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555472214

-- 
Thomas Munro
https://enterprisedb.com




Re: SQL/JSON path issues/questions

2019-07-07 Thread Alexander Korotkov
On Thu, Jul 4, 2019 at 4:38 PM Liudmila Mantrova
 wrote:
> Thank  you!
>
> I think we can make this sentence even shorter, the fix is attached:
>
> "To refer to a JSON element stored at a lower nesting level, add one or
> more accessor operators after @."

Thanks, looks good to me.  Attached revision of patch contains commit
message.  I'm going to commit this on no objections.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0004-clarify-jsonpath-docs-5.patch
Description: Binary data


Re: [PATCH] Incremental sort (was: PoC: Partial sort)

2019-07-07 Thread Tomas Vondra

On Sun, Jul 07, 2019 at 09:01:43AM -0400, James Coleman wrote:

On Sun, Jul 7, 2019 at 8:34 AM Tomas Vondra
 wrote:


On Thu, Jul 04, 2019 at 09:29:49AM -0400, James Coleman wrote:
>On Tue, Jun 25, 2019 at 7:22 PM Tomas Vondra
> wrote:
>>
>> On Tue, Jun 25, 2019 at 04:53:40PM -0400, James Coleman wrote:
>> >
>> >Unrelated: if you or someone else you know that's more familiar with
>> >the parallel code, I'd be interested in their looking at the patch at
>> >some point, because I have a suspicion it might not be operating in
>...
>> So I've looked into that, and the reason seems fairly simple - when
>> generating the Gather Merge paths, we only look at paths that are in
>> partial_pathlist. See generate_gather_paths().
>>
>> And we only have sequential + index paths in partial_pathlist, not
>> incremental sort paths.
>>
>> IMHO we can do two things:
>>
>> 1) modify generate_gather_paths to also consider incremental sort for
>> each sorted path, similarly to what create_ordered_paths does
>>
>> 2) modify build_index_paths to also generate an incremental sort path
>> for each index path
>>
>> IMHO (1) is the right choice here, because it automatically does the
>> trick for all other types of ordered paths, not just index scans. So,
>> something like the attached patch, which gives me plans like this:
>...
>> But I'm not going to claim those are total fixes, it's the minimum I
>> needed to do to make this particular type of plan work.
>
>Thanks for looking into this!
>
>I intended to apply this to my most recent version of the patch (just
>sent a few minutes ago), but when I apply it I noticed that the
>partition_aggregate regression tests have several of these failures:
>
>ERROR:  could not find pathkey item to sort
>
>I haven't had time to look into the cause yet, so I decided to wait
>until the next patch revision.
>

I wanted to investigate this today, but I can't reproduce it. How are
you building and running the regression tests?

Attached is a patch adding the incremental sort below gather merge, and
also tweaking the costing. But that's mostly for better planning
decisions; I don't get any pathkey errors even with the first patch.


On 12be7f7f997debe4e05e84b69c03ecf7051b1d79 (the last patch I sent,
which is based on top of 5683b34956b4e8da9dccadc2e3a53b86104ebb33), I
did this:

patch -p1 < ~/Downloads/parallel-incremental-sort.patch
 (FWIW I configure with ./configure
--prefix=$HOME/postgresql-test --enable-cassert --enable-debug
--enable-depend CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer
-DOPTIMIZER_DEBUG")
make check-world

And I get the attached regression failures.



OK, thanks. Apparently it's the costing changes that make it go away, if
I try just the patch that tweaks generate_gather_paths() I see the same
failures. The failure happens during plan construction, so I think the
costing changes simply mean the path with incremental sort ends up not
being the cheapest one (for the problematic queries), but that's just
pure luck - it's definitely an issue that needs fixing.

That error message is triggered in two places in createplan.c, and after
changing them to Assert(false) I get a core dump with this backtrace:

#0  0x702b3328857f in raise () from /lib64/libc.so.6
#1  0x702b33272895 in abort () from /lib64/libc.so.6
#2  0x00a59a9d in ExceptionalCondition (conditionName=0xc52e84 "!(0)", errorType=0xc51f96 
"FailedAssertion", fileName=0xc51fe6 "createplan.c", lineNumber=5937) at assert.c:54
#3  0x007d4ab5 in prepare_sort_from_pathkeys (lefttree=0x2bbbce0, pathkeys=0x2b7a130, relids=0x0, reqColIdx=0x0, adjust_tlist_in_place=false, p_numsortkeys=0x7ffe1abcfd6c, p_sortColIdx=0x7ffe1abcfd60, p_sortOperators=0x7ffe1abcfd58, p_collations=0x7ffe1abcfd50, 
   p_nullsFirst=0x7ffe1abcfd48) at createplan.c:5937

#4  0x007d4e7f in make_incrementalsort_from_pathkeys 
(lefttree=0x2bbbce0, pathkeys=0x2b7a130, relids=0x0, presortedCols=1) at 
createplan.c:6101
#5  0x007cdd3f in create_incrementalsort_plan (root=0x2b787c0, 
best_path=0x2bb92b0, flags=1) at createplan.c:2019
#6  0x007cb7ad in create_plan_recurse (root=0x2b787c0, 
best_path=0x2bb92b0, flags=1) at createplan.c:469
#7  0x007cd778 in create_gather_merge_plan (root=0x2b787c0, 
best_path=0x2bb94a0) at createplan.c:1764
#8  0x007cb8fb in create_plan_recurse (root=0x2b787c0, 
best_path=0x2bb94a0, flags=4) at createplan.c:516
#9  0x007cdf10 in create_agg_plan (root=0x2b787c0, best_path=0x2bb9b28) 
at createplan.c:2115
#10 0x007cb834 in create_plan_recurse (root=0x2b787c0, 
best_path=0x2bb9b28, flags=3) at createplan.c:484
#11 0x007cdc16 in create_sort_plan (root=0x2b787c0, 
best_path=0x2bba1e8, flags=1) at createplan.c:1986
#12 0x007cb78e in create_plan_recurse (root=0x2b787c0, 
best_path=0x2bba1e8, flags=1) at createplan.c:464
#13 0x007cb4ae in create_plan (root=0x2b787c0, best_path=0x2bba1e8) at 
createplan.c:330
#14 0x007db63c in 

Re: Switching PL/Python to Python 3 by default in PostgreSQL 12

2019-07-07 Thread Steven Pousty
The point of the links I sent from the Python community is that they want
Python 2 extinct in the wild as of Jan 1 next year. They are never fixing it,
even for a security vulnerability.

It seems to me we roll out breaking changes with major versions. So yes, if
the user chooses to upgrade to 12 and they haven't migrated their code to
Python 3, it might not work.

I don't have a good answer to no changes except regressions. I do hope,
given how much our users expect us to be secure, that we weigh the
consequences of making our default Python a version which is dead to the
community a month or so after PostgreSQL 12's release. We can certainly take
the stance of leaving the Python version be, but it seems that we should then
come up with a plan in case a security vulnerability is found in Python 2
after Jan 1st 2020.

If Python 2 wasn't our *default* choice then I would be much more
comfortable letting this just pass without mention.

All that aside, I think allowing the admin set the default version of
plpythonu to be an excellent idea.

Thanks
Steve



On Sun, Jul 7, 2019, 8:26 AM Tom Lane  wrote:

> Peter Eisentraut  writes:
> > On 2019-07-07 00:34, Steven Pousty wrote:
> >> Why would it be a 13 or later issue?
>
> > Because PostgreSQL 12 is feature frozen and in beta, and this issue is
> > not a regression.
>
> More to the point: it does not seem to me that we should change what
> "plpythonu" means until Python 2 is effectively extinct in the wild.
> Which is surely some years away yet.  If we change it sooner than
> that, the number of people complaining that we broke perfectly good
> installations will vastly outweigh the number of people who are
> happy because we saved them one keystroke per function definition.
>
> As a possibly relevant comparison, I get the impression that most
> packagers of Python are removing the versionless "python" executable
> name and putting *nothing* in its place.  You have to write python2
> or python3 nowadays.  Individuals might still be setting up symlinks
> so that "python" does what they want, but it's not happening at the
> packaging/distro level.
>
> (This comparison suggests that maybe what we should be thinking
> about is a way to make it easier to change what "plpythonu" means
> at the local-opt-in level.)
>
> regards, tom lane
>


Broken defenses against dropping a partitioning column

2019-07-07 Thread Tom Lane
(Moved from pgsql-bugs thread at [1])

Consider

regression=# create domain d1 as int;
CREATE DOMAIN
regression=# create table t1 (f1 d1) partition by range(f1);
CREATE TABLE
regression=# alter table t1 drop column f1;
ERROR:  cannot drop column named in partition key

So far so good, but that defense has more holes than a hunk of
Swiss cheese:

regression=# drop domain d1 cascade;
psql: NOTICE:  drop cascades to column f1 of table t1
DROP DOMAIN

Of course, the table is now utterly broken, e.g.

regression=# \d t1
psql: ERROR:  cache lookup failed for type 0

(More-likely variants of this include dropping an extension that
defines the type of a partitioning column, or dropping the schema
containing such a type.)

The fix I was speculating about in the pgsql-bugs thread was to add
explicit pg_depend entries making the table's partitioning columns
internally dependent on the whole table (or maybe the other way around;
haven't experimented; see the sketch below).  That fix has a couple of
problems though:

1. In the example, "drop domain d1 cascade" would automatically
cascade to the whole partitioned table, including child partitions
of course.  This might leave a user sad, if a few terabytes of
valuable data went away; though one could argue that they'd better
have paid more attention to what the cascade cascaded to.

2. It doesn't fix anything for pre-existing tables in pre-v12 branches.
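
For concreteness, the catalog side of that idea would presumably look
something like this at table-creation time (a sketch using the existing
dependency APIs from catalog/dependency.h; which way the dependency should
point is exactly what I haven't experimented with):

	/* record that a partition key column internally depends on its table */
	static void
	record_partition_column_dependency(Oid relid, AttrNumber attnum)
	{
		ObjectAddress column;
		ObjectAddress table;

		ObjectAddressSubSet(column, RelationRelationId, relid, attnum);
		ObjectAddressSet(table, RelationRelationId, relid);
		recordDependencyOn(&column, &table, DEPENDENCY_INTERNAL);
	}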


I thought of a different possible approach, which is to move the
"cannot drop column named in partition key" error check from
ATExecDropColumn(), where it is now, to RemoveAttributeById().
That would be back-patchable, but the implication would be that
dropping anything that a partitioning column depends on would be
impossible, even with CASCADE; you'd have to manually drop the
partitioned table first.  Good for data safety, but a horrible
violation of expectations, and likely of the SQL spec as well.
I'm not sure we could avoid order-of-traversal problems, either.


Ideally, perhaps, a DROP CASCADE like this would not cascade to
the whole table but only to the table's partitioned-ness property,
leaving you with a non-partitioned table with most of its data
intact.  It would take a lot of work to make that happen though,
and it certainly wouldn't be back-patchable, and I'm not really
sure it's worth it.

Thoughts?

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/CA%2Bu7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg%40mail.gmail.com




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-07 Thread Peter Eisentraut
On 2019-07-05 22:24, Tomas Vondra wrote:
> What if the granular encryption (not the "whole cluster with a single
> key") case does not encrypt whole blocks, but just tuple data? Would
> that allow at least the most critical WAL use cases (recovery, physical
> replication) to work without having to know all the encryption keys?

Finding the exact point where you divide up sensitive and non-sensitive
data would be difficult.

For example, say, you encrypt the tuple payload but not the tuple
header, so that vacuum would still work.  Then, someone who has access
to the raw data directory could infer in combination with commit
timestamps for example, that on Friday between 5pm and 6pm, 10000
records were updated, 500 were inserted, and 200 were deleted, and that
table has about this size, and this happens every Friday, and so on.
That seems way too much information to reveal for an allegedly encrypted
data directory.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [RFC] Removing "magic" oids

2019-07-07 Thread Noah Misch
On Tue, Nov 20, 2018 at 01:20:04AM -0800, Andres Freund wrote:
> On 2018-11-14 21:02:41 -0800, Andres Freund wrote:
> > On 2018-11-15 04:57:28 +, Noah Misch wrote:
> > > On Wed, Nov 14, 2018 at 12:01:52AM -0800, Andres Freund wrote:
> > > > - one pgbench test tested concurrent insertions into a table with
> > > >   oids, as some sort of stress test for lwlocks and spinlocks. I *think*
> > > >   this doesn't really have to be a system oid column, and this was just
> > > >   because that's how we triggered a bug on some machine. Noah, do I get
> > > >   this right?
> > > 
> > > The point of the test is to exercise OidGenLock by issuing many parallel
> > > GetNewOidWithIndex() and verifying absence of duplicates.  There's nothing
> > > special about OidGenLock, but it is important to use an operation that 
> > > takes a
> > > particular LWLock many times, quickly.  If the test query spends too much 
> > > time
> > > on things other than taking locks, it will catch locking races too rarely.
> > 
> > Sequences ought to do that, too. And if it's borked, we'd hopefully see
> > unique violations. But it's definitely not a 1:1 replacement.

> I've tested this on ppc.  Neither the old version nor the new version
> stress test spinlocks sufficiently to error out with weakened spinlocks
> (not that surprising, there are no spinlocks in any hot path of either
> workload). Both versions very reliably trigger on weakened lwlocks. So I
> think we're comparatively good on that front.

I tested this on xlc, the compiler that motivated the OID test, and the v12+
version of the test didn't catch the bug[1] with xlc 13.1.3.  CREATE TYPE
... AS ENUM generates an OID for each label, so the attached patch makes the
v12+ test have locking behavior similar to its v11 ancestor.

[1] https://postgr.es/m/flat/a72cfcb0-37d0-de2f-b3ec-f38ad8d6a...@postgrespro.ru
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index dc2c72f..3b097a9 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -58,27 +58,20 @@ sub pgbench
return;
 }
 
-# Test concurrent insertion into table with serial column.  This
-# indirectly exercises LWLock and spinlock concurrency.  This test
-# makes a 5-MiB table.
-
-$node->safe_psql('postgres',
-   'CREATE UNLOGGED TABLE insert_tbl (id serial primary key); ');
-
+# Test concurrent OID generation via pg_enum_oid_index.  This indirectly
+# exercises LWLock and spinlock concurrency.
+my $labels = join ',', map { "'l$_'" } 1 .. 1000;
 pgbench(
'--no-vacuum --client=5 --protocol=prepared --transactions=25',
0,
[qr{processed: 125/125}],
[qr{^$}],
-   'concurrent insert workload',
+   'concurrent OID generation',
{
'001_pgbench_concurrent_insert' =>
- 'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
+ "CREATE TYPE pg_temp.e AS ENUM ($labels); DROP TYPE pg_temp.e;"
});
 
-# cleanup
-$node->safe_psql('postgres', 'DROP TABLE insert_tbl;');
-
 # Trigger various connection errors
 pgbench(
'no-such-database',


Re: Switching PL/Python to Python 3 by default in PostgreSQL 12

2019-07-07 Thread Tom Lane
Peter Eisentraut  writes:
> On 2019-07-07 00:34, Steven Pousty wrote:
>> Why would it be a 13 or later issue?

> Because PostgreSQL 12 is feature frozen and in beta, and this issue is
> not a regression.

More to the point: it does not seem to me that we should change what
"plpythonu" means until Python 2 is effectively extinct in the wild.
Which is surely some years away yet.  If we change it sooner than
that, the number of people complaining that we broke perfectly good
installations will vastly outweigh the number of people who are
happy because we saved them one keystroke per function definition.

As a possibly relevant comparison, I get the impression that most
packagers of Python are removing the versionless "python" executable
name and putting *nothing* in its place.  You have to write python2
or python3 nowadays.  Individuals might still be setting up symlinks
so that "python" does what they want, but it's not happening at the
packaging/distro level.

(This comparison suggests that maybe what we should be thinking
about is a way to make it easier to change what "plpythonu" means
at the local-opt-in level.)

regards, tom lane




Re: Switching PL/Python to Python 3 by default in PostgreSQL 12

2019-07-07 Thread Peter Eisentraut
On 2019-07-07 00:34, Steven Pousty wrote:
> Why would it be a 13 or later issue?

Because PostgreSQL 12 is feature frozen and in beta, and this issue is
not a regression.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: (select query)/relation as first class citizen

2019-07-07 Thread Roman Pekar
 Hi,

Yes, I'm thinking about 'query like a view'; 'query like a cursor' is
probably possible even now in MS SQL Server (not sure about PostgreSQL),
but it requires a paradigm shift from set-based thinking to row-by-row
thinking which I'd not want to do.

I completely agree with your points on plan caching and static checks. With
static checks, though, it might be possible if the query were defined as
typed, so that all the column types are known in advance.
In certain cases the possibility of much better decomposition might be more
important than having a cached plan. Not sure how often these cases
appear in general, but personally for me it'd be awesome to have this
possibility.

Regards,
Roman Pekar

On Sun, 7 Jul 2019 at 15:39, Pavel Stehule  wrote:

> Hi
>
> On Sun, Jul 7, 2019 at 2:54 PM Roman Pekar 
> wrote:
>
>> Hello,
>>
>> Just a bit of background - I currently work as a full-time db developer,
>> mostly with Ms Sql server but I like Postgres a lot, especially because I
>> really program in sql all the time and type system / plpgsql language of
>> Postgres seems to me more suitable for actual programming then t-sql.
>>
>> Here's the problem - current structure of the language doesn't allow to
>> decompose the code well and split calculations and data into different
>> modules.
>>
>> For example. Suppose I have a table employee and I have a function like
>> this (I'll skip definition of return types for the sake of simplicity):
>>
>> create function departments_salary ()
>> returns  table (...)
>> as
>> return $$
>> select department, sum(salary) as salary from employee group by
>> department;
>> $$;
>>
>> so that's fine, but what if I want to run this function on filtered
>> employee? I can adjust the function of course, but it implies I can predict
>> all possible filters I'm going to need in the future.
>> And logically, function itself doesn't have to be run on employee table,
>> anything with department and salary columns will fit.
>> So it'd be nice to be able to define the function like this:
>>
>> create function departments_salary(_employee query)
>> returns table (...)
>> as
>> return $$
>> select department, sum(salary) as salary from _employee group by
>> department;
>> $$;
>>
>> and then call it like this:
>>
>> declare _employee query;
>> ...
>> _poor_employee = (select salary, department from employee where salary <
>> 1000);
>> select * from  departments_salary( _poor_employee);
>>
>> And just to be clear, the query is not really invoked until the last
>> line, so re-assigning _employee variable is more like building query
>> expression.
>>
>> As far as I understand the closest way to do this is to put the data into
>> temporary table and use this temporary table inside of the function. It's
>> not exactly the same of course, cause in case of temporary tables data
>> should be transferred to temporary table, while it will might be filtered
>> later. So it's something like array vs generator in python, or List vs
>> IQueryable in C#.
>>
>> Adding this functionality will allow much better decomposition of the
>> program's logic.
>> What do you think about the idea itself? If you think the idea is worthy,
>> is it even possible to implement it?
>>
>
> If we talk about plpgsql, then I afraid so this idea can disallow plan
> caching - or significantly increase the cost of plan cache.
>
> There are two possibilities of implementation - a) query like cursor -
> unfortunately it effectively disables any optimization and it carry ORM
> performance to procedures. This usage is known performance antipattern, b)
> query like view - it should not to have a performance problems with late
> optimization, but I am not sure about possibility to reuse execution plans.
>
> Currently PLpgSQL is compromise between performance and dynamic (PLpgSQL
> is really static language). Your proposal increase much more dynamic
> behave, but performance can be much more worse.
>
> More - with this behave, there is not possible to do static check - so you
> have to find bugs only at runtime. I afraid about performance of this
> solution.
>
> Regards
>
> Pavel
>
>
>
>> Regards,
>> Roman Pekar
>>
>>
>>


Re: Use relative rpath if possible

2019-07-07 Thread Tom Lane
I wrote:
> Peter Eisentraut  writes:
>> rebased patch attached, no functionality changes

> I poked at this a bit, and soon found that it fails check-world,
> because the isolationtester binary is built with an rpath that
> only works if it's part of the temp install tree, which it ain't.

Oh ... just thought of another issue in the same vein: what about
modules being built out-of-tree with pgxs?  (I'm imagining something
with a libpq.so dependency, like postgres_fdw.)  We probably really
have to keep using the absolute rpath for that, because not only
would such modules certainly fail "make check" with a relative
rpath, but it's not really certain that they're intended to get
installed into the same installdir as the core libraries.
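
To make the failure mode concrete, a rough sketch - assuming the usual ELF
$ORIGIN mechanism, not necessarily the patch's exact flags:

# absolute rpath (status quo): the library is looked up at a fixed path,
# independent of where the binary itself lives
gcc -o isolationtester ... -Wl,-rpath,'/usr/local/pgsql/lib'

# relative rpath: looked up relative to the binary's own location, so the
# binary must sit in the expected place within an install tree - which a
# test binary (or an out-of-tree pgxs module needing libpq) may not
gcc -o isolationtester ... -Wl,-rpath,'$ORIGIN/../lib'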

regards, tom lane




Re: (select query)/relation as first class citizen

2019-07-07 Thread Pavel Stehule
Hi

On Sun, 7 Jul 2019 at 14:54, Roman Pekar  wrote:

> Hello,
>
> Just a bit of background - I currently work as a full-time db developer,
> mostly with MS SQL Server, but I like Postgres a lot, especially because I
> really program in SQL all the time, and the type system / plpgsql language
> of Postgres seems to me more suitable for actual programming than T-SQL.
>
> Here's the problem - the current structure of the language doesn't allow
> us to decompose the code well and split calculations and data into
> different modules.
>
> For example. Suppose I have a table employee and I have a function like
> this (I'll skip definition of return types for the sake of simplicity):
>
> create function departments_salary ()
> returns  table (...)
> as
> return $$
> select department, sum(salary) as salary from employee group by
> department;
> $$;
>
> So that's fine, but what if I want to run this function on a filtered
> employee table? I can adjust the function, of course, but that implies I
> can predict all possible filters I'm going to need in the future.
> And logically, the function itself doesn't have to run on the employee
> table; anything with department and salary columns will fit.
> So it'd be nice to be able to define the function like this:
>
> create function departments_salary(_employee query)
> returns table (...)
> as
> return $$
> select department, sum(salary) as salary from _employee group by
> department;
> $$;
>
> and then call it like this:
>
> declare _employee query;
> ...
> _poor_employee = (select salary, department from employee where salary <
> 1000);
> select * from  departments_salary( _poor_employee);
>
> And just to be clear, the query is not really invoked until the last line,
> so re-assigning the _employee variable is more like building a query
> expression.
>
> As far as I understand, the closest way to do this is to put the data into
> a temporary table and use this temporary table inside the function. It's
> not exactly the same, of course, because in the case of temporary tables
> the data has to be transferred to the temporary table even though it might
> be filtered later. So it's something like array vs generator in Python, or
> List vs IQueryable in C#.
>
> Adding this functionality would allow much better decomposition of the
> program's logic.
> What do you think about the idea itself? If you think the idea is
> worthwhile, is it even possible to implement it?
>

If we talk about plpgsql, then I'm afraid this idea could disallow plan
caching - or significantly increase the cost of the plan cache.

There are two possible implementations: a) query like a cursor -
unfortunately this effectively disables any optimization and brings
ORM-style performance into procedures; this usage is a known performance
antipattern; b) query like a view - this should not have performance
problems with late optimization, but I am not sure about the possibility of
reusing execution plans.
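
A rough sketch of the two options, over the employee table from the quoted
mail (illustrative code, not from the original message):

-- (a) query like a cursor: rows are fetched and processed one at a time,
-- so the aggregate cannot be planned as a single set operation
do $$
declare
    r record;
    total numeric := 0;
begin
    for r in select salary from employee where salary < 1000 loop
        total := total + r.salary;
    end loop;
    raise notice 'total: %', total;
end
$$;

-- (b) query like a view: the definition is expanded into the calling
-- query, so the planner optimizes the filter and the aggregate together
create view poor_employee as
    select department, salary from employee where salary < 1000;

select department, sum(salary) from poor_employee group by department;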

Currently PLpgSQL is a compromise between performance and dynamism (PLpgSQL
is really a static language). Your proposal makes it much more dynamic, but
performance could be much worse.
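
For comparison, the fully dynamic escape hatch that exists today is
EXECUTE, which pays exactly this price: the statement is planned again on
every execution. A sketch (illustrative names; _filter is assumed trusted
here, for illustration only):

create function departments_salary_dyn(_filter text)
returns table (department text, salary numeric)
language plpgsql as $$
begin
    -- EXECUTE builds and plans the statement at run time, so no plan
    -- is cached across calls
    return query execute format(
        'select department, sum(salary) from employee where %s group by department',
        _filter);
end
$$;

select * from departments_salary_dyn('salary < 1000');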

Moreover, with this behaviour it is not possible to do static checks, so
you would find bugs only at runtime. I'm afraid about the performance of
this solution.

Regards

Pavel



> Regards,
> Roman Pekar
>
>
>


Re: [PATCH] Incremental sort (was: PoC: Partial sort)

2019-07-07 Thread James Coleman
On Sun, Jul 7, 2019 at 8:34 AM Tomas Vondra
 wrote:
>
> On Thu, Jul 04, 2019 at 09:29:49AM -0400, James Coleman wrote:
> >On Tue, Jun 25, 2019 at 7:22 PM Tomas Vondra
> > wrote:
> >>
> >> On Tue, Jun 25, 2019 at 04:53:40PM -0400, James Coleman wrote:
> >> >
> >> >Unrelated: if you or someone else you know that's more familiar with
> >> >the parallel code, I'd be interested in their looking at the patch at
> >> >some point, because I have a suspicion it might not be operating in
> >...
> >> So I've looked into that, and the reason seems fairly simple - when
> >> generating the Gather Merge paths, we only look at paths that are in
> >> partial_pathlist. See generate_gather_paths().
> >>
> >> And we only have sequential + index paths in partial_pathlist, not
> >> incremental sort paths.
> >>
> >> IMHO we can do two things:
> >>
> >> 1) modify generate_gather_paths to also consider incremental sort for
> >> each sorted path, similarly to what create_ordered_paths does
> >>
> >> 2) modify build_index_paths to also generate an incremental sort path
> >> for each index path
> >>
> >> IMHO (1) is the right choice here, because it automatically does the
> >> trick for all other types of ordered paths, not just index scans. So,
> >> something like the attached patch, which gives me plans like this:
> >...
> >> But I'm not going to claim those are total fixes, it's the minimum I
> >> needed to do to make this particular type of plan work.
> >
> >Thanks for looking into this!
> >
> >I intended to apply this to my most recent version of the patch (just
> >sent a few minutes ago), but when I apply it I noticed that the
> >partition_aggregate regression tests have several of these failures:
> >
> >ERROR:  could not find pathkey item to sort
> >
> >I haven't had time to look into the cause yet, so I decided to wait
> >until the next patch revision.
> >
>
> I wanted to investigate this today, but I can't reproduce it. How are
> you building and running the regression tests?
>
> Attached is a patch adding the incremental sort below gather merge, and
> also tweaking the costing. But that's mostly for better planning
> decisions; I don't get any pathkey errors even with the first patch.

On 12be7f7f997debe4e05e84b69c03ecf7051b1d79 (the last patch I sent,
which is based on top of 5683b34956b4e8da9dccadc2e3a53b86104ebb33), I
did this:

patch -p1 < ~/Downloads/parallel-incremental-sort.patch
(FWIW I configure with ./configure
--prefix=$HOME/postgresql-test --enable-cassert --enable-debug
--enable-depend CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer
-DOPTIMIZER_DEBUG")
make check-world

And I get the attached regression failures.

James Coleman


regression.diffs
Description: Binary data


regression.out
Description: Binary data


(select query)/relation as first class citizen

2019-07-07 Thread Roman Pekar
Hello,

Just a bit of background - I currently work as a full-time db developer,
mostly with MS SQL Server, but I like Postgres a lot, especially because I
really program in SQL all the time, and the type system / plpgsql language
of Postgres seems to me more suitable for actual programming than T-SQL.

Here's the problem - the current structure of the language doesn't allow
us to decompose the code well and split calculations and data into
different modules.

For example. Suppose I have a table employee and I have a function like
this (I'll skip definition of return types for the sake of simplicity):

create function departments_salary ()
returns  table (...)
as
return $$
select department, sum(salary) as salary from employee group by
department;
$$;

So that's fine, but what if I want to run this function on a filtered
employee table? I can adjust the function, of course, but that implies I
can predict all possible filters I'm going to need in the future.
And logically, the function itself doesn't have to run on the employee
table; anything with department and salary columns will fit.
So it'd be nice to be able to define the function like this:

create function departments_salary(_employee query)
returns table (...)
as
return $$
select department, sum(salary) as salary from _employee group by
department;
$$;

and then call it like this:

declare _employee query;
...
_poor_employee = (select salary, department from employee where salary <
1000);
select * from  departments_salary( _poor_employee);

And just to be clear, the query is not really invoked until the last line,
so re-assigning the _employee variable is more like building a query
expression.

As far as I understand, the closest way to do this is to put the data into
a temporary table and use this temporary table inside the function. It's
not exactly the same, of course, because in the case of temporary tables
the data has to be transferred to the temporary table even though it might
be filtered later. So it's something like array vs generator in Python, or
List vs IQueryable in C#.
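
A minimal sketch of that workaround, assuming the employee table above -
the filter runs up front, and its result is materialized before the
function ever sees it:

create temp table poor_employee as
    select salary, department from employee where salary < 1000;

create function departments_salary_tmp()
returns table (department text, salary numeric)
language sql as $$
    select department, sum(salary) from poor_employee group by department;
$$;

select * from departments_salary_tmp();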

Adding this functionality would allow much better decomposition of the
program's logic.
What do you think about the idea itself? If you think the idea is
worthwhile, is it even possible to implement it?

Regards,
Roman Pekar


Re: [PATCH] Incremental sort (was: PoC: Partial sort)

2019-07-07 Thread Tomas Vondra

On Thu, Jul 04, 2019 at 09:29:49AM -0400, James Coleman wrote:
>On Tue, Jun 25, 2019 at 7:22 PM Tomas Vondra
> wrote:
>>
>>On Tue, Jun 25, 2019 at 04:53:40PM -0400, James Coleman wrote:
>>>
>>>Unrelated: if you or someone else you know that's more familiar with
>>>the parallel code, I'd be interested in their looking at the patch at
>>>some point, because I have a suspicion it might not be operating in
>...
>>So I've looked into that, and the reason seems fairly simple - when
>>generating the Gather Merge paths, we only look at paths that are in
>>partial_pathlist. See generate_gather_paths().
>>
>>And we only have sequential + index paths in partial_pathlist, not
>>incremental sort paths.
>>
>>IMHO we can do two things:
>>
>>1) modify generate_gather_paths to also consider incremental sort for
>>each sorted path, similarly to what create_ordered_paths does
>>
>>2) modify build_index_paths to also generate an incremental sort path
>>for each index path
>>
>>IMHO (1) is the right choice here, because it automatically does the
>>trick for all other types of ordered paths, not just index scans. So,
>>something like the attached patch, which gives me plans like this:
>>
>>...
>>
>>But I'm not going to claim those are total fixes, it's the minimum I
>>needed to do to make this particular type of plan work.
>
>Thanks for looking into this!
>
>I intended to apply this to my most recent version of the patch (just
>sent a few minutes ago), but when I apply it I noticed that the
>partition_aggregate regression tests have several of these failures:
>
>ERROR:  could not find pathkey item to sort
>
>I haven't had time to look into the cause yet, so I decided to wait
>until the next patch revision.
>

I wanted to investigate this today, but I can't reproduce it. How are
you building and running the regression tests?

Attached is a patch adding the incremental sort below gather merge, and
also tweaking the costing. But that's mostly for better planning
decisions; I don't get any pathkey errors even with the first patch.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 3efc807164..d7bf33f64d 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2719,6 +2719,8 @@ generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows)
 	{
 		Path	   *subpath = (Path *) lfirst(lc);
 		GatherMergePath *path;
+		bool		is_sorted;
+		int			presorted_keys;
 
 		if (subpath->pathkeys == NIL)
 			continue;
@@ -2727,6 +2729,26 @@ generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows)
 		path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
 										subpath->pathkeys, NULL, rowsp);
 		add_path(rel, &path->path);
+
+		/* consider incremental sort */
+		is_sorted = pathkeys_common_contained_in(root->sort_pathkeys,
+												 subpath->pathkeys,
+												 &presorted_keys);
+
+		if (!is_sorted && (presorted_keys > 0))
+		{
+			/* Also consider incremental sort. */
+			subpath = (Path *) create_incremental_sort_path(root,
+															rel,
+															subpath,
+															root->sort_pathkeys,
+															presorted_keys,
+															-1);
+
+			path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
+											subpath->pathkeys, NULL, rowsp);
+
+			add_path(rel, &path->path);
+		}
 	}
 }
 
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 7f820e7351..c6aa17ba67 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1875,16 +1875,8 @@ cost_incremental_sort(Path *path,
 								  limit_tuples);
 
 	/* If we have a LIMIT, adjust the number of groups we'll have to return. */
-	if (limit_tuples > 0 && limit_tuples < input_tuples)
-	{
-