Re: Inlining of couple of functions in pl_exec.c improves performance

2020-05-29 Thread Amit Khandekar
On Thu, 28 May 2020 at 14:39, Pavel Stehule  wrote:
> I don't see 17% anywhere, but 3-5% is not bad.
Did you see 3-5% only for the pi function, or did you see the same
improvement also for the functions that I wrote? I was getting a
consistent result of 14-18% on both of the VMs. Also, is your test
machine running on Windows? All the machines I tested were running
Linux (Ubuntu).

Below are my results for your pi_est_1() function. For this function,
I am consistently getting a 5-9% improvement. I tested on 3 machines:

gcc: 8.4.0, with -O2
OS: Ubuntu Bionic

explain analyze select pi_est_1(1000)

1. x86_64 laptop VM (Intel Core i7-8665U)
HEAD    : 2666 2617 2600 2630
Patched : 2502 2409 2460 2444


2. x86_64 VM (Xeon Gold 6151)
HEAD    : 1664 1662 1721 1660
Patched : 1541 1548 1537 1526

3. ARM64 VM (Kunpeng)
HEAD    : 2873 2864 2860 2861
Patched : 2568 2513 2501 2538
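
For anyone who wants to reproduce this kind of test, here is a purely
illustrative PL/pgSQL workload of the same flavor (the real pi_est_1() from
this thread may differ); it spends essentially all of its time in the
PL/pgSQL interpreter, which is what the pl_exec.c inlining targets:

-- illustrative only; not the pi_est_1() used for the numbers above
CREATE OR REPLACE FUNCTION pi_est_sketch(n int) RETURNS float8 AS $$
DECLARE
    s   float8 := 0;
    sgn float8 := 1;
BEGIN
    FOR i IN 0 .. n - 1 LOOP
        s := s + sgn / (2 * i + 1);   -- Leibniz series for pi/4
        sgn := -sgn;
    END LOOP;
    RETURN 4 * s;
END;
$$ LANGUAGE plpgsql;

EXPLAIN ANALYZE SELECT pi_est_sketch(10000000);   -- adjust the count to taste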


>
> patch 0001 has sense and can help with code structure
> patch 0002 it is little bit against simplicity, but for PLpgSQL with blocks 
> structure it is correct.

Here, I moved the exec_stmt code into the exec_stmts() function because
exec_stmts() was its only caller, and that function is not that big. I
am assuming you were referring to this point when you said it is a bit
against simplicity. But I didn't get what you implied by "but for
PLpgSQL with blocks structure it is correct".

-- 
Thanks,
-Amit Khandekar
Huawei Technologies




Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread Justin Pryzby
On Fri, May 29, 2020 at 09:53:14PM -0400, John Bachir wrote:
> Hi Sergei - I just used the recipe on my production database. I didn't
> observe all the expected benefits, I wonder if there were confounding factors
> or if I did something wrong. If you have time, I'd love to get your feedback.
> Let me know if you need more info. I'd love to write a blog post informing
> the world about this potentially game-changing feature!

If you do it right, you can see a DEBUG:

postgres=# CREATE TABLE tn (i int);
postgres=# ALTER TABLE tn ADD CONSTRAINT nn CHECK (i IS NOT NULL) NOT VALID;
postgres=# ALTER TABLE tn VALIDATE CONSTRAINT nn;
postgres=# SET client_min_messages=debug;
postgres=# ALTER TABLE tn ALTER i SET NOT NULL ;
DEBUG:  existing constraints on column "tn"."i" are sufficient to prove that it 
does not contain nulls

> SLOW (table scan speed) - didn't have timing on, but I think about same time 
> as the next one.
> ALTER TABLE my_table ALTER COLUMN column1 SET NOT NULL;
> 
> 01:39 SLOW (table scan speed)
> ALTER TABLE my_table ALTER COLUMN column2 SET NOT NULL;
> 
> 00:22 - 1/4 time of table scan but still not instant like expected
> ALTER TABLE my_table ALTER COLUMN column3 SET NOT NULL;
> 
> 20.403 ms - instant, like expected
> ALTER TABLE my_table ALTER COLUMN column4 SET NOT NULL;

Could the fact that the duration decreased every time have been due to caching?
How big is the table vs RAM?
Do you know if the SET NOT NULL blocked or not?

Maybe something else had a nontrivial lock on the table, and those commands
were waiting on a lock.  If you "SET deadlock_timeout='1'; SET
log_lock_waits=on;", then you could see that.

-- 
Justin




Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread John Bachir
Hi Sergei - I just used the recipe on my production database. I didn't observe 
all the expected benefits, I wonder if there were confounding factors or if I 
did something wrong. If you have time, I'd love to get your feedback. Let me 
know if you need more info. I'd love to write a blog post informing the world 
about this potentially game-changing feature!

Here are the commands I did, with some notes. All the columns are boolean. The 
table has about 8,600,000 rows.

This (blocking operation) was not fast, perhaps 60-100 seconds. Maybe running
them individually would have been proportionally faster, but even then not
near-instant as expected. Or maybe running them together had some sort of
aggregate negative effect, so running them individually would have been
instant? I don't have much experience with such constraints.

ALTER TABLE my_table
  ADD CONSTRAINT my_table_column1_not_null CHECK (column1 IS NOT NULL) NOT VALID,
  ADD CONSTRAINT my_table_column2_not_null CHECK (column2 IS NOT NULL) NOT VALID,
  ADD CONSTRAINT my_table_column3_not_null CHECK (column3 IS NOT NULL) NOT VALID,
  ADD CONSTRAINT my_table_column4_not_null CHECK (column4 IS NOT NULL) NOT VALID;


As expected, these took as long as a table scan, and as expected they did not
block.

ALTER TABLE my_table validate CONSTRAINT my_table_column1_not_null;
ALTER TABLE my_table validate CONSTRAINT my_table_column2_not_null;
ALTER TABLE my_table validate CONSTRAINT my_table_column3_not_null;
ALTER TABLE my_table validate CONSTRAINT my_table_column4_not_null;


SLOW (table scan speed) - didn't have timing on, but I think about same time as 
the next one.
ALTER TABLE my_table ALTER COLUMN column1 SET NOT NULL;

01:39 SLOW (table scan speed)
ALTER TABLE my_table ALTER COLUMN column2 SET NOT NULL;

00:22 - 1/4 time of table scan but still not instant like expected
ALTER TABLE my_table ALTER COLUMN column3 SET NOT NULL;

20.403 ms - instant, like expected
ALTER TABLE my_table ALTER COLUMN column4 SET NOT NULL;


all < 100ms
ALTER TABLE my_table DROP CONSTRAINT my_table_column1_not_null;
ALTER TABLE my_table DROP CONSTRAINT my_table_column2_not_null;
ALTER TABLE my_table DROP CONSTRAINT my_table_column3_not_null;
ALTER TABLE my_table DROP CONSTRAINT my_table_column4_not_null;




Re: [PATCH] Fix install-tests target for vpath builds

2020-05-29 Thread Alvaro Herrera
On 2020-May-29, Andrew Dunstan wrote:

> I've come up with a slightly nicer version of your patch 1, which I
> propose to commit and backpatch before long.

Looks good to me.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] Fix install-tests target for vpath builds

2020-05-29 Thread Andrew Dunstan

On 4/23/20 12:55 AM, Craig Ringer wrote:
> Patch 0001 fixes this issue with vpath postgres builds:
>
> $ make -C src/test/regress install-tests
> /usr/bin/install: cannot create regular file
> 'PGPREFIX/lib/postgresql/regress/PGPREFIX/src/test/regress/expected/errors.out':
> No such file or directory
> make: *** [GNUmakefile:90: install-tests] Error 1
>
> (where PGPREFIX is your --prefix)
>
> It also makes the install-tests target a toplevel target for convenience.
>
> Three related bonus patches are attached in case anyone thinks they're
> a good idea:
>
> - 0002 changes the install location of src/test/regress's
> install-tests output files (sql/, expected/ etc) to
> $(pkglibdir)/pgxs/src/test/regress so that PGXS resolves it as
> $(top_srcdir)/src/test/regress, same as for in-tree builds. Presently
> it installs in $(pkglibdir)/regress/ for some reason. This patch
> applies on top of 0001. It will affect packagers.
>
> - 0003 makes the toplevel install-tests target also install
> src/test/isolation test resources and the test modules. This patch
> applies on top of either 0001 or 0002, depending on whether you want
> to include 0002.
>
> - 0004 makes the dummy 'check' target in pgxs.mk 
> optional for extensions that define the new PGXS
> variable NO_DUMMY_CHECK_TARGET . This lets extensions that want to
> define a 'check' target do so without having make complain at them
> about redefined targets. This patch is independent of the others and
> can apply on master directly.
>
>
>


I've come up with a slightly nicer version of your patch 1, which I
propose to commit and backpatch before long.


I'll leave the others for another day. Let's revisit after we get
through the release.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/GNUmakefile.in b/GNUmakefile.in
index 6d1b5a06c5..afdfd9f0a6 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -72,6 +72,7 @@ $(call recurse,check-world,src/test src/pl src/interfaces/ecpg contrib src/bin,c
 $(call recurse,checkprep,  src/test src/pl src/interfaces/ecpg contrib src/bin)
 
 $(call recurse,installcheck-world,src/test src/pl src/interfaces/ecpg contrib src/bin,installcheck)
+$(call recurse,install-tests,src/test/regress,install-tests)
 
 GNUmakefile: GNUmakefile.in $(top_builddir)/config.status
 	./config.status $@
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index 1a3164065f..c830627b00 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -87,8 +87,8 @@ regress_data_files = \
 
 install-tests: all install install-lib installdirs-tests
 	$(MAKE) -C $(top_builddir)/contrib/spi install
-	for file in $(regress_data_files); do \
-	  $(INSTALL_DATA) $$file '$(DESTDIR)$(pkglibdir)/regress/'$$file || exit; \
+	for file in $(subst $(srcdir)/,,$(regress_data_files)); do \
+		$(INSTALL_DATA) $(srcdir)/$$file '$(DESTDIR)$(pkglibdir)/regress/'$$file || exit; \
 	done
 
 installdirs-tests: installdirs


Re: Read access for pg_monitor to pg_replication_origin_status view

2020-05-29 Thread Martín Marqués
Hi,

> While working on some monitoring tasks I realised that the pg_monitor
> role doesn't have access to the pg_replication_origin_status.
>
> Are there any strong thoughts on not giving pg_monitor access to this
> view, or is it just something that nobody asked for yet? I can't find
> any reason for pg_monitor not to have access to it.

Looking further into this, I can see that the requirement of a
superuser to access/modify the replication origins is hardcoded in
backend/replication/logical/origin.c, so it's not a question of
GRANTing access to the pg_monitor role.

```
static void
replorigin_check_prerequisites(bool check_slots, bool recoveryOK)
{
	if (!superuser())
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("only superusers can query or manipulate replication origins")));

	if (check_slots && max_replication_slots == 0)
		ereport(ERROR,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("cannot query or manipulate replication origin when max_replication_slots = 0")));

	if (!recoveryOK && RecoveryInProgress())
		ereport(ERROR,
				(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
				 errmsg("cannot manipulate replication origins during recovery")));
}
```

I believe we could skip the superuser() check for cases like
pg_replication_origin_session_progress() and
pg_replication_origin_progress().

One option could be to add a third bool argument, check_superuser, to
replorigin_check_prerequisites() and have it set to false for the
functions which a non-superuser could execute.
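
To illustrate the end goal, a monitoring role would then be able to do
something like the following (the GRANT below is an assumption about how the
remaining privileges would be handled; it is not part of the attached patch):

GRANT pg_monitor TO metrics_user;

SET ROLE metrics_user;
SELECT local_id, external_id, remote_lsn, local_lsn
  FROM pg_replication_origin_status;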

Patch attached


-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From ade8b9a70afed900dddc4e93616499bb80966592 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mart=C3=ADn=20Marqu=C3=A9s?=
 
Date: Fri, 29 May 2020 15:45:27 -0300
Subject: [PATCH v1] First version of patch to give permission to
 pg_replication_origin_status()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Martín Marqués 
---
 src/backend/replication/logical/origin.c | 28 
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 981d60f135d..82698617236 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -180,9 +180,9 @@ static ReplicationState *session_replication_state = NULL;
 #define REPLICATION_STATE_MAGIC ((uint32) 0x1257DADE)
 
 static void
-replorigin_check_prerequisites(bool check_slots, bool recoveryOK)
+replorigin_check_prerequisites(bool check_slots, bool recoveryOK, bool check_superuser)
 {
-	if (!superuser())
+	if (check_superuser && !superuser())
 		ereport(ERROR,
 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
  errmsg("only superusers can query or manipulate replication origins")));
@@ -1225,7 +1225,7 @@ pg_replication_origin_create(PG_FUNCTION_ARGS)
 	char	   *name;
 	RepOriginId roident;
 
-	replorigin_check_prerequisites(false, false);
+	replorigin_check_prerequisites(false, false, true);
 
 	name = text_to_cstring((text *) DatumGetPointer(PG_GETARG_DATUM(0)));
 
@@ -1262,7 +1262,7 @@ pg_replication_origin_drop(PG_FUNCTION_ARGS)
 	char	   *name;
 	RepOriginId roident;
 
-	replorigin_check_prerequisites(false, false);
+	replorigin_check_prerequisites(false, false, true);
 
 	name = text_to_cstring((text *) DatumGetPointer(PG_GETARG_DATUM(0)));
 
@@ -1285,7 +1285,7 @@ pg_replication_origin_oid(PG_FUNCTION_ARGS)
 	char	   *name;
 	RepOriginId roident;
 
-	replorigin_check_prerequisites(false, false);
+	replorigin_check_prerequisites(false, false, true);
 
 	name = text_to_cstring((text *) DatumGetPointer(PG_GETARG_DATUM(0)));
 	roident = replorigin_by_name(name, true);
@@ -1306,7 +1306,7 @@ pg_replication_origin_session_setup(PG_FUNCTION_ARGS)
 	char	   *name;
 	RepOriginId origin;
 
-	replorigin_check_prerequisites(true, false);
+	replorigin_check_prerequisites(true, false, true);
 
 	name = text_to_cstring((text *) DatumGetPointer(PG_GETARG_DATUM(0)));
 	origin = replorigin_by_name(name, false);
@@ -1325,7 +1325,7 @@ pg_replication_origin_session_setup(PG_FUNCTION_ARGS)
 Datum
 pg_replication_origin_session_reset(PG_FUNCTION_ARGS)
 {
-	replorigin_check_prerequisites(true, false);
+	replorigin_check_prerequisites(true, false, true);
 
 	replorigin_session_reset();
 
@@ -1342,7 +1342,7 @@ pg_replication_origin_session_reset(PG_FUNCTION_ARGS)
 Datum
 pg_replication_origin_session_is_setup(PG_FUNCTION_ARGS)
 {
-	replorigin_check_prerequisites(false, false);
+	replorigin_check_prerequisites(false, false, true);
 
 	PG_RETURN_BOOL(replorigin_session_origin != InvalidRepOriginId);
 }
@@ -1361,7 +1361,7 @@ pg_replication_origin_session_progress(PG_FUNCTION_ARGS)
 	XLogRecPtr	remote_lsn = InvalidXLogRecPtr;
 	bool		

Re: Trouble with hashagg spill I/O pattern and costing

2020-05-29 Thread Jeff Davis
On Fri, 2020-05-29 at 15:04 +0200, Tomas Vondra wrote:
> Ah, right. Yeah, we only need to check for AGG_HASH here. Moreover,
> AGG_MIXED probably does not need the tlist tweak, because the input
> should be pre-sorted as with AGG_SORTED.
> 
> And we should probably do a similar check in
> create_groupingsets_path,
> I guess. At first I thought we can't do that before inspecting
> rollups,
> which only happens later in the function, but now I see there's
> aggstrategy in GroupingSetsPath too.

Looks good.

Jeff






Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 29, 2020 at 3:26 PM Tom Lane  wrote:
>> One missing part of that is that we'd need to support bind parameters
>> for utility statements, eg new password in ALTER USER.  That's been
>> on the wish list for a long time anyway, of course.  I think it's
>> mostly a matter of lack of round tuits, rather than any fundamental
>> problem.  (Parameters in transaction control statements might have
>> fundamental problems, but we can just dismiss that as out of scope.)

> I might be wrong, but isn't this, like, a ton of work?

I'm not sure how much work, but yeah, there'd be work to do.

I don't think there are all that many places where we really have
just a string literal (of course, ALTER USER PASSWORD is a poster
child exception).  I think a large fraction of the interesting cases are
places where there's some amount of expression eval capability already,
eg parameters in EXECUTE.  Syntactically you can already do

execute foo(1, $1 + 2);

and the only part of that that doesn't work is passing in a parameter.

regards, tom lane




Re: Default gucs for EXPLAIN

2020-05-29 Thread Robert Haas
On Tue, May 26, 2020 at 7:30 AM David Rowley  wrote:
> I'm against adding GUCs to control what EXPLAIN does by default.
>
> A few current GUCs come to mind which give external control over a
> command's behaviour:
>
> standard_conforming_strings
> backslash_quote
> bytea_output
>
> It's pretty difficult for application authors to write code that will
> just work due to these GUCs. We end up with GUCs like
> escape_string_warning to try and help application authors find areas
> which may be problematic.

I agree with this concern, as well as with what David says later,
namely that the concern is less here than in some other cases but
still not zero.

I do think the idea of changing the default for BUFFERS from OFF to ON
is a pretty good one, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Chapman Flack
On 05/29/20 15:26, Tom Lane wrote:

> all of the simpler cases I can think of: aside from the ALTER USER
> PASSWORD case, there's INSERT INTO accounts(..., creditcardnumber,
> ...) VALUES(..., $n, ...).  Neither one of those has a nearby UDF
> to control it with.

I was thinking incrementally ... something about UDFs only might
be quickish to do as a PoC. And it is already useful, because if exposure
of a particular thing bothers you enough, you can make a UDF or procedure to
control it with.

But ultimately, if ALTER USER PASSWORD has sensitivity of
its parameter hardcoded in, and CREATE TABLE ACCOUNTS can declare
creditcardnumber SENSITIVE, then maybe those bits go out to the client
in the parameter Describe message, and come back in the Bind message,
without the user even necessarily thinking about it.

Regards,
-Chap




Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Robert Haas
On Fri, May 29, 2020 at 3:26 PM Tom Lane  wrote:
> One missing part of that is that we'd need to support bind parameters
> for utility statements, eg new password in ALTER USER.  That's been
> on the wish list for a long time anyway, of course.  I think it's
> mostly a matter of lack of round tuits, rather than any fundamental
> problem.  (Parameters in transaction control statements might have
> fundamental problems, but we can just dismiss that as out of scope.)

I might be wrong, but isn't this, like, a ton of work? All the places
in the DDL grammar that currently accept string literals would have to
be changed to also allow parameters, and then the downstream code
would need to be adjusted to look through those parameters to find the
corresponding values. Or is there a less painful approach?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Robert Haas
On Fri, May 29, 2020 at 3:05 PM Chapman Flack  wrote:
> A possible step in the direction of good-enough would be to have
> 'sensitive' flags only in the parameter-bind message of the extended
> protocol.

Interesting idea. Changing the wire protocol for this sort of thing
makes it a much bigger lift, but it might help, at least in some
cases. It does however require that the user be using prepared
queries, and that the user know which data is sensitive. For instance,
if a user is sitting there in a psql shell and the requirement is that
if they happen to type ALTER USER .. PASSWORD the new password doesn't
get logged, this approach fails both because the user doesn't do
anything to identify the query as special, and also because it's not
prepared. So in a certain sense you could say that with this design
the server just passes the buck: it's too hard for us to figure out
which things to log, so we're making it your job to tell us.

Another point to consider is that I think we already have the ability
to suppress logging of bind parameters. So, people who want this sort
of thing and are able to use bind parameters can just not log them and
then all is well. That does have the deficiency that we then log NO
bind parameters, and somebody might want to log them in some
situations but not others, but perhaps there is a simpler way of
accomplishing that than what you've outlined here? I don't know. I'm
just throwing out ideas...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Tom Lane
Chapman Flack  writes:
> A possible step in the direction of good-enough would be to have
> 'sensitive' flags only in the parameter-bind message of the extended
> protocol.

Yeah, if we restrict our promises to being willing to not log (some)
bind parameters, that seems far more manageable, and reliable, than
figuring out what parts of query strings need to be hidden.

One missing part of that is that we'd need to support bind parameters
for utility statements, eg new password in ALTER USER.  That's been
on the wish list for a long time anyway, of course.  I think it's
mostly a matter of lack of round tuits, rather than any fundamental
problem.  (Parameters in transaction control statements might have
fundamental problems, but we can just dismiss that as out of scope.)

I don't have a lot of faith (not any, really) in your notion of tying
control of the feature to function arguments.  That falls down for
all of the simpler cases I can think of: aside from the ALTER USER
PASSWORD case, there's INSERT INTO accounts(..., creditcardnumber,
...) VALUES(..., $n, ...).  Neither one of those has a nearby UDF
to control it with.  Also, what would we do with something like
sensitive_function($1 || 'constant') ?

regards, tom lane




Re: Internal key management system

2020-05-29 Thread Robert Haas
On Fri, May 29, 2020 at 1:50 AM Masahiko Sawada
 wrote:
> However, this usage has a downside that user secret can be logged to
> server logs when log_statement = 'all' or an error happens. To deal
> with this issue I've created a PoC patch on top of the key manager
> patch which adds a libpq function PQencrypt() to encrypt data and new
> psql meta-command named \encrypt in order to  encrypt data while
> eliminating the possibility of the user data being logged.
> PQencrypt() just calls pg_encrypt() via PQfn(). Using this command the
> above example can become as follows:

If PQfn() calls aren't currently logged, that's probably more of an
oversight due to the feature being almost dead than something upon
which we want to rely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Chapman Flack
On 05/29/20 14:51, Robert Haas wrote:
> stuff that works today. We have options to log queries before parsing
> them. You can't redact sensitive details at that point because you
> don't know whether the query contains any such details, or where they
> are located. You can't postpone logging the query without changing the
> behavior of the system in user-visible ways. What if parsing gets
> stuck in an infinite loop and you want to look at the log to see which
> query got stuck? What if you want to look at the timestamps on the log

A possible step in the direction of good-enough would be to have
'sensitive' flags only in the parameter-bind message of the extended
protocol.

The query you send down has only placeholders. The sensitive stuff
comes in the bind message. Recognizing a 'sensitive' bit for a parameter
in a bind message ought to be entirely feasible. If parsing the query
failed and got logged because of the failure, the log record just
shows the placeholders.

Again, to protect from the other kind of misuse (a malicious query
hiding values from the log), there should be a check after parsing/
analysis that the values sent with 'sensitive' flags in the bind
message are exactly the ones that were so declared.

A stored function/procedure might have a way to declare some of its
parameters that way, with some default role controlling permission
to declare such a function.

Maybe you could even declare such parameters directly in the syntax
of preparing a query (with, again, control over who gets to prepare
such a query). There again, the correspondence of those declarations
to the bind message can't be verified until after parsing/analysis,
but meanwhile the sensitive values are clearly flagged in the bind
message, and any logging on a parse failure is still able to avoid
logging them.

Regards,
-Chap




Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Robert Haas
On Fri, May 29, 2020 at 12:38 PM Chapman Flack  wrote:
> Just giving this thread a bump in honor of the mention of sensitive
> things in logs in the cryptography unconference session.

I agree that there's a problem here and I would like to see a solution
to it, too. Several EnterpriseDB customers have complained about the
fact that ALTER USER .. PASSWORD logs the password, which is a similar
kind of problem, though it involves DDL rather than DML.

But, I don't really see a way forward that doesn't involve breaking
stuff that works today. We have options to log queries before parsing
them. You can't redact sensitive details at that point because you
don't know whether the query contains any such details, or where they
are located. You can't postpone logging the query without changing the
behavior of the system in user-visible ways. What if parsing gets
stuck in an infinite loop and you want to look at the log to see which
query got stuck? What if you want to look at the timestamps on the log
messages to know when the server started dealing with some particular
query?

On a related note, what if parsing fails? Not only should the logging
that happens before parsing already have been done, but the parse
error itself should also log the query that triggered it, so that the
user has some chance of troubleshooting the situation. However, since
we haven't completed parsing, there's no way to redact what gets
logged in such a case; we can't know where the sensitive stuff is.
Now, you might say that you only care about redacting queries that are
well-formed, but parsing can fail for other reasons - e.g. because you
ran out of memory. You could argue that such cases are unlikely enough
that you don't care about them, but that doesn't mean somebody else
won't file a CVE.

Also, even if you are logging after parsing has been successfully
completed, does that mean that you know which parts of the query
contain sensitive information? Parse trees contain some positional
information, but I'm not sure that there's enough information there,
or that we have it in all the cases somebody might want to redact.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: proposal: possibility to read dumped table's name from file

2020-05-29 Thread Justin Pryzby
On Fri, May 29, 2020 at 04:21:00PM +0200, Pavel Stehule wrote:
> One of my customers has to specify the dumped tables name by name. After
> years of increasing database size and table count, he has a problem with a
> too-short command line. He needs to read the list of tables from a file (or
> from stdin).

+1 - we would use this.

We put a regex (actually a pg_dump pattern) of tables to skip (timeseries
partitions which are older than a few days; these are dumped once, are not
expected to change, and are typically not redumped).  We're nowhere near the
execve() limit, but it'd be nice if the command were primarily a list of
options and not a long regex.

Please also support reading from file for --exclude-table=pattern.

I'm drawing a parallel between this and rsync --include/--exclude and --filter.

We'd be implementing a new --filter, which might have similar syntax to rsync
(which I always forget).
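
Purely as a strawman (this file format is invented here, not something the
patch or rsync defines for pg_dump), such a filter file might combine include
and exclude patterns along these lines:

# hypothetical pg_dump filter file, one rule per line
+ app.*                      # include: pg_dump pattern to dump
- app.timeseries_2020_01_*   # exclude: old timeseries partitions
- audit.*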

-- 
Justin




Re: Expand the use of check_canonical_path() for more GUCs

2020-05-29 Thread Alvaro Herrera
On 2020-May-28, Mark Dilger wrote:

> A little poking around shows that in SelectConfigFiles(), these four
> directories were set by SetConfigOption().  I don't see a problem with
> the code, but the way this stuff is spread around makes it easy for
> somebody adding a new GUC file path to do it wrong.  I don't have much
> opinion about Peter's preference that paths be left alone, but I'd
> prefer some comments in guc.c explaining it all.  The only cleanup
> that occurs to me is to reorder ConfigureNamesString[] to have all the
> path options back-to-back, with the four that are set by
> SelectConfigFiles() at the top with comments about why they are
> special, and the rest after that with comments about why they need or
> do not need canonicalization.

No need for reorganization, I think, just have a comment on top of each
entry that doesn't use canonicalization such as "no canonicalization,
as explained in ..." where that refers to a single largish comment that
explains what canonicalization is, why you use it, and why you don't.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_dump fail to not dump public schema orders

2020-05-29 Thread David G. Johnston
On Fri, May 29, 2020 at 7:42 AM Adrien Nayrat 
wrote:

> On 5/29/20 3:56 PM, David G. Johnston wrote:
> > On Friday, May 29, 2020, Adrien Nayrat wrote:
> >
> > Hello,
> >
> > I noticed pg_dump failed to not dump creation or comment commands
> for public
> > schema when we explicitly ask it to dump public schema.
> >
> > Shorter example: pg_dump -n public dump will give:
> >
> >
> >
> > [Create schema public]
> >
> >
> > As far as I can tell this is working as intended/documented.  The public
> > schema doesn’t and shouldn’t get special treatment relative to any other
> > user schema here.
> >
>
> I am not sure. See this comment from selectDumpableNamespace:
>
>
That comment doesn't apply to this situation as it is attached to an
if/else branch that doesn't handle the "-n" option case.


> FYI this behavior appeared with pg11. With pg10 you won't have "CREATE
> SCHEMA
> public" orders.
>

That matches what Tom said.
It is indeed a behavior change and the commit that caused it to change
didn't change the documentation - so either the current behavior is a bug
or the old documentation is wrong for failing to describe the old behavior
sufficiently.

I stand by my comment that the current behavior and documentation agree -
it doesn't call out any special behavior for the public schema being
specified in "-n" and none is observed (now).

I'm tending to believe that the code benefits that resulted in this change
are sufficient to keep new behavior as-is and not go back and introduce
special public schema handling code to get it back to the way things were.
The public schema has issues and at this point the only reason it should
exist and be populated in a database is for learning or quick debugging.
It's not worth breaking stuff to make that point more bluntly, but if the
natural evolution of the code results in people either adapting or
abandoning the public schema I find that to be an acceptable price for
progress.

David J.


Re: Expand the use of check_canonical_path() for more GUCs

2020-05-29 Thread Robert Haas
On Tue, May 19, 2020 at 7:02 AM Peter Eisentraut
 wrote:
> That thread didn't resolve why check_canonical_path() is necessary
> there.  Maybe the existing uses could be removed?

The first sentence of this reply seems worthy of particular
attention. We have to know what problem this is intended to fix before
we try to decide in which cases it's needed. Otherwise, whether we add
it everywhere or remove it everywhere, we'll only know that it's
consistent, not that it's correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pie-in-sky idea: 'sensitive' function parameters

2020-05-29 Thread Chapman Flack
Just giving this thread a bump in honor of the mention of sensitive
things in logs in the cryptography unconference session.

I'm not partisan about any of the particular syntax examples I gave
earlier, but it seems like there were two key ingredients:

1.  In what is sent from the client to the server, certain parameters
are marked as sensitive in ways that are obvious early at parse time,
before having to look up or analyze anything.

2.  But those markings are later compared to the actual declarations of
things, once those are resolved. It is an error either to send as
'sensitive' a parameter that isn't so declared, or to forget to send
as 'sensitive' one that is.

The first error is to prevent evildoers using the facility to hide values
from the logs arbitrarily in the queries they are sending.

The second error is to catch mistakes during app development. It's possible
that a query sent by an app under development won't have the right things
marked 'sensitive' yet, and it's possible those queries get exposed in the
logs because the early parse-time indication that they shouldn't be is
missing.

But at that stage of development, the values being sent shouldn't really
be sensitive ones, and making such a query an error ensures such omissions
are caught and fixed.

Regards,
-Chap




Re: proposal: possibility to read dumped table's name from file

2020-05-29 Thread Pavel Stehule
On Fri, 29 May 2020 at 18:03, Tom Lane wrote:

> David Fetter  writes:
> > Would it make sense to expand this patch to handle other objects?
>

Sure. We just need to design the system (and the names of the options).


>
> If we're gonna do something like this, +1 for being more general.
> The fact that pg_dump only has selection switches for tables and
> schemas has always struck me as an omission.
>

The implementation is trivial; the hard part is a good design of the names for these options.

Pavel




>
> regards, tom lane
>


Re: speed up unicode normalization quick check

2020-05-29 Thread Mark Dilger



> On May 28, 2020, at 8:54 PM, John Naylor  wrote:
> 
> On Fri, May 29, 2020 at 5:59 AM Mark Dilger
>  wrote:
>> 
>>> On May 21, 2020, at 12:12 AM, John Naylor  
>>> wrote:
> 
>>> very picky in general. As a test, it also successfully finds a
>>> function for the OS "words" file, the "D" sets of codepoints, and for
>>> sets of the first n built-in OIDs, where n > 5.
>> 
>> Prior to this patch, src/tools/gen_keywordlist.pl is the only script that 
>> uses PerfectHash.  Your patch adds a second.  I'm not convinced that 
>> modifying the PerfectHash code directly each time a new caller needs 
>> different multipliers is the right way to go.

I forgot in my first round of code review to mention, "thanks for the patch".  
I generally like what you are doing here, and am trying to review it so it gets 
committed.

> Calling it "each time" with a sample size of two is a bit of a
> stretch. The first implementation made a reasonable attempt to suit
> future uses and I simply made it a bit more robust. In the text quoted
> above you can see I tested some scenarios beyond the current use
> cases, with key set sizes as low as 6 and as high as 250k.

I don't really have an objection to what you did in the patch.  I'm not going 
to lose any sleep if it gets committed this way.

The reason I gave this feedback is that I saved the *kwlist_d.h files generated 
before applying the patch, and compared them with the same files generated 
after applying the patch, and noticed a very slight degradation.  Most of the 
files changed without any expansion, but the largest of them, 
src/common/kwlist_d.h, changed from

  static const int16 h[901]

to

  static const int16 h[902]

suggesting that even with your reworking of the parameters for PerfectHash, you 
weren't able to find a single set of numbers that worked for the two datasets 
quite as well as different sets of numbers each tailored for a particular data 
set.  I started to imagine that if we wanted to use PerfectHash for yet more 
stuff, the problem of finding numbers that worked across all N data sets (even 
if N is only 3 or 4) might be harder still.  That's all I was referring to.  
901 -> 902 is such a small expansion that it might not be worth worrying about.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: proposal: possibility to read dumped table's name from file

2020-05-29 Thread Tom Lane
David Fetter  writes:
> Would it make sense to expand this patch to handle other objects?

If we're gonna do something like this, +1 for being more general.
The fact that pg_dump only has selection switches for tables and
schemas has always struck me as an omission.

regards, tom lane




Re: PG function with pseudotype "anyelement" for IN, OUT parameter shows wrong behaviour.

2020-05-29 Thread Prabhat Sahu
On Fri, May 29, 2020 at 8:30 PM Pavel Stehule 
wrote:

>
>
> On Fri, 29 May 2020 at 16:45, Prabhat Sahu <
> prabhat.s...@enterprisedb.com> wrote:
>
>> Hi All,
>>
>> Please check the below scenario, with pseudotype "anyelement" for IN, OUT
>> parameter and the RETURN record in a function.
>>
>> postgres=# create table tab1(c1 int, c2 int, c3 timestamp) ;
>> CREATE TABLE
>> postgres=# CREATE OR REPLACE FUNCTION func_any(IN anyelement, IN
>> anyelement, OUT v1 anyelement, OUT v2 anyelement)
>> RETURNS record
>> AS
>> $$
>> BEGIN
>>   SELECT $1 + 1, $2 + 1 into v1, v2;
>>   insert into tab1 values(v1, v2, now());
>> END;
>> $$
>> language 'plpgsql';
>> CREATE FUNCTION
>> postgres=# SELECT (func_any(1, 2)).*;
>>  v1 | v2
>> ----+----
>>   2 |  3
>> (1 row)
>>
>> postgres=# select * from tab1;
>>  c1 | c2 | c3
>> ----+----+----------------------------
>>   2 |  3 | 2020-05-30 19:26:32.036924
>>   2 |  3 | 2020-05-30 19:26:32.036924
>> (2 rows)
>>
>> I expected the table "tab1" to have only a single record, but we are
>> able to see 2 records in tab1.
>>
>
> it is correct, because you use composite unpacking syntax
>
> SELECT (func_any(1, 2)).*;
>
> means
>
> SELECT (func_any(1, 2)).c1, (func_any(1, 2)).c2;
>
> If you don't want double execution, you should run your function in the
> FROM clause
>
> postgres=# SELECT * FROM func_any(1, 2);
> ┌────┬────┐
> │ v1 │ v2 │
> ╞════╪════╡
> │  2 │  3 │
> └────┴────┘
> (1 row)
>

Thanks Pavel for the help. I have verified the same; now I am getting a
single record in tab1.
postgres=# SELECT func_any(1, 2);
 func_any
----------
 (2,3)
(1 row)

postgres=# select * from tab1;
 c1 | c2 | c3
----+----+----------------------------
  2 |  3 | 2020-05-30 20:17:59.989087
(1 row)
Thanks,
Prabhat Sahu


Re: proposal: possibility to read dumped table's name from file

2020-05-29 Thread David Fetter
On Fri, May 29, 2020 at 04:21:00PM +0200, Pavel Stehule wrote:
> Hi
> 
> One of my customers has to specify the dumped tables name by name. After
> years of increasing database size and table count, he has a problem with a
> too-short command line. He needs to read the list of tables from a file (or
> from stdin).
> 
> I wrote simple PoC patch
> 
> Comments, notes, ideas?

This seems like a handy addition. What I've done in cases similar to
this was to use `grep -f` on the output of `pg_restore -l` to create a
file suitable for `pg_restore -L`, or mash them together like this:

pg_restore -L <(pg_restore -l /path/to/dumpfile | grep -f /path/to/listfile) \
-d new_db /path/to/dumpfile

That's a lot of shell magic and obscure corners of commands to expect
people to use.

Would it make sense to expand this patch to handle other objects?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-05-29 Thread Dilip Kumar
On Tue, May 26, 2020 at 4:46 PM Amit Kapila  wrote:
>
> On Tue, May 26, 2020 at 2:44 PM Dilip Kumar  wrote:
> >
> > On Tue, May 26, 2020 at 10:27 AM Amit Kapila  
> > wrote:
> > >
> > > >
> > > > 2. There is a bug fix in handling the stream abort in 0008 (earlier it
> > > > was 0006).
> > > >
> > >
> > > The code changes look fine but it is not clear what was the exact
> > > issue.  Can you explain?
> >
> > Basically, in case of an empty subtransaction, we were reading the
> > subxacts info but when we could not find the subxid in the subxacts
> > info we were not releasing the memory.  So on next subxact_info_read
> > it will expect that subxacts should be freed but we did not free it in
> > that !found case.
> >
>
> Okay, on looking at it again, the same code exists in
> subxact_info_write as well.  It is better to have a function for it.
> Can we have a structure like SubXactContext for all the variables used
> for subxact?  As mentioned earlier I find the allocation/deallocation
> of subxacts a bit ad-hoc, so there will always be a chance that we can
> forget to free it.  Having it allocated in memory context which we can
> reset later might reduce that risk.  One idea could be that we have a
> special memory context for start and stop messages which can be used
> to allocate the subxacts there.  In case of commit/abort, we can allow
> subxacts information to be allocated in ApplyMessageContext which is
> reset at the end of each protocol message.

Changed as per this.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-05-29 Thread Dilip Kumar
On Tue, May 26, 2020 at 3:04 PM Amit Kapila  wrote:
>
> On Mon, May 25, 2020 at 8:07 PM Dilip Kumar  wrote:
> >
> > On Fri, May 22, 2020 at 11:54 AM Amit Kapila  
> > wrote:
> > >
> > > 4.
> > > + * XXX Do we need to allocate it in TopMemoryContext?
> > > + */
> > > +static void
> > > +subxact_info_add(TransactionId xid)
> > > {
> > > ..
> > >
> > > For this and other places in a patch like in function
> > > stream_open_file(), instead of using TopMemoryContext, can we consider
> > > using a new memory context LogicalStreamingContext or something like
> > > that. We can create LogicalStreamingContext under TopMemoryContext.  I
> > > don't see any need of using TopMemoryContext here.
> >
> > But, when we will delete/reset the LogicalStreamingContext?
> >
>
> Why can't we reset it at each stream stop message?

Done this

>
> >  because
> > we are planning to keep this memory until the worker is alive so that
> > supposed to be the top memory context.
> >
>
> Which part of allocation do we want to keep till the worker is alive?
> Why we need memory-related to subxacts till the worker is alive?  As
> we have now, after reading subxact info (subxact_info_read), we need
> to ensure that it is freed after its usage due to which we need to
> remember and perform pfree at various places.
>
> I think we should once see the possibility that such that we could
> switch to this new context in start stream message and reset it in
> stop stream message.  That might help in avoiding
> MemoryContextSwitchTo TopMemoryContext at various places.
>
> >  If we create any other context
> > with the same life span as TopMemoryContext then what is the point?
> >
>
> It is helpful for debugging.  It is recommended that we don't use the
> top memory context unless it is really required.  Read about it in
> src/backend/utils/mmgr/README.

xids is now allocated in ApplyContext

> > > 8.
> > > + * XXX Maybe we should only include the checksum when the cluster is
> > > + * initialized with checksums?
> > > + */
> > > +static void
> > > +subxact_info_write(Oid subid, TransactionId xid)
> > >
> > > Do we really need to have the checksum for temporary files? I have
> > > checked a few other similar cases like SharedFileSet stuff for
> > > parallel hash join but didn't find them using checksums.  Can you also
> > > once see other usages of temporary files and then let us decide if we
> > > see any reason to have checksums for this?
> >
> > Yeah, even I can see other places checksum is not used.
> >
>
> So, unless someone speaks up before you are ready for the next version
> of the patch, can we remove it?

Done
> > > Another point is we don't seem to be doing this for 'changes' file,
> > > see stream_write_change.  So, not sure, there is any sense to write
> > > checksum for subxact file.
> >
> > I can see there are comment atop this function
> >
> > * XXX The subxact file includes CRC32C of the contents. Maybe we should
> > * include something like that here too, but doing so will not be as
> > * straighforward, because we write the file in chunks.
> >
>
> You can remove this comment as well.  I don't know how advantageous it
> is to checksum temporary files.  We can anyway add it later if there
> is a reason for doing so.

Done

> >
> > > 12.
> > > maybe_send_schema()
> > > {
> > > ..
> > > + if (in_streaming)
> > > + {
> > > + /*
> > > + * TOCHECK: We have to send schema after each catalog change and it may
> > > + * occur when streaming already started, so we have to track new catalog
> > > + * changes somehow.
> > > + */
> > > + schema_sent = get_schema_sent_in_streamed_txn(relentry, topxid);
> > > ..
> > > ..
> > > }
> > >
> > > I think it is good to once verify/test what this comment says but as
> > > per code we should be sending the schema after each catalog change as
> > > we invalidate the streamed_txns list in rel_sync_cache_relation_cb
> > > which must be called during relcache invalidation.  Do we see any
> > > problem with that mechanism?
> >
> > I have tested this, I think we are already sending the schema after
> > each catalog change.
> >
>
> Then remove "TOCHECK" in the above comment.

Done


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-05-29 Thread Dilip Kumar
On Thu, May 28, 2020 at 2:41 PM Amit Kapila  wrote:
>
> On Thu, May 28, 2020 at 12:46 PM Dilip Kumar  wrote:
> >
> > On Tue, May 26, 2020 at 12:00 PM Amit Kapila  
> > wrote:
> > >
> > > Isn't this problem only for subxact file as we anyway create changes
> > > file as part of start stream message which should have come after
> > > abort?  If so, can't we detect whether subxact file exists probably by
> > > using nsubxacts or something like that?  Can you please once try to
> > > reproduce this scenario to ensure that we are not missing anything?
> >
> > I have tested this, as of now, by default we create both changes and
> > subxact files irrespective of whether we get any subtransactions or
> > not.  Maybe this could be optimized so that we create that file only if
> > we have any subxact, and otherwise not?  What's your opinion on the
> > same.
> >
>
> Yeah, that makes sense.
>
> > > > > > > 8.
> > > > > > > @@ -2295,6 +2677,13 @@ 
> > > > > > > ReorderBufferXidSetCatalogChanges(ReorderBuffer
> > > > > > > *rb, TransactionId xid,
> > > > > > >   txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
> > > > > > >
> > > > > > >   txn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > > > > > +
> > > > > > > + /*
> > > > > > > + * TOCHECK: Mark toplevel transaction as having catalog changes 
> > > > > > > too
> > > > > > > + * if one of its children has.
> > > > > > > + */
> > > > > > > + if (txn->toptxn != NULL)
> > > > > > > + txn->toptxn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > > > > >  }
> > > > > > >
> > > > > > > Why are we marking top transaction here?
> > > > > >
> > > > > > We need to mark top transaction to decide whether to build tuplecid
> > > > > > hash or not.  In non-streaming mode, we are only sending during the
> > > > > > commit time, and during commit time we know whether the top
> > > > > > transaction has any catalog changes or not based on the invalidation
> > > > > > message so we are marking the top transaction there in DecodeCommit.
> > > > > > Since here we are not waiting till commit so we need to mark the top
> > > > > > transaction as soon as we mark any of its child transactions.
> > > > > >
> > > > >
> > > > > But how does it help?  We use this flag (via
> > > > > ReorderBufferXidHasCatalogChanges) in SnapBuildCommitTxn which is
> > > > > anyway done in DecodeCommit and that too after setting this flag for
> > > > > the top transaction if required.  So, how will it help in setting it
> > > > > while processing for subxid.  Also, even if we have to do it won't it
> > > > > add the xid needlessly in builder->committed.xip array?
> > > >
> > > > In ReorderBufferBuildTupleCidHash, we use this flag to decide whether
> > > > to build the tuplecid hash or not based on whether it has catalog
> > > > changes or not.
> > > >
> > >
> > > Okay, but you haven't answered the second part of the question: "won't
> > > it add the xid of top transaction needlessly in builder->committed.xip
> > > array, see function SnapBuildCommitTxn?"  IIUC, this can happen
> > > without patch as well because DecodeCommit also sets the flags just
> > > based on invalidation messages irrespective of whether the messages
> > > are generated by top transaction or not, is that right?
> >
> > Yes, with or without the patch it always adds the topxid.  I think
> > purpose for doing this with/without patch is not for the snapshot
> > instead we are marking the top itself that some of its subtxn has the
> > catalog changes so that while building the tuplecid hash we can know
> > whether to build the hash or not.  But, having said that I feel in
> > ReorderBufferBuildTupleCidHash why do we need these two checks
> > if (!rbtxn_has_catalog_changes(txn) || dlist_is_empty(&txn->tuplecids))
> > return;
> >
> > I mean it should be enough to just have the check,  because if we have
> > added something to the tuplecids then catalog changes must be there
> > because that time we are setting the catalog changes to true.
> >
> > if (dlist_is_empty(&txn->tuplecids))
> > return;
> >
> > I think in the base code there are multiple things going on
> > 1. If we get new CID we always set the catalog change in that
> > transaction but add the tuplecids in the top transaction.  So
> > basically, top transaction is so far not marked with catalog changes
> > but it has tuplecids.
> > 2. Now, in DecodeCommit the top xid will be marked that it has catalog
> > changes based on the invalidation messages.
> >
>
> I don't think it is advisable to remove that check from base code
> unless we have a strong reason for doing so.  I think here you can
> write better comments about why you are marking the flag for top
> transaction and remove TOCHECK from the comment.

Done.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: PG function with pseudotype "anyelement" for IN, OUT parameter shows wrong behaviour.

2020-05-29 Thread Pavel Stehule
On Fri, 29 May 2020 at 16:45, Prabhat Sahu <
prabhat.s...@enterprisedb.com> wrote:

> Hi All,
>
> Please check the below scenario, with pseudotype "anyelement" for IN, OUT
> parameter and the RETURN record in a function.
>
> postgres=# create table tab1(c1 int, c2 int, c3 timestamp) ;
> CREATE TABLE
> postgres=# CREATE OR REPLACE FUNCTION func_any(IN anyelement, IN
> anyelement, OUT v1 anyelement, OUT v2 anyelement)
> RETURNS record
> AS
> $$
> BEGIN
>   SELECT $1 + 1, $2 + 1 into v1, v2;
>   insert into tab1 values(v1, v2, now());
> END;
> $$
> language 'plpgsql';
> CREATE FUNCTION
> postgres=# SELECT (func_any(1, 2)).*;
>  v1 | v2
> ----+----
>   2 |  3
> (1 row)
>
> postgres=# select * from tab1;
>  c1 | c2 | c3
> ----+----+----------------------------
>   2 |  3 | 2020-05-30 19:26:32.036924
>   2 |  3 | 2020-05-30 19:26:32.036924
> (2 rows)
>
> I expected the table "tab1" to have only a single record, but we are able
> to see 2 records in tab1.
>

it is correct, because you use composite unpacking syntax

SELECT (func_any(1, 2)).*;

means

SELECT (func_any(1, 2)).c1, (func_any(1, 2)).c2;

If you don't want double execution, you should run your function in the FROM
clause

postgres=# SELECT * FROM func_any(1, 2);
┌────┬────┐
│ v1 │ v2 │
╞════╪════╡
│  2 │  3 │
└────┴────┘
(1 row)

Regards

Pavel



> --
>
> With Regards,
> Prabhat Kumar Sahu
> EnterpriseDB: http://www.enterprisedb.com
>


Re: proposal: possibility to read dumped table's name from file

2020-05-29 Thread Pavel Stehule
On Fri, 29 May 2020 at 16:28, Tom Lane wrote:

> Pavel Stehule  writes:
> > One of my customers has to specify the dumped tables name by name. After
> > years of increasing database size and table count, he has a problem with
> > a too-short command line. He needs to read the list of tables from a file
> > (or from stdin).
>
> I guess the question is why.  That seems like an enormously error-prone
> approach.  Can't they switch to selecting schemas?  Or excluding the
> hopefully-short list of tables they don't want?
>

It is not a typical application. It is an analytic application where the
schema of the database is based on a dynamic specification from the end user
(the end user can do customization at any time). So the schema is very
dynamic.

For example, a typical server has about four thousand databases and every
database has somewhere between 1K and 10K tables.

Another specific aspect is different versions of data in different tables. A
user can work with one set of data (one set of tables) while the application
prepares a new set of data (a new set of tables). The load can be slow,
because sometimes bigger tables are filled (about forty GB). pg_dump backs up
one set of tables (a little bit like a snapshot of the data). So it is a
strange, but successful, OLAP application.

Regards

Pavel



> regards, tom lane
>


PG function with pseudotype "anyelement" for IN, OUT parameter shows wrong behaviour.

2020-05-29 Thread Prabhat Sahu
Hi All,

Please check the below scenario, with pseudotype "anyelement" for IN, OUT
parameter and the RETURN record in a function.

postgres=# create table tab1(c1 int, c2 int, c3 timestamp) ;
CREATE TABLE
postgres=# CREATE OR REPLACE FUNCTION func_any(IN anyelement, IN
anyelement, OUT v1 anyelement, OUT v2 anyelement)
RETURNS record
AS
$$
BEGIN
  SELECT $1 + 1, $2 + 1 into v1, v2;
  insert into tab1 values(v1, v2, now());
END;
$$
language 'plpgsql';
CREATE FUNCTION
postgres=# SELECT (func_any(1, 2)).*;
 v1 | v2
----+----
  2 |  3
(1 row)

postgres=# select * from tab1;
 c1 | c2 | c3
----+----+----------------------------
  2 |  3 | 2020-05-30 19:26:32.036924
  2 |  3 | 2020-05-30 19:26:32.036924
(2 rows)

I expected the table "tab1" to have only a single record, but we are able
to see 2 records in tab1.

-- 

With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com


Re: pg_dump fail to not dump public schema orders

2020-05-29 Thread Adrien Nayrat
On 5/29/20 3:56 PM, David G. Johnston wrote:
> On Friday, May 29, 2020, Adrien Nayrat wrote:
> 
> Hello,
> 
> I noticed pg_dump failed to not dump creation or comment commands for 
> public
> schema when we explicitly ask it to dump public schema.
> 
> Shorter example: pg_dump -n public dump will give:
> 
>  
> 
> [Create schema public]
> 
>  
> As far as I can tell this is working as intended/documented.  The public 
> schema
> doesn’t and shouldn’t get special treatment relative to any other
> user schema here.
> 

I am not sure. See this comment from selectDumpableNamespace:

/*
 * The public schema is a strange beast that sits in a sort of
 * no-mans-land between being a system object and a user object.  We
 * don't want to dump creation or comment commands for it, because
 * that complicates matters for non-superuser use of pg_dump.  But we
 * should dump any ACL changes that have occurred for it, and of
 * course we should dump contained objects.
 */


FYI this behavior appeared with pg11. With pg10 you won't have "CREATE SCHEMA
public" orders.

Regards,




Re: pg_dump fail to not dump public schema orders

2020-05-29 Thread Tom Lane
"David G. Johnston"  writes:
> On Friday, May 29, 2020, Adrien Nayrat  wrote:
>> I noticed pg_dump failed to not dump creation or comment commands for
>> public schema when we explicitly ask it to dump public schema.

> As far as I can tell this is working as intended/documented.  The public
> schema doesn’t and shouldn’t get special treatment relative to
> any other user schema here.

Note this is something we intentionally changed a little while ago
(v11, looks like), along with a larger refactoring of pg_dump vs.
pg_dumpall.  But yeah, public is not treated differently from other
schemas anymore.

regards, tom lane




Re: proposal: possibility to read dumped table's name from file

2020-05-29 Thread Tom Lane
Pavel Stehule  writes:
> One of my customers has to specify the dumped tables name by name. After
> years of increasing database size and table count, he has a problem with a
> too-short command line. He needs to read the list of tables from a file (or
> from stdin).

I guess the question is why.  That seems like an enormously error-prone
approach.  Can't they switch to selecting schemas?  Or excluding the
hopefully-short list of tables they don't want?

regards, tom lane




proposal: possibility to read dumped table's name from file

2020-05-29 Thread Pavel Stehule
Hi

One of my customers has to specify the dumped tables name by name. After
years of increasing database size and table count, he has a problem with a
too-short command line. He needs to read the list of tables from a file (or
from stdin).

I wrote a simple PoC patch
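
For illustration, the new option could be used like this (the file name is
made up); the file holds one table name per line, and "-" means read the
list from stdin:

pg_dump --table-names-from-file=tables.txt mydb > dump.sql
pg_dump --table-names-from-file=- mydb < tables.txt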

Comments, notes, ideas?

Regards

Pavel
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bd14d38740..d8330ec32a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -362,6 +362,7 @@ main(int argc, char **argv)
 		{"serializable-deferrable", no_argument, _deferrable, 1},
 		{"snapshot", required_argument, NULL, 6},
 		{"strict-names", no_argument, _names, 1},
+		{"table-names-from-file", required_argument, NULL, 8},
 		{"use-set-session-authorization", no_argument, _setsessauth, 1},
 		{"no-publications", no_argument, _publications, 1},
 		{"no-security-labels", no_argument, _security_labels, 1},
@@ -553,6 +554,59 @@ main(int argc, char **argv)
 dosync = false;
 break;
 
+			case 8:/* read table names from file */
+{
+	FILE	   *f;
+	char	   *line;
+	size_t		line_size = 1024;
+	ssize_t		chars;
+	bool		use_stdin = false;
+
+	if (strcmp(optarg, "-") != 0)
+	{
+		f = fopen(optarg, "r");
+		if (!f)
+		{
+			fprintf(stderr, _("%s: could not open the input file \"%s\": %s\n"),
+progname, optarg, strerror(errno));
+			exit_nicely(1);
+		}
+	}
+	else
+	{
+		f = stdin;
+		use_stdin = true;
+	}
+
+	line = malloc(line_size);
+
+	while ((chars = getline(&line, &line_size, f)) != -1)
+	{
+		if (line[chars - 1] == '\n')
+			line[chars - 1] = '\0';
+
+		/* ignore empty rows */
+		if (*line != '\0')
+		{
+			simple_string_list_append(&table_include_patterns, line);
+			dopt.include_everything = false;
+		}
+	}
+
+	if (ferror(f))
+	{
+		fprintf(stderr, _("%s: could not read from file \"%s\": %s\n"),
+			progname, use_stdin ? "stdin" : optarg, strerror(errno));
+		exit_nicely(1);
+	}
+
+	if (!use_stdin)
+		fclose(f);
+
+	free(line);
+}
+break;
+
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit_nicely(1);
@@ -977,6 +1031,8 @@ help(const char *progname)
 	printf(_("  --snapshot=SNAPSHOT  use given snapshot for the dump\n"));
 	printf(_("  --strict-names   require table and/or schema include patterns to\n"
 			 "   match at least one entity each\n"));
+	printf(_("  --table-names-from-file=FILENAME\n"
+			 "   read table's names from file (one table name per line)\n"));
 	printf(_("  --use-set-session-authorization\n"
 			 "   use SET SESSION AUTHORIZATION commands instead of\n"
 			 "   ALTER OWNER commands to set ownership\n"));


Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.

2020-05-29 Thread Andy Fan
> The other serious error we could be making here is to change things on
>> the basis of just a few examples.  You really need a pretty wide range
>> of test cases to be sure that you're not making things worse, any time
>> you're twiddling basic parameters like these.
>>
>>
> I will try more thing with this direction,  thanks for suggestion.
>

I chose TPC-H for this purpose, with the data and index setup based on [1].
The attached normal.log is the plan without this patch, and patched.log is
the plan with the patch.  In general, the best path doesn't change due to
this patch.  All the plans whose cost changed have the following pattern,
which is expected.

Index Scan ...
   Index Cond: ...
   Filter: ...

If you diff the two files, you may find the cost of "Index Scan" doesn't
change; that is mainly because only 2 digits are shown in the cost, which is
not accurate enough to show the difference.  However, with a nested loop the
overall plan shows the cost difference.

[1] https://ankane.org/tpc-h

-- 
Best Regards
Andy Fan


normal.log
Description: Binary data


patched.log
Description: Binary data


Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.

2020-05-29 Thread Andy Fan
On Fri, May 29, 2020 at 9:37 PM Ashutosh Bapat 
wrote:

> On Fri, May 29, 2020 at 6:40 AM Andy Fan  wrote:
> >
> >
> >>
> >> >so we need to optimize the cost model for such case, the method is the
> >> >patch I mentioned above.
> >>
> >> Making the planner more robust w.r.t. to estimation errors is nice, but
> >> I wouldn't go as far saying we should optimize for such cases. The stats
> >> can be arbitrarily off, so should we expect the error to be 10%, 100% or
> >> 100%?
> >
> >
> > I don't think my patch relay on anything like that.   My patch doesn't
> fix the
> > statistics issue,  just adding the extra cost on qual cost on Index
> Filter part.
> > Assume the query pattern are where col1= X and col2 = Y. The impacts are
> :
> > 1).  Make the cost of (col1, other_column) is higher than (col1, col2)
> > 2). The relationship between seqscan and index scan on index (col1,
> other_column)
> > is changed, (this is something I don't want).  However my cost
> difference between
> > index scan & seq scan usually very huge, so the change above should has
> > nearly no impact on that choice.   3). Make the cost higher index scan
> for
> > Index (col1) only.  Overall I think nothing will make thing worse.
>
> When the statistics is almost correct (or better than what you have in
> your example), the index which does not cover all the columns in all
> the conditions will be expensive anyways because of extra cost to
> access heap for the extra rows not filtered by that index. An index
> covering all the conditions would have its scan cost cheaper since
> there will be fewer rows and hence fewer heap page accesses because of
> more filtering. So I don't think we need any change in the current

costing model.
>

Thank you for your reply.  It looks like your comments are based on the
statistics being almost correct (or better than what I have in my example),
and that is true.  However, my goal is to figure out a way to generate a
better plan even when the statistics are not correct (statistics with such
issues are not very uncommon; I just ran into one such case and spent a week
handling some non-technical matters afterwards).  I think the current issue
is that even if my patch can make the worst case better, we need to make
sure the average performance does not get worse.

-- 
Best Regards
Andy Fan


Re: pg_dump fail to not dump public schema orders

2020-05-29 Thread David G. Johnston
On Friday, May 29, 2020, Adrien Nayrat  wrote:

> Hello,
>
> I noticed pg_dump failed to not dump creation or comment commands for
> public
> schema when we explicitly ask it to dump public schema.
>
> Shorter example: pg_dump -n public dump will give:



> [Create schema public]
>

As far as I can tell this is working as intended/documented.  The public
schema doesn’t and shouldn’t get special treatment relative to
any other user schema here.

David J.


Re: Speeding up parts of the planner using a binary search tree structure for nodes

2020-05-29 Thread Ashutosh Bapat
On Fri, May 29, 2020 at 10:47 AM David Rowley  wrote:
>
> Hi,
>
> In [1] I mentioned that I'd like to look into coding a data structure
> to allow Node types to be looked up more efficiently than what List
> allows.  There are many places in the planner where we must determine
> if a given Node is in some List of Nodes. This is an O(n) operation.
> We could reduce these to as low as O(log n) with a binary search tree.
>
> A few places in the code that does this which can become a problem
> when the lists being searched are large:
>
> get_eclass_for_sort_expr()

eclass members have relids as their key. Can we use a hash table instead,
like how we manage join relations? I know that there are other cases
where we search for a subset of relids instead of an exact match, so the
hash table has to account for that. If we could somehow create a hash
value out of an expression node (we can hash almost anything, so that
should be possible) we should be able to use a hash table for searching
expressions as well.

> tlist_member()

a hash table with the expression as key here as well?

>
> For the former, it's quite easy to see this come up in the profiles by
> querying a partitioned table with many partitions. e.g
>
> create table listp (a int) partition by list(a);
> select 'create table listp'||x::Text || ' partition of listp for
> values in('||x::text||');' from generate_Series(1,1)x;
> \gexec
> create index on listp(a);
> explain select * from listp order by a;
>
> There are a few places that this new data structure could be used. My
> primary interest at the moment is EquivalenceClass.ec_members. There
> are perhaps other interesting places, such as planner targetlists, but
> obviously, any changes would need to be proved worthwhile before
> they're made.
>
> Per what I mentioned in [1], I'd like this structure to be a binary
> search tree, (either a red/black or AVL binary search tree compacted
> into an array.) So instead of having ->left ->right pointers, we have
> left and right offsets back into the array.  This allows very fast and
> cache-friendly unordered traversals of the tree, as it's an array
> whenever we want it to be and a tree when we need to perform a lookup
> of a specific value. Because I'm not using a true tree, i.e pointers
> to elements, then it does not appear like I can use rbtree.c
>
> The first step to make this work is to modify equalfuncs.c to add an
> equalCompare() function which will return an int to indicate the sort
> order of the node against another node. I've drafted a patch to
> implement this and attached it here. I've done this in 2 phases, 0001
> creates and implements datatype specific macros for the comparisons.
> Doing this allows us to optimise the bool/char field comparison macros
> in 0002 so we can do a simple subtraction rather than 2
> comparisons. The 0002 patch modifies each comparison macro and changes
> all the equal functions to return int rather than bool.  The external
> comparison function is equalCompare(). equal() just calls
> equalCompare() and checks the value returned is 0.
>
> With both patches, there is an increase in size of about 17% for the
> object file for equalfuncs:
>
> equalfuncs.o
> Master: 56768 bytes
> Patched: 66912 bytes
>
> If I don't use the macros specifically optimised for bool and char,
> then the size is 68240 bytes. So I think doing 0001 is worth it.
>
> This does break the API for ExtensibleNodeMethods as it's no longer
> enough to just have nodeEqual(). In the 0002 patch I've changed this
> to nodeCompare(). Extension authors who are using this will need to
> alter their code to implement a proper comparison function.
>
> For now, I'm mostly just interested in getting feedback about making
> this change to equalfuncs.c.  Does anyone object to it?
>
> David
>
> [1] 
> https://www.postgresql.org/message-id/CAKJS1f8v-fUG8YpaAGj309ZuALo3aEk7f6cqMHr_AVwz1fKXug%40mail.gmail.com



-- 
Best Wishes,
Ashutosh Bapat




Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.

2020-05-29 Thread Ashutosh Bapat
On Fri, May 29, 2020 at 6:40 AM Andy Fan  wrote:
>
>
>>
>> >so we need to optimize the cost model for such case, the method is the
>> >patch I mentioned above.
>>
>> Making the planner more robust w.r.t. to estimation errors is nice, but
>> I wouldn't go as far saying we should optimize for such cases. The stats
>> can be arbitrarily off, so should we expect the error to be 10%, 100% or
>> 100%?
>
>
> I don't think my patch relay on anything like that.   My patch doesn't fix the
> statistics issue,  just adding the extra cost on qual cost on Index Filter 
> part.
> Assume the query pattern are where col1= X and col2 = Y. The impacts are :
> 1).  Make the cost of (col1, other_column) is higher than (col1, col2)
> 2). The relationship between seqscan and index scan on index (col1, 
> other_column)
> is changed, (this is something I don't want).  However my cost difference 
> between
> index scan & seq scan usually very huge, so the change above should has
> nearly no impact on that choice.   3). Make the cost higher index scan for
> Index (col1) only.  Overall I think nothing will make thing worse.

When the statistics is almost correct (or better than what you have in
your example), the index which does not cover all the columns in all
the conditions will be expensive anyways because of extra cost to
access heap for the extra rows not filtered by that index. An index
covering all the conditions would have its scan cost cheaper since
there will be fewer rows and hence fewer heap page accesses because of
more filtering. So I don't think we need any change in the current
costing model.
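
A minimal sketch of the two index shapes being compared here (table and
column names are made up, not taken from the thread):

CREATE TABLE t (col1 int, col2 int, other_column int);
CREATE INDEX t_col1_col2 ON t (col1, col2);           -- both quals become Index Conds
CREATE INDEX t_col1_other ON t (col1, other_column);  -- col2 = Y can only be a Filter
EXPLAIN SELECT * FROM t WHERE col1 = 1 AND col2 = 2;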

>
>>
>> We'd probably end up with plans that handle worst cases well,
>> but the average performance would end up being way worse :-(
>>
>
> That's possible,  that's why I hope to get some feedback on that.  Actually I
> can't think out such case.   can you have anything like that in mind?
>
> 
> I'm feeling that (qpqual_cost.per_tuple * 1.001) is not good enough since user
> may have some where expensive_func(col1) = X.   we may change it
> cpu_tuple_cost + qpqual_cost.per_tuple  + (0.0001) * list_length(qpquals).
>
> --
> Best Regards
> Andy Fan



-- 
Best Wishes,
Ashutosh Bapat




Re: password_encryption default

2020-05-29 Thread Tom Lane
Stephen Frost  writes:
> * Jonathan S. Katz (jk...@postgresql.org) wrote:
>> By that logic, I would +1 removing ENCRYPTED & UNENCRYPTED, given
>> ENCRYPTED effectively has no meaning either after all this time too.
>> Perhaps a stepping stone is to emit a deprecation warning on PG14 and
>> remove in PG15, but I think it's safe to remove.

> We're terrible about that, and people reasonably complain about such
> things because we don't *know* we're gonna remove it in 15.

If we're changing associated defaults, there's already some risk of
breaking badly-written applications.  +1 for just removing these
keywords in v14.

regards, tom lane




Re: password_encryption default

2020-05-29 Thread Jonathan S. Katz
On 5/29/20 9:22 AM, Stephen Frost wrote:
> Greetings,
> 
> * Jonathan S. Katz (jk...@postgresql.org) wrote:
>> On 5/29/20 3:33 AM, Michael Paquier wrote:
>>> On Thu, May 28, 2020 at 02:53:17PM +0200, Peter Eisentraut wrote:
 More along these lines: We could also remove the ENCRYPTED and UNENCRYPTED
 keywords from CREATE and ALTER ROLE.  AFAICT, these have never been emitted
 by pg_dump or psql, so there are no concerns from that end.  Thoughts?
>>>
>>> +0.5.  I think that you have a good point about the removal of
>>> UNENCRYPTED (one keyword gone!) as we don't support it since 10.  For
>>> ENCRYPTED, I'd rather keep it around for compatibility reasons for a
>>> longer time, just to be on the safe side.
>>
>> By that logic, I would +1 removing ENCRYPTED & UNENCRYPTED, given
>> ENCRYPTED effectively has no meaning either after all this time too. If
>> it's not emitted by any of our scripts, and it's been effectively moot
>> for 4 years (by the time of PG14), and we've been saying in the docs "the
>> ENCRYPTED keyword has no effect, but is accepted for backwards
>> compatibility" I think we'd be safe with removing it.
>>
>> Perhaps a stepping stone is to emit a deprecation warning on PG14 and
>> remove in PG15, but I think it's safe to remove.
> 
> We're terrible about that, and people reasonably complain about such
> things because we don't *know* we're gonna remove it in 15.
> 
> I'll argue again for the approach I mentioned before somewhere: when we
> commit the patch for 14, we go back and update the older docs to note
> that it's gone as of v14.  Deprecation notices and other such don't work
> and we instead end up carrying legacy things on forever.

Yeah, my first preference is to just remove it. I'm ambivalent towards
updating the older docs, but I do think it would be helpful.

Jonathan





Re: password_encryption default

2020-05-29 Thread Stephen Frost
Greetings,

* Jonathan S. Katz (jk...@postgresql.org) wrote:
> On 5/29/20 3:33 AM, Michael Paquier wrote:
> > On Thu, May 28, 2020 at 02:53:17PM +0200, Peter Eisentraut wrote:
> >> More along these lines: We could also remove the ENCRYPTED and UNENCRYPTED
> >> keywords from CREATE and ALTER ROLE.  AFAICT, these have never been emitted
> >> by pg_dump or psql, so there are no concerns from that end.  Thoughts?
> > 
> > +0.5.  I think that you have a good point about the removal of
> > UNENCRYPTED (one keyword gone!) as we don't support it since 10.  For
> > ENCRYPTED, I'd rather keep it around for compatibility reasons for a
> > longer time, just to be on the safe side.
> 
> By that logic, I would +1 removing ENCRYPTED & UNENCRYPTED, given
> ENCRYPTED effectively has no meaning either after all this time too. If
> it's not emitted by any of our scripts, and it's been effectively moot
> for 4 years (by the time of PG14), and we've been saying in the docs "the
> ENCRYPTED keyword has no effect, but is accepted for backwards
> compatibility" I think we'd be safe with removing it.
> 
> Perhaps a stepping stone is to emit a deprecation warning on PG14 and
> remove in PG15, but I think it's safe to remove.

We're terrible about that, and people reasonably complain about such
things because we don't *know* we're gonna remove it in 15.

I'll argue again for the approach I mentioned before somewhere: when we
commit the patch for 14, we go back and update the older docs to note
that it's gone as of v14.  Deprecation notices and other such don't work
and we instead end up carrying legacy things on forever.

Thanks,

Stephen




Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread Sergei Kornilov
Hi

> Sergei, a few questions:
>
> - Just to be clear, your recipe does not require any indexes, right? Because 
> the constraint check table scan is inherently concurrent?

Right. "alter table validate constraint" can not use indexes, but does not 
block concurrent read/write queries. Other queries in this scenario can not use 
indexes too, but should be fast.

> - Was this new behavior mentioned in the release notes?

Yes, small note in documentation and small note in release notes 
https://www.postgresql.org/docs/12/release-12.html this one:

> Allow ALTER TABLE ... SET NOT NULL to avoid unnecessary table scans (Sergei 
> Kornilov)
> This can be optimized when the table's column constraints can be recognized 
> as disallowing nulls.

> - Do you know if there are any blog posts etc. discussing this? (I'm 
> definitely going to write one!!)

I do not know. Personally, I mentioned this feature in only a few
Russian-speaking communities. And I just wrote an answer on dba.SO.

regards, Sergei




Re: password_encryption default

2020-05-29 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Thu, May 28, 2020 at 02:53:17PM +0200, Peter Eisentraut wrote:
> > More along these lines: We could also remove the ENCRYPTED and UNENCRYPTED
> > keywords from CREATE and ALTER ROLE.  AFAICT, these have never been emitted
> > by pg_dump or psql, so there are no concerns from that end.  Thoughts?
> 
> +0.5.  I think that you have a good point about the removal of
> UNENCRYPTED (one keyword gone!) as we don't support it since 10.  For
> ENCRYPTED, I'd rather keep it around for compatibility reasons for a
> longer time, just to be on the safe side.

It's both inaccurate and would be completely legacy at that point.

I disagree entirely about keeping it around 'for compatibility'.

Thanks,

Stephen




pg_dump fail to not dump public schema orders

2020-05-29 Thread Adrien Nayrat
Hello,

I noticed pg_dump failed to not dump creation or comment commands for public
schema when we explicitly ask it to dump public schema.

Shorter example: pg_dump -n public dump will give:

--
-- Name: public; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA public;


ALTER SCHEMA public OWNER TO postgres;

--
-- Name: SCHEMA public; Type: COMMENT; Schema: -; Owner: postgres
--

COMMENT ON SCHEMA public IS 'standard public schema';


Obviously, it triggers errors when we try to restore it, as the public schema
already exists.
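
For example, restoring such a dump into a freshly created database runs into
exactly that:

CREATE SCHEMA public;
ERROR:  schema "public" already exists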


Git bisect blame this commit (since pg11):

commit 5955d934194c3888f30318209ade71b53d29777f (refs/bisect/bad)
Author: Tom Lane 
Date:   Thu Jan 25 13:54:42 2018 -0500
Improve pg_dump's handling of "special" built-in objects.

I first tried to add an only_dump_public_schema test. I am not used to how
pg_dump tests works but I do not think it is the best approach due to how many
test I had to disable for only_dump_public_schema.

Then I tried to change selectDumpableNamespace in order to apply the same
treatment to public schema when we explicitly ask pg_dump to dump public schema.

Unfortunately this broke other tests, all related to how we handle COLLATION.
For example:

#   Failed test 'only_dump_test_schema: should not dump ALTER COLLATION test0
OWNER TO'

#   Failed test 'only_dump_test_schema: should not dump COMMENT ON COLLATION 
test0'

#   Failed test 'only_dump_test_schema: should not dump CREATE COLLATION test0
FROM "C"'

#   Failed test 'only_dump_test_schema: should not dump REVOKE CREATE ON SCHEMA
public FROM public'


Regards,
commit ea7c208e5ca1aedc20a49eb800ae60d4292086bf
Author: Adrien Nayrat 
Date:   Fri May 29 14:57:41 2020 +0200

Add pg_dump test for only dump public schema

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index e116235769..cf0468a489 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -258,6 +258,13 @@ my %pgdump_runs = (
 			'--schema=dump_test', 'postgres',
 		],
 	},
+	only_dump_public_schema => {
+dump_cmd => [
+	'pg_dump', '--no-sync',
+	"--file=$tempdir/only_dump_public_schema.sql",
+	'--schema=public', 'postgres',
+],
+	},
 	only_dump_test_table => {
 		dump_cmd => [
 			'pg_dump',
@@ -399,6 +406,7 @@ my %full_runs = (
 	no_privs => 1,
 	pg_dumpall_dbprivs   => 1,
 	pg_dumpall_exclude   => 1,
+	only_dump_public_schema  => 1,
 	schema_only  => 1,);
 
 # This is where the actual tests are defined.
@@ -417,6 +425,7 @@ my %tests = (
 		  { %full_runs, %dump_test_schema_runs, section_post_data => 1, },
 		unlike => {
 			exclude_dump_test_schema => 1,
+			only_dump_public_schema  => 1,
 			no_privs => 1,
 		},
 	},
@@ -432,7 +441,10 @@ my %tests = (
 			\QREVOKE ALL ON FUNCTIONS  FROM PUBLIC;\E
 			/xm,
 		like => { %full_runs, section_post_data => 1, },
-		unlike => { no_privs => 1, },
+		unlike => {
+			only_dump_public_schema  => 1,
+			no_privs => 1,
+		},
 	},
 
 	'ALTER DEFAULT PRIVILEGES FOR ROLE regress_dump_test_role REVOKE SELECT'
@@ -450,7 +462,10 @@ my %tests = (
 			\QGRANT INSERT,REFERENCES,DELETE,TRIGGER,TRUNCATE,UPDATE ON TABLES  TO regress_dump_test_role;\E
 			/xm,
 		like => { %full_runs, section_post_data => 1, },
-		unlike => { no_privs => 1, },
+		unlike => {
+			no_privs => 1,
+			only_dump_public_schema  => 1,
+	   	},
 	  },
 
 	'ALTER ROLE regress_dump_test_role' => {
@@ -477,13 +492,19 @@ my %tests = (
 	'ALTER FOREIGN DATA WRAPPER dummy OWNER TO' => {
 		regexp => qr/^ALTER FOREIGN DATA WRAPPER dummy OWNER TO .+;/m,
 		like   => { %full_runs, section_pre_data => 1, },
-		unlike => { no_owner => 1, },
+		unlike => {
+			no_owner => 1,
+			only_dump_public_schema  => 1,
+		},
 	},
 
 	'ALTER SERVER s1 OWNER TO' => {
 		regexp => qr/^ALTER SERVER s1 OWNER TO .+;/m,
 		like   => { %full_runs, section_pre_data => 1, },
-		unlike => { no_owner => 1, },
+		unlike => {
+			no_owner => 1,
+			only_dump_public_schema  => 1,
+		},
 	},
 
 	'ALTER FUNCTION dump_test.pltestlang_call_handler() OWNER TO' => {
@@ -496,6 +517,7 @@ my %tests = (
 		unlike => {
 			exclude_dump_test_schema => 1,
 			no_owner => 1,
+			only_dump_public_schema  => 1,
 		},
 	},
 
@@ -509,6 +531,7 @@ my %tests = (
 		unlike => {
 			exclude_dump_test_schema => 1,
 			no_owner => 1,
+			only_dump_public_schema  => 1,
 		},
 	},
 
@@ -537,7 +560,10 @@ my %tests = (
 			/xm,
 		like =>
 		  { %full_runs, %dump_test_schema_runs, section_pre_data => 1, },
-		unlike => { exclude_dump_test_schema => 1, },
+		unlike => {
+			exclude_dump_test_schema => 1,
+			only_dump_public_schema  => 1,
+	   	},
 	},
 
 	'ALTER OPERATOR CLASS dump_test.op_class OWNER TO' => {
@@ -549,6 +575,7 @@ my %tests = (
 		  { %full_runs, %dump_test_schema_runs, section_pre_data => 1, },
 		unlike => {
 			exclude_dump_test_schema => 1,
+			only_dump_public_schema  => 1,
 			no_owner => 

Re: password_encryption default

2020-05-29 Thread Jonathan S. Katz
On 5/29/20 3:33 AM, Michael Paquier wrote:
> On Thu, May 28, 2020 at 02:53:17PM +0200, Peter Eisentraut wrote:
>> More along these lines: We could also remove the ENCRYPTED and UNENCRYPTED
>> keywords from CREATE and ALTER ROLE.  AFAICT, these have never been emitted
>> by pg_dump or psql, so there are no concerns from that end.  Thoughts?
> 
> +0.5.  I think that you have a good point about the removal of
> UNENCRYPTED (one keyword gone!) as we don't support it since 10.  For
> ENCRYPTED, I'd rather keep it around for compatibility reasons for a
> longer time, just to be on the safe side.

By that logic, I would +1 removing ENCRYPTED & UNENCRYPTED, given
ENCRYPTED effectively has no meaning either after all this time too. If
it's not emitted by any of our scripts, and it's been effectively moot
for 4 years (by the time of PG14), and we've been saying in the docs "the
ENCRYPTED keyword has no effect, but is accepted for backwards
compatibility" I think we'd be safe with removing it.

Perhaps a stepping stone is to emit a deprecation warning on PG14 and
remove in PG15, but I think it's safe to remove.

Perhaps stating the obvious here, but I also think it's a separate patch
from the $SUBJECT, but glad to see the clean up :)

Jonathan





Re: Trouble with hashagg spill I/O pattern and costing

2020-05-29 Thread Tomas Vondra

On Thu, May 28, 2020 at 06:14:55PM -0700, Jeff Davis wrote:

On Thu, 2020-05-28 at 20:57 +0200, Tomas Vondra wrote:

Attached is a patch adding CP_SMALL_TLIST to create_agg_plan and
create_groupingsets_plan.


Looks good, except one question:

Why would aggstrategy ever be MIXED when in create_agg_path? Wouldn't
that only happen in create_groupingsets_path?



Ah, right. Yeah, we only need to check for AGG_HASH here. Moreover,
AGG_MIXED probably does not need the tlist tweak, because the input
should be pre-sorted as with AGG_SORTED.

And we should probably do a similar check in create_groupingsets_path,
I guess. At first I thought we can't do that before inspecting rollups,
which only happens later in the function, but now I see there's
aggstrategy in GroupingSetsPath too.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/optimizer/plan/createplan.c 
b/src/backend/optimizer/plan/createplan.c
index 9941dfe65e..eab57fd74b 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -2113,12 +2113,21 @@ create_agg_plan(PlannerInfo *root, AggPath *best_path)
Plan   *subplan;
List   *tlist;
List   *quals;
+   int flags;
 
/*
 * Agg can project, so no need to be terribly picky about child tlist, 
but
-* we do need grouping columns to be available
+* we do need grouping columns to be available. We are a bit more 
careful
+* with hash aggregate, where we explicitly request small tlist to 
minimize
+* I/O needed for spilling (possibly not known already).
 */
-   subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);
+   flags = CP_LABEL_TLIST;
+
+   /* ensure small tlist for hash aggregate */
+   if (best_path->aggstrategy == AGG_HASHED)
+   flags |= CP_SMALL_TLIST;
+
+   subplan = create_plan_recurse(root, best_path->subpath, flags);
 
	tlist = build_path_tlist(root, &best_path->path);
 
@@ -2200,6 +2209,7 @@ create_groupingsets_plan(PlannerInfo *root, 
GroupingSetsPath *best_path)
int maxref;
List   *chain;
ListCell   *lc;
+   int flags;
 
/* Shouldn't get here without grouping sets */
Assert(root->parse->groupingSets);
@@ -2207,9 +2217,17 @@ create_groupingsets_plan(PlannerInfo *root, 
GroupingSetsPath *best_path)
 
/*
 * Agg can project, so no need to be terribly picky about child tlist, 
but
-* we do need grouping columns to be available
+* we do need grouping columns to be available. We are a bit more 
careful
+* with hash aggregate, where we explicitly request small tlist to 
minimize
+* I/O needed for spilling (possibly not known already).
 */
-   subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);
+   flags = CP_LABEL_TLIST;
+
+   /* ensure small tlist for hash aggregate */
+   if (best_path->aggstrategy == AGG_HASHED)
+   flags |= CP_SMALL_TLIST;
+
+   subplan = create_plan_recurse(root, best_path->subpath, flags);
 
/*
 * Compute the mapping from tleSortGroupRef to column index in the 
child's


Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread Komяpa
>
>
>
> John, I think it's worth pointing out that Postgres most likely does a
> full table scan to validate a constraint by design and not in optimization
> oversight.  Think of what's gonna happen if the index used for checking is
> corrupted?
>
>
This can't be true: a corrupted index is a failure mode, and failure modes
are not expected in the normal flow. By that logic we must never use an
index scan at all, because if the index is corrupted the results are going
to be disastrous, so we would always do Seq Scans.

It's OK to assume the index is not corrupted.



-- 
Darafei Praliaskouski
Support me: http://patreon.com/komzpa


Re: OpenSSL 3.0.0 compatibility

2020-05-29 Thread Daniel Gustafsson
> On 29 May 2020, at 13:34, Peter Eisentraut  
> wrote:
> 
> On 2020-05-29 00:16, Daniel Gustafsson wrote:
>> Regarding the deprecations, we can either set preprocessor directives or use
>> compiler flags to silence the warning and do nothing (for now), or we could
>> update to the new API.  We probably want to do different things for master vs
>> back-branches, but as an illustration of what the latter could look like I've
>> implemented this in 0001.
> 
> I think we should set OPENSSL_API_COMPAT=10001, and move that along with 
> whatever our oldest supported release is going forward.  That declares our 
> intention, it will silence the deprecation warnings, and IIUC, if the 
> deprecated stuff actually gets removed, you get a clean compiler error that 
> your API level is too low.

I think I know what you mean but just to clarify: in master, back-branches, or
all of the above?

Considering how little effort it is to not use the deprecated APIs, I'm not
entirely convinced, but I don't have too strong opinions there.

>> OpenSSL also deprecates DES keys in 3.0.0, which cause our password callback
>> tests to fail with the cryptic error "fetch failed", as the test suite keys 
>> are
>> encrypted with DES.  0002 fixes this by changing to AES256 (randomly chosen
>> among the ciphers supported in 1.0.1+ and likely to be around), and could be
>> applied already today as there is nothing 3.0.0 specific about it.
> 
> It appears you can load a "legacy provider" to support these keys.  What if 
> someone made a key using 0.9.* with an older PostgreSQL version and keeps 
> using the same key?  I'm wondering about the implications in practice here.  
> Obviously moving off legacy crypto is good in general.

If they do, then that key will stop working with any OpenSSL 3 enabled software
unless the legacy provider has been loaded.  My understanding is that users can
load these in openssl.conf, so maybe it's mostly a documentation patch for us?

If key loading fails with an error message that indicates that the algorithm
couldn't be fetched (i.e. "fetch failed"), we could add an errhint on algorithm
use.  Currently it's easy to believe that it's the key file that couldn't be
fetched, as the error message from OpenSSL is quite cryptic and expects the
user to understand OpenSSL terminology.  This could happen already in pre-3
versions, so maybe it's worthwhile to do?

> There is also the question of what to do with the test suites in the back 
> branches.

If we don't want to change the testdata in the backbranches, we could add a
SKIP section for the password key tests iff OpenSSL is 3.0.0+?

> Maybe we will want some user-exposed option about which providers to load, 
> since that also affects whether the FIPS provider gets loaded. What's the 
> plan there?

This again boils down to if we want to load providers, or if we want to punt
that to openssl.conf completely.  Since we will support older versions for a
long time still, maybe punting will render the cleanest code?

AFAICT, if care is taken with openssl.conf one could already load providers
such that postgres will operate in FIPS mode due to nothing non-FIPS being
available.  For libpq I'm not sure if there is anything to do, as we don't
mandate any cipher use (SSL_CTX_set_cipher_list will simply fail IIUC).  For
pgcrypto however it's another story, but there we'd need to rewrite it to not
use low-level APIs first since the use of those aren't FIPS compliant.  Once
done, we could have an option for FIPS mode and pass the "fips=yes" property
when loading ciphers to make sure the right version is loaded if multiple ones
are available.

cheers ./daniel



Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread John Bachir
Wow! Thank you Sergei for working on this patch, for working for months/years 
to get it in, and for replying to my email!

For others reading this later:
- the feature was introduced in 12
- the commit is here 
https://github.com/postgres/postgres/commit/bbb96c3704c041d139181c6601e5bc770e045d26

Sergei, a few questions:

- Just to be clear, your recipe does not require any indexes, right? Because 
the constraint check table scan is inherently concurrent?
- Was this new behavior mentioned in the release notes?
- Do you know if there are any blog posts etc. discussing this? (I'm definitely 
going to write one!!)

John


> 
> But the answer in SO is a bit incomplete for recent postgresql 
> releases. Seqscan is not the only possible way to set not null in 
> pg12+. My patch was commited ( 
> https://commitfest.postgresql.org/22/1389/ ) and now it's possible to 
> do this way:
> 
> alter table foos 
>  add constraint foos_not_null 
>  check (bar1 is not null) not valid; -- short-time exclusive lock
> 
> alter table foos validate constraint foos_not_null; -- still seqscan 
> entire table but without exclusive lock
> 
> An then another short lock:
> alter table foos alter column bar1 set not null;
> alter table foos drop constraint foos_not_null;
> 
> regards, Sergei
>




Re: OpenSSL 3.0.0 compatibility

2020-05-29 Thread Peter Eisentraut

On 2020-05-29 00:16, Daniel Gustafsson wrote:

Regarding the deprecations, we can either set preprocessor directives or use
compiler flags to silence the warning and do nothing (for now), or we could
update to the new API.  We probably want to do different things for master vs 
back-branches, but as an illustration of what the latter could look like I've
implemented this in 0001.


I think we should set OPENSSL_API_COMPAT=10001, and move that along with 
whatever our oldest supported release is going forward.  That declares 
our intention, it will silence the deprecation warnings, and IIUC, if 
the deprecated stuff actually gets removed, you get a clean compiler 
error that your API level is too low.



OpenSSL also deprecates DES keys in 3.0.0, which cause our password callback
tests to fail with the cryptic error "fetch failed", as the test suite keys are
encrypted with DES.  0002 fixes this by changing to AES256 (randomly chosen
among the ciphers supported in 1.0.1+ and likely to be around), and could be
applied already today as there is nothing 3.0.0 specific about it.


It appears you can load a "legacy provider" to support these keys.  What 
if someone made a key using 0.9.* with an older PostgreSQL version and 
keeps using the same key?  I'm wondering about the implications in 
practice here.  Obviously moving off legacy crypto is good in general.


There is also the question of what to do with the test suites in the 
back branches.


Maybe we will want some user-exposed option about which providers to 
load, since that also affects whether the FIPS provider gets loaded. 
What's the plan there?


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-05-29 Thread Amit Kapila
On Wed, May 27, 2020 at 5:19 PM Mahendra Singh Thalor 
wrote:

> On Tue, 26 May 2020 at 16:46, Amit Kapila  wrote:
>
> Hi all,
> On the top of v16 patch set [1], I did some testing for DDL's and DML's to
> test wal size and performance. Below is the testing summary;
>
> *Test parameters:*
> wal_level = 'logical'
> max_connections = '150'
> wal_receiver_timeout = '600s'
> max_wal_size = '2GB'
> min_wal_size = '2GB'
> autovacuum= 'off'
> checkpoint_timeout= '1d'
>
> *Test results:*
>
> SN  Operation        Run             CREATE index            Add col int(date)       Add col text
>                                      LSN diff   time (sec)   LSN diff    time (sec)  LSN diff   time (sec)
> 1   1 DDL            without patch   17728      0.89116      976         0.764393    33904      0.80044
>                      with patch      18016      0.804868     1088        0.763602    34856      0.787108
>                      % LSN change    1.624548                11.475409               2.80792
>
> 2   2 DDL            without patch   19872      0.860348     1632        0.763199    34560      0.806086
>                      with patch      20416      0.839065     1856        0.733147    35624      0.829281
>                      % LSN change    2.73752                 13.7254902              3.078703
>
> 3   3 DDL            without patch   22016      0.894891     2288        0.776871    35216      0.803493
>                      with patch      22816      0.828028     2624        0.737177    36392      0.800194
>                      % LSN change    3.63372093              14.685314               3.339391186
>
> 4   4 DDL            without patch   24160      0.901686     2944        0.768445    35872      0.77489
>                      with patch      25240      0.887143     3392        0.768382    37160      0.82777
>                      % LSN change    4.4701986               15.217391               3.590544
>
> 5   5 DDL            without patch   26328      0.901686     3600        0.751879    36528      0.817928
>                      with patch      27640      0.914078     4160        0.74709     37928      0.820621
>                      % LSN change    4.9832877               15.55                   3.832676
>
> 6   6 DDL            without patch   28472      0.936385     4256        0.745179    37184      0.797043
>                      with patch      30040      0.958226     4928        0.725321    38696      0.814535
>                      % LSN change    5.5071649               15.78947368             4.066265
>
> 7   8 DDL            without patch   32760      1.0022203    5568        0.757468    38496      0.83207
>                      with patch      34864      0.966777     6464        0.769072    40232      0.903604
>                      % LSN change    6.422466                16.091954               4.509559
>
> 8   11 DDL           without patch   50296      1.0022203    7536        0.748332    40464      0.822266
>                      with patch      53144      0.966777     8792        0.750553    42560      0.797133
>                      % LSN change    5.662478                16.66                   5.179913
>
> 9   15 DDL           without patch   58896      1.267253     10184       0.776875    43112      0.821916
>                      with patch      62768      1.27234      11864       0.746844    45632      0.812567
>                      % LSN change    5.662478                16.496465               5.84524
>
> 10  1 DDL & 3 DML    without patch   18240      0.812551     1192        0.771993    34120      0.849467
>                      with patch      18536      0.819089     1312        0.785117    35080      0.855456
>                      % LSN change    1.6228                  10.067114               2.8113599
>
> 11  3 DDL & 5 DML    without patch   23656      0.926616     2656        0.758029    35584      0.829377
>                      with patch      24480      0.915517     3016        0.797206    36784      0.839176
>                      % LSN change    3.4832606               13.55421687             3.372302
>
> 12  10 DDL & 5 DML   without patch   52760      1.101005     7288        0.763065    40216      0.837843
>                      with patch      55376      1.105241     8456        0.779257    42224      0.835206
>                      % LSN change    4.958301744             16.02634468             4.993037
>
> 13  10 DML           without patch   1008       0.791091     1008        0.81105     1008       0.78817
>                      with patch      1072       0.807875     1072        0.771113    1072       0.759789
>                      % LSN change    6.349206                6.349206                6.349206
>
> To see all operations, please see[2] test_results
> 
>
>
Why are you seeing any additional WAL in case-13 (10 DML) where there is no
DDL?  I think it is because you have used savepoints in that case, which
will add some additional WAL.  You seem to have 9 savepoints in that test,
which should ideally generate 36 bytes of additional WAL (4 bytes per
transaction id for each subtransaction).  Also, in the other cases where you
took data for DDL and DML, you have also used savepoints in those tests.  I
suggest that for savepoints we do separate tests as you have done in case-13,
but with 3, 5, 7, and 10 savepoints, and probably each transaction can
update a row of 200 bytes or so.
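
A rough sketch of such a savepoint-only test (table and column names are made
up), comparing the WAL produced with and without the patch:

SELECT pg_current_wal_lsn();  -- note the LSN before
BEGIN;
UPDATE test_tab SET payload = repeat('a', 200) WHERE id = 1;
SAVEPOINT s1;
UPDATE test_tab SET payload = repeat('b', 200) WHERE id = 1;
SAVEPOINT s2;
UPDATE test_tab SET payload = repeat('c', 200) WHERE id = 1;
SAVEPOINT s3;
UPDATE test_tab SET payload = repeat('d', 200) WHERE id = 1;
COMMIT;
SELECT pg_current_wal_lsn();  -- note the LSN after and diff the two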

I think you can take data for somewhat more realistic cases of DDL and DML
combination like 3 DDL's with 10 DML and 3 DDL's with 15 DML operations.
In general, I think we will see many more DML's per DDL.  It is good to see
the worst-case WAL and performance overhead as you have done.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-05-29 Thread Amit Kapila
On Wed, May 27, 2020 at 8:22 PM Dilip Kumar  wrote:
>
>
> While reviewing/testing I have found a couple of problems in 0005 and
> 0006 which I have fixed in the attached version.
>
..
>
> In 0006: If we are streaming the serialized changed and there are
> still few incomplete changes, then currently we are not deleting the
> spilled file, but the spill file contains all the changes of the
> transaction because there is no way to partially truncate it.  So in
> the next stream, it will try to resend those.  I have fixed this by
> sending the spilled transaction as soon as its changes are complete so
> ideally, we can always delete the spilled file.  It is also a better
> solution because this transaction is already spilled once and that
> happened because we could not stream it,  so we better stream it on
> the first opportunity that will reduce the replay lag which is our
> whole purpose here.
>

I have reviewed these changes (in the patch
v25-0006-Bugfix-handling-of-incomplete-toast-spec-insert-) and below
are my comments.

1.
+ /*
+ * If the transaction is serialized and the the changes are complete in
+ * the top level transaction then immediately stream the transaction.
+ * The reason for not waiting for memory limit to get full is that in
+ * the streaming mode, if the transaction serialized that means we have
+ * already reached the memory limit but that time we could not stream
+ * this due to incomplete tuple so now stream it as soon as the tuple
+ * is complete.
+ */
+ if (rbtxn_is_serialized(txn))
+ ReorderBufferStreamTXN(rb, toptxn);

I think here it is important to explain why it is a must to stream a
prior serialized transaction as otherwise, later we won't be able to
know how to truncate a file.

2.
+ * If complete_truncate is set we completely truncate the transaction,
+ * otherwise we truncate upto last_complete_lsn if the transaction has
+ * incomplete changes.  Basically, complete_truncate is passed true only if
+ * concurrent abort is detected while processing the TXN.
  */
 static void
-ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
+ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
+ bool partial_truncate)
 {

The description talks about complete_truncate flag whereas API is
using partial_truncate flag.  I think the description needs to be
changed.

3.
+ /* We have truncated upto last complete lsn so stop. */
+ if (partial_truncate && rbtxn_has_incomplete_tuple(toptxn) &&
+ (change->lsn > toptxn->last_complete_lsn))
+ {
+ /*
+ * If this is a top transaction then we can reset the
+ * last_complete_lsn and complete_size, because by now we would
+ * have stream all the changes upto last_complete_lsn.
+ */
+ if (txn->toptxn == NULL)
+ {
+ toptxn->last_complete_lsn = InvalidXLogRecPtr;
+ toptxn->complete_size = 0;
+ }
+ break;
+ }

I think here we can add an Assert to ensure that we don't partially
truncate when the transaction is serialized and add comments for the
same.

4.
+ /*
+ * Subtract the processed changes from the nentries/nentries_mem Refer
+ * detailed comment atop this variable in ReorderBufferTXN structure.
+ * We do this only ff we are truncating the partial changes otherwise
+ * reset these values directly to 0.
+ */
+ if (partial_truncate)
+ {
+ txn->nentries -= txn->nprocessed;
+ txn->nentries_mem -= txn->nprocessed;
+ }
+ else
+ {
+ txn->nentries = 0;
+ txn->nentries_mem = 0;
+ }

I think we can write this comment as "Adjust nentries/nentries_mem
based on the changes processed.  See comments where nprocessed is
declared."

5.
+ /*
+ * In streaming mode, sometime we can't stream all the changes due to the
+ * incomplete changes.  So we can not directly reset the values of
+ * nentries/nentries_mem to 0 after one stream is sent like we do in
+ * non-streaming mode.  So while sending one stream we keep count of the
+ * changes processed in thi stream and only those many changes we decrement
+ * from the nentries/nentries_mem.
+ */
+ uint64 nprocessed;

How about something like: "Number of changes processed.  This is used
to keep track of changes that remained to be streamed.  As of now,
this can happen either due to toast tuples or speculative insertions
where we need to wait for multiple changes before we can send them."

6.
+ /* Size of the commplete changes. */
+ Size complete_size;

Typo. /commplete/complete

7.
+ /*
+ * Increment the nprocessed count.  See the detailed comment
+ * for usage of this in ReorderBufferTXN structure.
+ */
+ change->txn->nprocessed++;

Ideally, this has to be incremented after processing the change.  So,
we can combine it with existing check in the patch as below:

if (streaming)
{
   change->txn->nprocessed++;

  if (rbtxn_has_incomplete_tuple(txn) &&
prev_lsn == txn->last_complete_lsn)
{
/* Only in streaming mode we should get here. */
Assert(streaming);
partial_truncate = true;
break;
}
}

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical () at walsender.c:2762

2020-05-29 Thread Masahiko Sawada
On Fri, 29 May 2020 at 17:57, Kyotaro Horiguchi  wrote:
>
> At Fri, 29 May 2020 16:21:38 +0900, Michael Paquier  
> wrote in
> > On Thu, May 28, 2020 at 06:11:39PM +0900, Kyotaro Horiguchi wrote:
> > > Mmm. It is not the proper way to use physical replication and it's
> > > totally accidental that that worked (or even it might be a bug). The
> > > documentation is saying as the follows, as more-or-less the same for
> > > all versions since 9.4.
> > >
> > > https://www.postgresql.org/docs/13/protocol-replication.html
> >
> > +   if (am_db_walsender)
> > +   ereport(ERROR,
> > +   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > +errmsg("cannot initiate physical
> > replication on a logical replication connection")));
> >
> > I don't agree with this change.  The only restriction that we have in
> > place now in walsender.c regarding MyDatabaseId not being set is to
> > prevent the execution of SQL commands.  Note that it is possible to
> > start physical replication even if MyDatabaseId is set in a
> > replication connection, so you could break cases that have been valid
> > until now.
>
> It doesn't check MyDatabase, but whether the connection parameter
> "replication" is "true" or "database".  The documentation is telling
> that "replication" should be "true" for a connection that is to be
> used for physical replication, and "replication" should literally be
> "database" for a connection that is for logical replication.  We need
> to revise the documentation if we are going to allow physical
> replication on a connection with "replication = database".
>

Yes. Conversely, if we start logical replication in a physical
replication connection (i.e. replication=true) we get an error before
starting replication:

ERROR:  logical decoding requires a database connection

I think we can prevent that SEGV in a similar way.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical () at walsender.c:2762

2020-05-29 Thread Kyotaro Horiguchi
At Fri, 29 May 2020 16:21:38 +0900, Michael Paquier  wrote 
in 
> On Thu, May 28, 2020 at 06:11:39PM +0900, Kyotaro Horiguchi wrote:
> > Mmm. It is not the proper way to use physical replication and it's
> > totally accidental that that worked (or even it might be a bug). The
> > documentation is saying as the follows, as more-or-less the same for
> > all versions since 9.4.
> > 
> > https://www.postgresql.org/docs/13/protocol-replication.html
> 
> +   if (am_db_walsender)
> +   ereport(ERROR,
> +   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +errmsg("cannot initiate physical
> replication on a logical replication connection")));
> 
> I don't agree with this change.  The only restriction that we have in
> place now in walsender.c regarding MyDatabaseId not being set is to
> prevent the execution of SQL commands.  Note that it is possible to
> start physical replication even if MyDatabaseId is set in a
> replication connection, so you could break cases that have been valid
> until now.

It doesn't check MyDatabase, but whether the connection parameter
"replication" is "true" or "database".  The documentation is telling
that "replication" should be "true" for a connection that is to be
used for physical replication, and "replication" should literally be
"database" for a connection that is for logical replication.  We need
to revise the documentation if we are going to allow physical
replication on a connection with "replication = database".

> I think that we actually should be much more careful with the
> initialization of the WAL reader used in the context of a WAL sender
> before calling WALRead() and attempting to read a new WAL page.

I agree that the initialization can be improved, but the current code
is not a problem if we don't allow running both logical and physical
replication in a single session.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Does PG server process keep backend info

2020-05-29 Thread brajmohan saxena
Hi,

Is there any extension or option in PG to keep information about a backend
process (its memory contexts or some memory addresses) in the postgres
server/postmaster process after the backend gets killed?

Thanks
Braj


RE: archive status ".ready" files may be created too early

2020-05-29 Thread matsumura....@fujitsu.com

2020-03-26 18:50:24 Bossart, Nathan  wrote:
> The v3 patch is a proof-of-concept patch that moves the ready-for-
> archive logic to the WAL writer process.  We mark files as ready-for-
> archive when the WAL flush pointer has advanced beyond a known WAL
> record boundary.


I like such a simple resolution, but I cannot agree with it.

1.
This patch makes wal_writer_delay have two meanings. For example,
a user setting the parameter to a bigger value gets archived files
later.

2.
Even if we create a new parameter, we and users cannot determine the
best value.

3.
PostgreSQL guarantees that if a database cluster is stopped smartly,
the cluster has flushed and archived all WAL records, as follows.

 [xlog.c]
  * If archiving is enabled, rotate the last XLOG file so that all the
  * remaining records are archived (postmaster wakes up the archiver
  * process one more time at the end of shutdown). The checkpoint
  * record will go to the next XLOG file and won't be archived (yet).

Therefore, the idea may need shutdown synchronization between WalWriter
and the archiver (pgarch).  I cannot agree with that because the shutdown
processing is inherently complex and the synchronization would make it more
complicated.  Your idea gives up promptness of notification in exchange for
simplicity, but I think the synchronization may ruin that merit.

4.
I found that the patch can miss a chance to notify.  We have to be more
careful.  In the following case, WalWriter notifies only after a little less
than 3 times wal_writer_delay in the worst case.  That may not be acceptable
depending on the value of wal_writer_delay, and if we create a new parameter,
we cannot explain this behavior to users.

Premise:
- Seg1 has already been notified.
- FlushedPtr is 0/2D0 (= all WAL records are flushed).

-
Step 1.
Backend-A updates InsertPtr to 0/2E0, but does not
copy its WAL record to the buffer yet.

Step 2. (sleep)
WalWriter memorizes InsertPtr 0/2E0 in the local variable
(LocalInsertPtr) and sleeps because FlushedPtr has not passed
InsertPtr.

Step 3.
Backend-A copies its WAL record to the buffer.

Step 4.
Backend-B updates InsertPtr to 0/310,
copies its record to the buffer, commits (flushes it by itself),
and updates FlushedPtr to 0/310.

Step 5.
WalWriter detects that FlushedPtr(0/310) passes
LocalInsertPtr(0/2E0), but WalWriter cannot notify Seg2
even though it should be notified.

This is caused by the fact that WalWriter does not know
which record crosses the segment boundary.

Then, in the worst case, Seg2 is notified only after two more
sleeps waiting for FlushedPtr to pass InsertPtr again.

Step 6. (sleep)
WalWriter sleeps.

Step 7.
Backend-C inserts a WAL record, flushes it, and updates as follows:
InsertPtr --> 0/320
FlushedPtr --> 0/320

Step 8.
Backend-D updates InsertPtr to 0/330, but does not copy
its record to the buffer yet.

Step 9. (sleep)
WalWriter memorizes InsertPtr 0/330 in LocalInsertPtr
and sleeps because FlushedPtr is still 0/320.

Step 10.
Backend-D copies its record.

Step 11.
Someone (Backend-X or WalWriter) flushes and updates FlushedPtr
to 0/330.

Step 12.
WalWriter detects that FlushedPtr(0/330) passes
LocalInsertPtr(0/330) and notifies Seg2.
-


I'm preparing a patch in which a backend inserting a WAL record that crosses
a segment boundary leaves its EndRecPtr, and whoever flushes past it checks
the EndRecPtr and notifies.


Regards
Ryo Matsumura


Re: password_encryption default

2020-05-29 Thread Michael Paquier
On Thu, May 28, 2020 at 02:53:17PM +0200, Peter Eisentraut wrote:
> More along these lines: We could also remove the ENCRYPTED and UNENCRYPTED
> keywords from CREATE and ALTER ROLE.  AFAICT, these have never been emitted
> by pg_dump or psql, so there are no concerns from that end.  Thoughts?

+0.5.  I think that you have a good point about the removal of
UNENCRYPTED (one keyword gone!) as we don't support it since 10.  For
ENCRYPTED, I'd rather keep it around for compatibility reasons for a
longer time, just to be on the safe side.
--
Michael




Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread Oleksandr Shulgin
On Fri, May 29, 2020 at 8:56 AM Sergei Kornilov  wrote:

> Hello
>
> Correct index lookup is a difficult task. I tried to implement this
> previously...
>
> But the answer in SO is a bit incomplete for recent postgresql releases.
> Seqscan is not the only possible way to set not null in pg12+. My patch was
> commited ( https://commitfest.postgresql.org/22/1389/ ) and now it's
> possible to do this way:
>
> alter table foos
>  add constraint foos_not_null
>  check (bar1 is not null) not valid; -- short-time exclusive lock
>
> alter table foos validate constraint foos_not_null; -- still seqscan
> entire table but without exclusive lock
>
> An then another short lock:
> alter table foos alter column bar1 set not null;
> alter table foos drop constraint foos_not_null;
>

That's really good to know, Sergei!

John, I think it's worth pointing out that Postgres most likely does a full
table scan to validate a constraint by design and not in optimization
oversight.  Think of what's gonna happen if the index used for checking is
corrupted?

Cheers,
--
Alex


Re: race condition when writing pg_control

2020-05-29 Thread Fujii Masao




On 2020/05/27 16:10, Michael Paquier wrote:

On Tue, May 26, 2020 at 07:30:54PM +, Bossart, Nathan wrote:

While an assertion in UpdateControlFile() would not have helped us
catch the problem I initially reported, it does seem worthwhile to add
it.  I have attached a patch that adds this assertion and also
attempts to fix XLogReportParameters().  Since there is only one place
where we feel it is safe to call UpdateControlFile() without a lock, I
just changed it to take the lock.  I don't think this adds any sort of
significant contention risk, and IMO it is a bit cleaner than the
boolean flag.


Let's see what Fujii-san and Thomas think about that.  I'd rather
avoid taking a lock here because we don't need it and because it makes
things IMO confusing with the beginning of StartupXLOG() where a lot
of the fields are read, even if we go without this extra assertion.


I have no strong opinion about this, but I tend to agree with Michael here.
 

For the XLogReportParameters() fix, I simply added an exclusive lock
acquisition for the portion that updates the values in shared memory
and calls UpdateControlFile().  IIUC the first part of this function
that accesses several ControlFile values should be safe, as none of
them can be updated after server start.


They can get updated when replaying a XLOG_PARAMETER_CHANGE record.
But you are right as all of this happens in the startup process, so
your patch looks right to me here.


LGTM.
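
For reference, the pattern being approved here for XLogReportParameters()
is roughly the following sketch (not the exact hunk from Nathan's patch):

	/* Take ControlFileLock while updating shared values and pg_control. */
	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
	ControlFile->MaxConnections = MaxConnections;
	/* ... update the remaining parameter fields ... */
	UpdateControlFile();
	LWLockRelease(ControlFileLock);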

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical () at walsender.c:2762

2020-05-29 Thread Michael Paquier
On Thu, May 28, 2020 at 06:11:39PM +0900, Kyotaro Horiguchi wrote:
> Mmm. That is not the proper way to use physical replication, and it is
> totally accidental that it worked (it might even be a bug). The
> documentation says the following, which is more or less the same for
> all versions since 9.4.
> 
> https://www.postgresql.org/docs/13/protocol-replication.html

+   if (am_db_walsender)
+       ereport(ERROR,
+               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                errmsg("cannot initiate physical replication on a logical replication connection")));

I don't agree with this change.  The only restriction that we have in
place now in walsender.c regarding MyDatabaseId not being set is to
prevent the execution of SQL commands.  Note that it is possible to
start physical replication even if MyDatabaseId is set in a
replication connection, so you could break cases that have been valid
until now.
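
For reference, the case from the subject line boils down to opening a
database-bound replication connection and issuing a physical
START_REPLICATION on it, roughly like this (database name and LSN are
arbitrary):

psql "dbname=postgres replication=database" -c "START_REPLICATION 0/1"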

I think that we actually should be much more careful with the
initialization of the WAL reader used in the context of a WAL sender
before calling WALRead() and attempting to read a new WAL page.
--
Michael




Re: OpenSSL 3.0.0 compatibility

2020-05-29 Thread Daniel Gustafsson
> On 29 May 2020, at 08:06, Laurenz Albe  wrote:

>> Regarding the deprecations, we can either set preprocessor directives or use
>> compiler flags to silence the warning and do nothing (for now), or we could
>> update to the new API.  We probably want to do different things for master vs
>> back-branches, but as an illustration of what the latter could look like I've
>> implemented this in 0001.
> 
> An important question will be: if we convert to functions that are not
> deprecated, what is the earliest OpenSSL version we can support?

The replacement functions for _locations calls are introduced together with the
deprecation in 3.0.0, so there is no overlap.
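
For the record, that is the *_load_verify_locations() family; assuming
that's what 0001 touches, the shape of the change is roughly:

/* Deprecated in 3.0.0: */
SSL_CTX_load_verify_locations(ctx, ca_file, ca_dir);

/* 3.0.0 replacements (file and directory split into separate calls): */
SSL_CTX_load_verify_file(ctx, ca_file);
SSL_CTX_load_verify_dir(ctx, ca_dir);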

For pgcrypto, that remains to be seen once it is attempted, but ideally all the
way down to 1.0.1.
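
And regarding the other route quoted above (silencing the deprecation
warnings in the back-branches), a minimal sketch would be to pin the API
compatibility level before any OpenSSL header is pulled in:

/* Sketch only: keep 3.0.0 from warning about the 1.0.1-era calls we use. */
#ifndef OPENSSL_API_COMPAT
#define OPENSSL_API_COMPAT 0x10001000L
#endif
#include <openssl/ssl.h>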

cheers ./daniel



Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread Sergei Kornilov
Hello

Correct index lookup is a difficult task. I tried to implement this 
previously...

But the answer on SO is a bit incomplete for recent PostgreSQL releases.
A seqscan is not the only possible way to set NOT NULL in pg12+. My patch was
committed ( https://commitfest.postgresql.org/22/1389/ ) and now it's possible
to do it this way:

alter table foos 
 add constraint foos_not_null 
 check (bar1 is not null) not valid; -- short-time exclusive lock

alter table foos validate constraint foos_not_null; -- still seqscan entire 
table but without exclusive lock

And then another short lock:
alter table foos alter column bar1 set not null;
alter table foos drop constraint foos_not_null;
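
Worth noting: even while the constraint is NOT VALID, it is already enforced
for new writes, so no new NULLs can appear between the steps. Using the same
foos/bar1 example:

INSERT INTO foos (bar1) VALUES (NULL);
-- ERROR:  new row for relation "foos" violates check constraint "foos_not_null"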

regards, Sergei




Re: segmentation fault using currtid and partitioned tables

2020-05-29 Thread Michael Paquier
On Thu, May 28, 2020 at 05:55:59PM -0700, Andres Freund wrote:
> And there only for very old servers (< 8.2), according to Hiroshi
> Inoue. Found that out post 12 freeze. I was planning to drop them for
> 13, but I unfortunately didn't get around to doing so :(

[... digging ...]
Ah, I think I see your point from the code.  That's related to the use
of RETURNING for ctids.

> I guess we could decide to make a freeze exception to remove them now,
> although I'm not sure the reasons for doing so are strong enough.

Not sure that's a good thing to do after beta1 for 13, but there is an
argument for that in 14.  FWIW, my company is a huge user of the ODBC
driver (perhaps the biggest one?), and we have nothing even close to
8.2.

> I concur that it seems unnecessary to make these translatable, even with
> the reduced scope from
> https://www.postgresql.org/message-id/20200526025959.GE6155%40paquier.xyz

Okay, I have switched the patch to do that.  Any comments or
objections?
--
Michael
diff --git a/src/backend/utils/adt/tid.c b/src/backend/utils/adt/tid.c
index 4ce8375eab..ac232a238f 100644
--- a/src/backend/utils/adt/tid.c
+++ b/src/backend/utils/adt/tid.c
@@ -31,6 +31,7 @@
 #include "parser/parsetree.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
 #include "utils/varlena.h"
@@ -349,6 +350,12 @@ currtid_for_view(Relation viewrel, ItemPointer tid)
 	return (Datum) 0;
 }
 
+/*
+ *	currtid_byreloid
+ *		Return the latest visible tid of a relation's tuple, associated
+ *		to the tid given in input.  This is a wrapper for currtid(), and
+ *		uses in input the OID of the relation to look at.
+ */
 Datum
 currtid_byreloid(PG_FUNCTION_ARGS)
 {
@@ -378,6 +385,11 @@ currtid_byreloid(PG_FUNCTION_ARGS)
 	if (rel->rd_rel->relkind == RELKIND_VIEW)
 		return currtid_for_view(rel, tid);
 
+	if (!RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
+		elog(ERROR, "cannot look at latest visible tid for relation \"%s.%s\"",
+			 get_namespace_name(RelationGetNamespace(rel)),
+			 RelationGetRelationName(rel));
+
 	ItemPointerCopy(tid, result);
 
 	snapshot = RegisterSnapshot(GetLatestSnapshot());
@@ -391,6 +403,12 @@ currtid_byreloid(PG_FUNCTION_ARGS)
 	PG_RETURN_ITEMPOINTER(result);
 }
 
+/*
+ *	currtid_byrelname
+ *		Return the latest visible tid of a relation's tuple, associated
+ *		to the tid given in input.  This is a wrapper for currtid2(), and
+ *		uses in input the relation name to look at.
+ */
 Datum
 currtid_byrelname(PG_FUNCTION_ARGS)
 {
@@ -415,6 +433,11 @@ currtid_byrelname(PG_FUNCTION_ARGS)
 	if (rel->rd_rel->relkind == RELKIND_VIEW)
 		return currtid_for_view(rel, tid);
 
+	if (!RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
+		elog(ERROR, "cannot look at latest visible tid for relation \"%s.%s\"",
+			 get_namespace_name(RelationGetNamespace(rel)),
+			 RelationGetRelationName(rel));
+
 	result = (ItemPointer) palloc(sizeof(ItemPointerData));
 	ItemPointerCopy(tid, result);
 
diff --git a/src/test/regress/expected/tid.out b/src/test/regress/expected/tid.out
new file mode 100644
index 00..e7e0d74780
--- /dev/null
+++ b/src/test/regress/expected/tid.out
@@ -0,0 +1,106 @@
+-- tests for functions related to TID handling
+CREATE TABLE tid_tab (a int);
+-- min() and max() for TIDs
+INSERT INTO tid_tab VALUES (1), (2);
+SELECT min(ctid) FROM tid_tab;
+  min  
+---
+ (0,1)
+(1 row)
+
+SELECT max(ctid) FROM tid_tab;
+  max  
+---
+ (0,2)
+(1 row)
+
+TRUNCATE tid_tab;
+-- Tests for currtid() and currtid2() with various relation kinds
+-- Materialized view
+CREATE MATERIALIZED VIEW tid_matview AS SELECT a FROM tid_tab;
+SELECT currtid('tid_matview'::regclass::oid, '(0,1)'::tid); -- fails
+ERROR:  tid (0, 1) is not valid for relation "tid_matview"
+SELECT currtid2('tid_matview'::text, '(0,1)'::tid); -- fails
+ERROR:  tid (0, 1) is not valid for relation "tid_matview"
+INSERT INTO tid_tab VALUES (1);
+REFRESH MATERIALIZED VIEW tid_matview;
+SELECT currtid('tid_matview'::regclass::oid, '(0,1)'::tid); -- ok
+ currtid 
+-
+ (0,1)
+(1 row)
+
+SELECT currtid2('tid_matview'::text, '(0,1)'::tid); -- ok
+ currtid2 
+--
+ (0,1)
+(1 row)
+
+DROP MATERIALIZED VIEW tid_matview;
+TRUNCATE tid_tab;
+-- Sequence
+CREATE SEQUENCE tid_seq;
+SELECT currtid('tid_seq'::regclass::oid, '(0,1)'::tid); -- ok
+ currtid 
+-
+ (0,1)
+(1 row)
+
+SELECT currtid2('tid_seq'::text, '(0,1)'::tid); -- ok
+ currtid2 
+--
+ (0,1)
+(1 row)
+
+DROP SEQUENCE tid_seq;
+-- Index, fails with incorrect relation type
+CREATE INDEX tid_ind ON tid_tab(a);
+SELECT currtid('tid_ind'::regclass::oid, '(0,1)'::tid); -- fails
+ERROR:  "tid_ind" is an index
+SELECT currtid2('tid_ind'::text, '(0,1)'::tid); -- fails
+ERROR:  "tid_ind" is an index
+DROP INDEX tid_ind;
+-- Partitioned table, no storage
+CREATE TABLE tid_part (a int) PARTITION BY RANGE (a);
+SELECT currtid('tid_part'::regclass::oid, '(0,1)'::tid); -- fails
+ERROR:  cannot look at latest 

Re: OpenSSL 3.0.0 compatibility

2020-05-29 Thread Laurenz Albe
On Fri, 2020-05-29 at 00:16 +0200, Daniel Gustafsson wrote:
> Since OpenSSL is now releasing 3.0.0-alpha versions, I took a look at using it
> with postgres to see what awaits us.  As it is now shipping in releases (with
> GA planned for Q4), users will probably soon start to test against it so I
> wanted to be prepared.
> 
> Regarding the deprecations, we can either set preprocessor directives or use
> compiler flags to silence the warning and do nothing (for now), or we could
> update to the new API.  We probably want to do different things for master vs
> back-branches, but as an illustration of what the latter could look like I've
> implemented this in 0001.

An important question will be: if we convert to functions that are not
deprecated, what is the earliest OpenSSL version we can support?

Yours,
Laurenz Albe