RE: Optionally automatically disable logical replication subscriptions on error

2021-12-01 Thread osumi.takami...@fujitsu.com
On Thursday, December 2, 2021 1:49 PM Amit Kapila wrote:
> On Thu, Dec 2, 2021 at 6:35 AM osumi.takami...@fujitsu.com wrote:
> >
> > On Wednesday, December 1, 2021 10:16 PM Amit Kapila wrote:
> > Updated the patch to include the notification.
> >
> The patch disables the subscription for non-transient errors. I am not
> sure if we can easily make the call to decide whether any particular
> error is transient or not. For example, DISK_FULL or OUT_OF_MEMORY might
> not rectify itself. Why not just allow to disable the subscription on
> any error? And then let the user check the error either in view or logs
> and decide whether it would like to enable the subscription or do
> something before it (like making space in disk, or fixing the network).
Agreed. I'll treat any errors as the trigger of the feature
in the next version.

> The other problem I see with this transient error stuff is maintaining
> the list of error codes that we think are transient. I think we need a
> discussion for each of the error_codes we are listing now and whatever
> new error_code we add in the future which doesn't seem like a good idea.
This is also true. The maintenance cost of my current implementation
didn't sound cheap.

> I think the code to deal with apply worker errors and then disable the
> subscription has some flaws. Say, while disabling the subscription if
> it leads to another error then I think the original error won't be
> reported.  Can't we simply emit the error via EmitErrorReport and then
> do AbortOutOfAnyTransaction, FlushErrorState, and any other memory
> context clean up if required and then disable the subscription after
> coming out of catch?
You are right. I'll fix related parts accordingly.

Best Regards,
Takamichi Osumi



snowball update

2021-12-01 Thread Peter Eisentraut
There is another snowball release out, and I have prepared a patch to 
integrate it.  Since it's quite big and mostly boring, I'm not attaching 
it here, but you can see it at


https://github.com/petere/postgresql/commit/11eade9302d0a737a12f193c41160fb895c0bc67.patch

The upstream release notes are here:

https://github.com/snowballstem/snowball/blob/v2.2.0/NEWS

There are no new user-visible features.




Replace uses of deprecated Python module distutils.sysconfig

2021-12-01 Thread Peter Eisentraut
With Python 3.10, configure spits out warnings about the module 
distutils.sysconfig being deprecated and scheduled for removal in Python 
3.12:


<string>:1: DeprecationWarning: The distutils.sysconfig module is
deprecated, use sysconfig instead
<string>:1: DeprecationWarning: The distutils package is deprecated and
slated for removal in Python 3.12. Use setuptools or check PEP 632 for
potential alternatives


This patch changes the uses in configure to use the module sysconfig 
instead.  The logic stays the same.  (It's basically the same module but 
as its own top-level module.)


Note that sysconfig exists since Python 2.7, so this moves the minimum 
required version up from Python 2.6.


Buildfarm impact:

gaur and prairiedog use Python 2.6 and would need to be upgraded.

Possible backpatching:

Backpatching should be considered, since surely someone will otherwise 
complain when Python 3.12 comes around.  But dropping support for Python 
versions in stable branches should be done with some care.


Python 3.10 was released Oct. 4, 2021, so it is quite new.  Python major 
releases are now yearly, so the above-mentioned Python 3.12 can be 
expected in autumn of 2023.


Current PostgreSQL releases support Python versions as follows:

PG10: 2.4+
PG11: 2.4+
PG12: 2.4+  (EOL Nov. 2024)
PG13: 2.6+
PG14: 2.6+

So unfortunately, we won't be able to EOL all versions with Python 2.4 
support before Python 3.12 arrives.


I suggest leaving the backbranches alone for now.  At the moment, we 
don't even know whether additional changes will be required for 3.12 
(and 3.11) support, so the overall impact isn't known yet.  In a few 
months, we will probably know more about this.


In the meantime, the warnings can be silenced using

export PYTHONWARNINGS='ignore::DeprecationWarning'

(It ought to be possible to be more specific, like 
'ignore::DeprecationWarning:distutils.sysconfig', but it doesn't seem to 
work for me.)


(I don't recommend putting that into configure, since then we wouldn't
be able to learn about issues like this.)

From fcfd4f1702f0cef7d0f6564aff94c4a6258f24e6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 2 Dec 2021 08:08:44 +0100
Subject: [PATCH] Replace uses of deprecated Python module distutils.sysconfig

With Python 3.10, configure spits out warnings about the module
distutils.sysconfig being deprecated and scheduled for removal in
Python 3.12.  Change the uses in configure to use the module sysconfig
instead.  The logic stays the same.

Note that sysconfig exists since Python 2.7, so this moves the minimum
required version up from Python 2.6.
---
 config/python.m4   | 28 ++--
 configure  | 30 +++---
 doc/src/sgml/installation.sgml |  4 ++--
 3 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/config/python.m4 b/config/python.m4
index d41aeb2876..8ca1eaa64b 100644
--- a/config/python.m4
+++ b/config/python.m4
@@ -37,28 +37,28 @@ python_majorversion=`echo "$python_fullversion" | sed '[s/^\([0-9]*\).*/\1/]'`
python_minorversion=`echo "$python_fullversion" | sed '[s/^[0-9]*\.\([0-9]*\).*/\1/]'`
python_version=`echo "$python_fullversion" | sed '[s/^\([0-9]*\.[0-9]*\).*/\1/]'`
# Reject unsupported Python versions as soon as practical.
-if test "$python_majorversion" -lt 3 -a "$python_minorversion" -lt 6; then
-  AC_MSG_ERROR([Python version $python_version is too old (version 2.6 or later is required)])
+if test "$python_majorversion" -lt 3 -a "$python_minorversion" -lt 7; then
+  AC_MSG_ERROR([Python version $python_version is too old (version 2.7 or later is required)])
 fi
 
-AC_MSG_CHECKING([for Python distutils module])
-if "${PYTHON}" -c 'import distutils' 2>&AS_MESSAGE_LOG_FD
+AC_MSG_CHECKING([for Python sysconfig module])
+if "${PYTHON}" -c 'import sysconfig' 2>&AS_MESSAGE_LOG_FD
 then
 AC_MSG_RESULT(yes)
 else
 AC_MSG_RESULT(no)
-AC_MSG_ERROR([distutils module not found])
+AC_MSG_ERROR([sysconfig module not found])
 fi
 
 AC_MSG_CHECKING([Python configuration directory])
-python_configdir=`${PYTHON} -c "import distutils.sysconfig; print(' '.join(filter(None,distutils.sysconfig.get_config_vars('LIBPL'))))"`
+python_configdir=`${PYTHON} -c "import sysconfig; print(' '.join(filter(None,sysconfig.get_config_vars('LIBPL'))))"`
 AC_MSG_RESULT([$python_configdir])
 
 AC_MSG_CHECKING([Python include directories])
 python_includespec=`${PYTHON} -c "
-import distutils.sysconfig
-a = '-I' + distutils.sysconfig.get_python_inc(False)
-b = '-I' + distutils.sysconfig.get_python_inc(True)
+import sysconfig
+a = '-I' + sysconfig.get_path('include')
+b = '-I' + sysconfig.get_path('platinclude')
 if a == b:
 print(a)
 else:
@@ -96,8 +96,8 @@ AC_DEFUN([PGAC_CHECK_PYTHON_EMBED_SETUP],
 [AC_REQUIRE([_PGAC_CHECK_PYTHON_DIRS])
 AC_MSG_CHECKING([how to link an embedded Python application])
 
-python_libdir=`${PYTHON} -c "import distutils.sysconfig; print(' 

Re: [PATCH] improve the pg_upgrade error message

2021-12-01 Thread Jeevan Ladhe
On Wed, Dec 1, 2021 at 3:45 PM Daniel Gustafsson  wrote:

> > On 1 Dec 2021, at 10:59, Jeevan Ladhe wrote:
>
> > Was wondering if we had any barriers to getting this committed.
>
> No barrier other than available time to, I will try to get to it shortly.
>

Great! Thank you.


> --
> Daniel Gustafsson   https://vmware.com/
>
>


Re: Skipping logical replication transactions on subscriber side

2021-12-01 Thread Amit Kapila
On Wed, Dec 1, 2021 at 11:57 AM houzj.f...@fujitsu.com wrote:
>
> On Wednesday, December 1, 2021 1:23 PM Masahiko Sawada wrote:
> > Okay, I've attached an updated patch. Please review it.
> >
>
> I agreed that checking the result only once makes the test more stable.
> The patch looks good to me.
>

Pushed.

Now, coming back to the skip_xid patch. To summarize the discussion in
that regard so far, we have discussed various alternatives for the
syntax like:

a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
b. Alter Subscription <name> SET ( subscription_parameter [=value]
[, ... ] );
c. Alter Subscription <name> On Error ( subscription_parameter
[=value] [, ... ] );
d. Alter Subscription <name> SKIP ( subscription_parameter
[=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid>
lsn = <lsn>
...

We didn't prefer (a) as it can lead to more keywords as we add more
options; (b) as we want these new skip options to behave and be set
differently than existing subscription properties because of the
difference in their behavior; (c) as that sounds more like an action
to be performed on a future condition (error/conflict) whereas here we
already knew that an error has happened;

As per discussion till now, option (d) seems preferable.  In this, we
need to see how and what to allow as options. The simplest way for the
first version is to just allow one xid to be specified at a time which
would mean that specifying multiple xids should error out. We can also
additionally allow specifying operations like 'insert', 'update',
etc., and then relation list (list of oids). What that would mean is
that for a transaction we can allow which particular operations and
relations we want to skip.

I am not sure what exactly we can provide to users to allow skipping
initial table sync as we can't specify XID there. One option that
comes to mind is to allow specifying a combination of copy_data and
relid to skip table sync for a particular relation. We might think of
not doing anything for table sync workers but not sure if that is a
good option.

Thoughts?

-- 
With Regards,
Amit Kapila.




Re: SQL/JSON: functions

2021-12-01 Thread Himanshu Upadhyaya
On Wed, Dec 1, 2021 at 7:56 PM Andrew Dunstan  wrote:

>
> On 12/1/21 06:13, Himanshu Upadhyaya wrote:
> > Hi Andrew,
> >
> > The latest version (v59) is not applying on head.
> > Could you please help to rebase?
> >
> >
>
> (Please don't top-post on PostgreSQL lists)

Sure, I will take care of that in the future.

> The patches apply for me and for the cfbot:
> . I'm not sure what's not
> working for you. I apply them using "patch -p 1 < $patchfile"

Mistakenly I was using git apply, sorry about that. It's working fine with
"patch -p 1 < $patchfile".

-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: BUG #17302: gist index prevents insertion of some data

2021-12-01 Thread Komяpa
On Thu, Dec 2, 2021 at 1:14 AM Tom Lane  wrote:

> Alexander Korotkov  writes:
> > I think losing precision in the gist penalty is generally OK.  Thus,
> > it shouldn't be a problem to round a very small value as zero.
>
> Check.
>
> > Probably, we could even tolerate overflow in the gist penalty.
>
> As long as overflow -> infinity, yeah I think so.  Seems like it
> was a mistake to insert the overflow-testing functions in this code
> at all, and we should simplify it down to plain C addition/subtraction/
> multiplication.
>

The underflow should not throw an interrupting exception ever, even on
plain SQL-level calculations.
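For concreteness, plain C float arithmetic already behaves the desired way
(a standalone sketch, not PostgreSQL code):

#include <math.h>
#include <stdio.h>

int
main(void)
{
    float big = 3e38f, tiny = 1e-38f;

    printf("%f\n", big * big);      /* prints inf: overflow, no error */
    printf("%f\n", tiny * tiny);    /* prints 0.000000: underflow, no error */
    return 0;
}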

The code implementing these checks was added in error, through a series of
misunderstandings, and gets in the way of simple things too often. I dug
into the history here:


https://www.postgresql.org/message-id/CAC8Q8t%2BXJH68WB%2BsKN0BV0uGc3ZjA2DtbQuoJ5EhB4JAcS0C%2BQ%40mail.gmail.com





>
> regards, tom lane
>
>
>


Re: pg_get_publication_tables() output duplicate relid

2021-12-01 Thread Amit Kapila
On Thu, Dec 2, 2021 at 8:42 AM Amit Langote  wrote:
>
> On Thu, Dec 2, 2021 at 9:44 houzj.f...@fujitsu.com wrote:
> > On Wed, Dec 1, 2021 3:01 PM Amit Kapila  wrote:
> > > > > On Mon, Nov 22, 2021 at 12:55 PM Amit Langote wrote:
> > > > > > On second thought, I agree that de-duplicating partitions from
> > > > > > this view is an improvement.
> > > > >
> > > > > Fair enough. Hou-San, Can you please submit the updated patch after
> > > > > fixing any pending comments including the test case?
> > > >
> > > > Attach the updated patch which address the comments so far.
> > > >
> > > > The patch only adds a testcase in publication.sql because the
> > > > duplicate output doesn't cause unexpected behavior in pub-sub test.
> > >
> > > Thanks, the patch looks good to me. I have slightly changed the commit
> > > message in the attached. I would like to commit this only in HEAD as
> > > there is no user complaint about this and it might not stop any user's
> > > service unless it relies on this view's output for the initial table
> > > synchronization.
>
> The patch looks good to me too in that it gets the job done.
>
> Reading Alvaro's email at the top again gave me a pause to reconsider
> what I had said in reply.  It might indeed have been nice if the
> publication DDL itself had prevented the cases where a partition and
> its ancestor are added to a publication, though as Hou-san mentioned,
> partition ATTACHes make that a bit tricky to enforce at all times,
> though of course not impossible.  Maybe it's okay to just de-duplicate
> pg_publication_tables output as the patch does, even though it may
> appear to be a band-aid solution if one considers Alvaro's point
> about consistency.
>

True, I think even if we consider that idea it will be a much bigger
change, and as it will be a behavioral change we might want to keep it
just for HEAD, whereas some of these fixes need to be backpatched.
Having said that, if you or someone want to pursue that idea and come
up with a better solution than what we have currently it is certainly
welcome.


> > > What do you think?
> > I agreed that we can commit only in HEAD.
>
> +1 to do this only in HEAD.
>

Okay, If there are no further suggestions or objections, I'll commit
the patch early next week (mostly by Monday).

-- 
With Regards,
Amit Kapila.




Re: GUC flags

2021-12-01 Thread Justin Pryzby
On Thu, Dec 02, 2021 at 02:11:38PM +0900, Michael Paquier wrote:
> On Wed, Dec 01, 2021 at 09:34:39PM -0600, Justin Pryzby wrote:
> >> @@ -2762,7 +2763,8 @@ static struct config_int ConfigureNamesInt[] =
> >>    {"pre_auth_delay", PGC_SIGHUP, DEVELOPER_OPTIONS,
> >> -  gettext_noop("Waits N seconds on connection startup before authentication."),
> >> +  gettext_noop("Sets the amount of time to wait on connection "
> >> +               "startup before authentication."),
> >>    gettext_noop("This allows attaching a debugger to the process."),
> > 
> > I wonder if these should say "Sets the amount of time to wait [before]
> > authentication during connection startup"
> 
> Hmm.  I don't see much a difference between both of wordings in this
> context.

I find it easier to read "wait before authentication ..." than "wait ... before
authentication".

> > BTW the EXPLAIN flag for enable_incremental_sort could be backpatched to v13.
> 
> This could cause small diffs in EXPLAIN outputs, which could be
> surprising.  This is not worth taking any risks.

Only if one specifies explain(SETTINGS).
It's fine either way ;)

-- 
Justin




Re: GUC flags

2021-12-01 Thread Michael Paquier
On Wed, Dec 01, 2021 at 09:34:39PM -0600, Justin Pryzby wrote:
>> @@ -2762,7 +2763,8 @@ static struct config_int ConfigureNamesInt[] =
>>  {"pre_auth_delay", PGC_SIGHUP, DEVELOPER_OPTIONS,
>> -gettext_noop("Waits N seconds on connection startup before authentication."),
>> +gettext_noop("Sets the amount of time to wait on connection "
>> + "startup before authentication."),
>>  gettext_noop("This allows attaching a debugger to the process."),
> 
> I wonder if these should say "Sets the amount of time to wait [before]
> authentication during connection startup"

Hmm.  I don't see much a difference between both of wordings in this
context.

>>  gettext_noop("Write a message to the server log if 
>> checkpoints "
>> - "caused by the filling of 
>> checkpoint segment files happens more "
>> + "caused by the filling of WAL 
>> segment files happen more "
>>   "frequently than this number 
>> of seconds. Zero turns off the warning."),
> 
> Should this still say "seconds" ?
> Or change it to "this amount of time"?
> I'm not sure.

Either way would be fine by me, though I'd agree to be consistent and
use "this amount of time" here.

>>  {"log_rotation_size", PGC_SIGHUP, LOGGING_WHERE,
>> -gettext_noop("Automatic log file rotation will occur 
>> after N kilobytes."),
>> +gettext_noop("Sets the maximum size of log file to 
>> reach before "
>> + "forcing log file rotation."),
> 
> Actually, I think that for log_rotation_size, it should not say "forcing".
> 
> "Sets the maximum size a log file can reach before being rotated"

Okay.  Fine by me.

> BTW the EXPLAIN flag for enable_incremental_sort could be backpatched to v13.

This could cause small diffs in EXPLAIN outputs, which could be
surprising.  This is not worth taking any risks.
--
Michael




Re: Is ssl_crl_file "SSL server cert revocation list"?

2021-12-01 Thread Kyotaro Horiguchi
At Thu, 02 Dec 2021 13:54:41 +0900 (JST), Kyotaro Horiguchi wrote in
> As discussed in the thread [1], I find the wording "SSL server
> certificate revocation list" as misleading or plain wrong.

FWIW, I'm convinced that that's plain wrong after finding some
occurances of "(SSL) client certificate" in the doc.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: More time spending with "delete pending"

2021-12-01 Thread Michael Paquier
On Wed, Dec 01, 2021 at 11:51:58AM +0100, Daniel Gustafsson wrote:
> Michael, have you had a chance to look at the updated patch here?  I'm moving
> this patch entry from Waiting on Author to Needs Review.

No, I haven't.  Before being touched more, this requires first a bit
more of testing infrastructure based on MinGW, and I don't have that
in place yet :/
--
Michael




Re: pg_waldump stucks with options --follow or -f and --stats or -z

2021-12-01 Thread Michael Paquier
On Wed, Dec 01, 2021 at 11:44:02AM +0530, Bharath Rupireddy wrote:
> Here's the v6 patch that has the SIGINT handling enabled for non-WIN32
> platforms (note that I haven't specified anything in the
> documentation) much like pg_receivewal and pg_recvlogical.

I have added a safeguard in XLogDumpDisplayStats() to not print any
stats if the end LSN is not set yet, which would be possible if
pg_waldump is signaled at a very early stage, and I have added some
more comments.  That looked fine after that, so applied.
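For illustration, such a guard is essentially an early return when no end
LSN was ever recorded (a sketch only; the actual function and field names
in pg_waldump may differ):

static void
XLogDumpDisplayStats_sketch(XLogDumpStats *stats)
{
    /* Signaled before any record was read: nothing meaningful to print. */
    if (XLogRecPtrIsInvalid(stats->endptr))
        return;

    /* ... print per-record and per-rmgr statistics ... */
}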
--
Michael




Is ssl_crl_file "SSL server cert revocation list"?

2021-12-01 Thread Kyotaro Horiguchi
As discussed in the thread [1], I find the wording "SSL server
certificate revocation list" as misleading or plain wrong.

I used to read it as "SSL server certificate (of PostgreSQL client)
revocation list" but I find it misleading-ish from fresh eyes. So I'd
like to propose a change of the doc as attached.

What do you think about this?

[1] 
https://www.postgresql.org/message-id/20211202.134619.1052008069537649171.horikyota.ntt%40gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ab617c7b86..4ac617615c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1248,7 +1248,7 @@ include_dir 'conf.d'
   
   

-Specifies the name of the file containing the SSL server certificate
+Specifies the name of the file containing the SSL client certificate
 revocation list (CRL).
 Relative paths are relative to the data directory.
 This parameter can only be set in the postgresql.conf
@@ -1267,7 +1267,7 @@ include_dir 'conf.d'
   
   

-Specifies the name of the directory containing the SSL server
+Specifies the name of the directory containing the SSL client
 certificate revocation list (CRL).  Relative paths are relative to the
 data directory.  This parameter can only be set in
 the postgresql.conf file or on the server command
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c17d33a54f..eb3a0c6b55 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1742,11 +1742,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   sslcrl
   

-This parameter specifies the file name of the SSL certificate
+This parameter specifies the file name of the SSL server certificate
 revocation list (CRL).  Certificates listed in this file, if it
-exists, will be rejected while attempting to authenticate the
-server's certificate.  If neither
- nor
+exists, will be rejected while attempting to authenticate the server's
+certificate.  If neither  nor
  is set, this setting is
 taken as
 ~/.postgresql/root.crl.
@@ -1758,9 +1757,9 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   sslcrldir
   

-This parameter specifies the directory name of the SSL certificate
-revocation list (CRL).  Certificates listed in the files in this
-directory, if it exists, will be rejected while attempting to
+This parameter specifies the directory name of the SSL server
+certificate revocation list (CRL).  Certificates listed in the files
+in this directory, if it exists, will be rejected while attempting to
 authenticate the server's certificate.

 


Re: Data is copied twice when specifying both child and parent table in publication

2021-12-01 Thread Amit Kapila
On Thu, Dec 2, 2021 at 9:41 AM Greg Nancarrow  wrote:
>
> On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow  wrote:
> >
> > If you updated my original description to say "(instead of just the
> > individual partitions)", it would imply the same I think.
> > But I don't mind if you want to explicitly state both cases to make it 
> > clear.
> >
>
> For example, something like:
>
> For publications of partitioned tables with publish_via_partition_root
> set to true, only the partitioned table (and not its partitions) is
> included in the view, whereas if publish_via_partition_root is set to
> false, only the individual partitions are included in the view.
>

Yeah, that sounds good to me.

-- 
With Regards,
Amit Kapila.




Re: Optionally automatically disable logical replication subscriptions on error

2021-12-01 Thread Amit Kapila
On Thu, Dec 2, 2021 at 6:35 AM osumi.takami...@fujitsu.com wrote:
>
> On Wednesday, December 1, 2021 10:16 PM Amit Kapila wrote:
> Updated the patch to include the notification.
>

The patch disables the subscription for non-transient errors. I am not
sure if we can easily make the call to decide whether any particular
error is transient or not. For example, DISK_FULL or OUT_OF_MEMORY
might not rectify itself. Why not just allow to disable the
subscription on any error? And then let the user check the error
either in view or logs and decide whether it would like to enable the
subscription or do something before it (like making space in disk, or
fixing the network).

The other problem I see with this transient error stuff is maintaining
the list of error codes that we think are transient. I think we need a
discussion for each of the error_codes we are listing now and whatever
new error_code we add in the future which doesn't seem like a good
idea.

I think the code to deal with apply worker errors and then disable the
subscription has some flaws. Say, while disabling the subscription if
it leads to another error then I think the original error won't be
reported.  Can't we simply emit the error via EmitErrorReport and then
do AbortOutOfAnyTransaction, FlushErrorState, and any other memory
context clean up if required and then disable the subscription after
coming out of catch?
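
To illustrate, the flow could look roughly like this (a sketch only;
DisableSubscription() and the disableonerr flag are hypothetical names,
and the real cleanup may need more than what is shown):

    bool        error_occurred = false;

    PG_TRY();
    {
        LogicalRepApplyLoop(origin_startpos);   /* normal apply path */
    }
    PG_CATCH();
    {
        /* Report the original error before anything can override it. */
        EmitErrorReport();
        AbortOutOfAnyTransaction();
        FlushErrorState();
        error_occurred = true;
    }
    PG_END_TRY();

    /* Outside the catch block, so a failure here can't mask the first error. */
    if (error_occurred && MySubscription->disableonerr)
        DisableSubscription(MySubscription->oid);   /* hypothetical helper */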

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2021-12-01 Thread Ajin Cherian
On Wed, Dec 1, 2021 at 3:27 AM vignesh C  wrote:
>
> Here we will not be able to do a direct comparison as we store the
> transformed where clause in the pg_publication_rel table. We will have
> to transform the where clause and then check. I have attached a patch
> where we can check the transformed where clause and see if the where
> clause is the same or not. If you are ok with this approach you could
> make similar changes.

thanks for your patch, I have used the same logic with minor changes
and shared it with Peter for v44.

regards,
Ajin Cherian
Fujitsu Australia




Re: Data is copied twice when specifying both child and parent table in publication

2021-12-01 Thread Greg Nancarrow
On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow  wrote:
>
> If you updated my original description to say "(instead of just the
> individual partitions)", it would imply the same I think.
> But I don't mind if you want to explicitly state both cases to make it clear.
>

For example, something like:

For publications of partitioned tables with publish_via_partition_root
set to true, only the partitioned table (and not its partitions) is
included in the view, whereas if publish_via_partition_root is set to
false, only the individual partitions are included in the view.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-01 Thread Justin Pryzby
On Wed, Dec 01, 2021 at 04:59:44PM -0500, Melanie Plageman wrote:
> Thanks for the review!
> 
> On Wed, Nov 24, 2021 at 8:16 PM Justin Pryzby  wrote:
> > You wrote beentry++ at the start of two loops, but I think that's wrong; it
> > should be at the end, as in the rest of the file (or as a loop increment).
> > BackendStatusArray[0] is actually used (even though its backend has
> > backendId==1, not 0).  "MyBEEntry = &BackendStatusArray[MyBackendId - 1];"
> 
> I've fixed this in v16 which I will attach to the next email in the thread.

I just noticed that since beentry++ is now at the end of the loop, it's being
missed when you "continue":

+   if (beentry->st_procpid == 0)
+   continue;
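
A loop shape that avoids this is to make the increment part of the
for-header, so "continue" cannot skip it (a sketch; BackendStatusArray and
NumBackendStatSlots are the real names, the loop body is illustrative):

    PgBackendStatus *beentry = BackendStatusArray;
    int              i;

    for (i = 0; i < NumBackendStatSlots; i++, beentry++)
    {
        if (beentry->st_procpid == 0)
            continue;           /* safe: increment is in the for-header */

        /* ... accumulate I/O stats for this live backend ... */
    }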

Also, I saw that pgindent messed up and added spaces after pointers in function
declarations, due to new typedefs not in typedefs.list:

-pgstat_send_buffers_reset(PgStat_MsgResetsharedcounter *msg)
+pgstat_send_buffers_reset(PgStat_MsgResetsharedcounter * msg)

-static inline void pg_atomic_inc_counter(pg_atomic_uint64 *counter)
+static inline void
+pg_atomic_inc_counter(pg_atomic_uint64 * counter)

-- 
Justin




Re: Can I assume relation would not be invalid during from ExecutorRun to ExecutorEnd

2021-12-01 Thread Andy Fan
> If the relation is open,
> the relcache entry can't be destroyed altogether, but it can be
> rebuilt: see RelationClearRelation().


Thanks! This is a new amazing knowledge for me!


> They are in fact stable in a certain sense - if we have the relation open

we hold a reference count on it, and so the Relation pointer itself will

remain valid.


This sounds amazing as well.


> But
> the data it points to can change in various ways, and different
> members of the RelationData struct are handled differently. Separately
> from the reference count, the heavyweight lock that we also hold on
> the relation as a condition of opening it prevents certain kinds of
> changes, so that even if the relation cache entry is rebuilt, certain
> particular fields will be unaffected. Which fields are protected in
> this way will depend on what kind of lock is held. It's hard to speak
> in general terms.


Amazing++;


> The best advice I can give you is (1) look exactly
> what RelationClearRelation() is going to do to the fields you care
> about if a rebuild happens, (2) err on the side of assuming that
> things can change under you, and (3) try running your code under
> debug_discard_caches = 1. It will be slow that way, but it's pretty
> effective in finding places where you've made unsafe assumptions.
>
>
Thanks! I clearly understand what's wrong in my previous knowledge.
That is, after a relation is open with some lock, then the content of the
relation
will never change until the RelationClose.  It would take time to fill the
gap, but I'd like to say "thank you!" first.
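
For my own notes, the rule can be summarized like this (a sketch using the
real table_open/table_close API; the comments carry the point):

#include "access/table.h"

static void
example(Oid relid)
{
    Relation    rel = table_open(relid, AccessShareLock);

    /*
     * rel itself stays valid here: the reference count pins the entry.
     * But RelationClearRelation() may still rebuild its contents, so only
     * the fields protected by the lock level we hold are stable.
     */

    table_close(rel, AccessShareLock);
}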

-- 
Best Regards
Andy Fan


Re: GUC flags

2021-12-01 Thread Justin Pryzby
> @@ -2142,7 +2142,8 @@ static struct config_int ConfigureNamesInt[] =
>   {"post_auth_delay", PGC_BACKEND, DEVELOPER_OPTIONS,
> - gettext_noop("Waits N seconds on connection startup after authentication."),
> + gettext_noop("Sets the amount of time to wait on connection "
> +  "startup after authentication."),

> @@ -2762,7 +2763,8 @@ static struct config_int ConfigureNamesInt[] =
>   {"pre_auth_delay", PGC_SIGHUP, DEVELOPER_OPTIONS,
> - gettext_noop("Waits N seconds on connection startup before authentication."),
> + gettext_noop("Sets the amount of time to wait on connection "
> +  "startup before authentication."),
>   gettext_noop("This allows attaching a debugger to the process."),

I wonder if these should say "Sets the amount of time to wait [before]
authentication during connection startup"

>   {"checkpoint_warning", PGC_SIGHUP, WAL_CHECKPOINTS,
> - gettext_noop("Enables warnings if checkpoint segments 
> are filled more "
> -  "frequently than this."),
> + gettext_noop("Sets the maximum time before warning if 
> checkpoints "
> +  "triggered by WAL volume 
> happen too frequently."),
>   gettext_noop("Write a message to the server log if 
> checkpoints "
> -  "caused by the filling of 
> checkpoint segment files happens more "
> +  "caused by the filling of WAL 
> segment files happen more "
>"frequently than this number 
> of seconds. Zero turns off the warning."),

Should this still say "seconds" ?
Or change it to "this amount of time"?
I'm not sure.

>   {"log_rotation_age", PGC_SIGHUP, LOGGING_WHERE,
> - gettext_noop("Automatic log file rotation will occur 
> after N minutes."),
> + gettext_noop("Sets the amount of time to wait before 
> forcing "
> +  "log file rotation."),
>   NULL,
>   GUC_UNIT_MIN
>   },
> @@ -3154,7 +3159,8 @@ static struct config_int ConfigureNamesInt[] =
>  
>   {
>   {"log_rotation_size", PGC_SIGHUP, LOGGING_WHERE,
> - gettext_noop("Automatic log file rotation will occur 
> after N kilobytes."),
> + gettext_noop("Sets the maximum size of log file to 
> reach before "
> +  "forcing log file rotation."),

Actually, I think that for log_rotation_size, it should not say "forcing".

"Sets the maximum size a log file can reach before being rotated"

BTW the EXPLAIN flag for enable_incremental_sort could be backpatched to v13.

-- 
Justin




RE: row filtering for logical replication

2021-12-01 Thread tanghy.f...@fujitsu.com
On Thursday, December 2, 2021 5:21 AM Peter Smith  wrote:
> 
> PSA the v44* set of patches.
> 

Thanks for the new patch. Few comments:

1. This is an example in publication doc, but in fact it's not allowed.
Should we change this example?

+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);

postgres=# CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
ERROR:  invalid publication WHERE expression for relation "departments"
HINT:  only simple expressions using columns, constants and immutable system functions are allowed

2. A typo in 0002 patch.

+ * drops such a user-defnition or if there is any other error via its function,

"user-defnition" should be "user-definition".

Regards,
Tang


Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-01 Thread Justin Pryzby
On Wed, Dec 01, 2021 at 04:59:44PM -0500, Melanie Plageman wrote:
> Thanks for the review!
> 
> On Wed, Nov 24, 2021 at 8:16 PM Justin Pryzby  wrote:
> > You wrote beentry++ at the start of two loops, but I think that's wrong; it
> > should be at the end, as in the rest of the file (or as a loop increment).
> > BackendStatusArray[0] is actually used (even though its backend has
> > backendId==1, not 0).  "MyBEEntry = &BackendStatusArray[MyBackendId - 1];"
> 
> I've fixed this in v16 which I will attach to the next email in the thread.
> 
> > You could put *_NUM_TYPES as the last value in these enums, like
> > NUM_AUXPROCTYPES, NUM_PMSIGNALS, and NUM_PROCSIGNALS:
> >
> > +#define IOOP_NUM_TYPES (IOOP_WRITE + 1)
> > +#define IOPATH_NUM_TYPES (IOPATH_STRATEGY + 1)
> > +#define BACKEND_NUM_TYPES (B_LOGGER + 1)
> 
> I originally had it as you describe, but based on this feedback upthread
> from Álvaro Herrera:

I saw that after I made my suggestion.  Sorry for the noise.
Both ways already exist in postgres and seem to be acceptable.

> > There's extraneous blank lines in these functions:
> > +pgstat_recv_resetsharedcounter
> I didn't see one here

=> The extra blank line is after the RESET_BUFFERS memset.

> +* Reset the global, bgwriter and checkpointer statistics for the
> +* cluster.

The first comma in this comment was introduced in 1bc8e7b09, and seems to be
extraneous, since bgwriter and checkpointer are both global.  With the comma,
it looks like it should be memsetting 3 things.

> +   /* Don't count dead backends. They should already be counted */

Maybe this comment should say ".. they'll be added below"

> +   row[COLUMN_BACKEND_TYPE] = backend_type_desc;
> +   row[COLUMN_IO_PATH] = CStringGetTextDatum(GetIOPathDesc(io_path));
> +   row[COLUMN_ALLOCS] += io_ops->allocs - resets->allocs;
> +   row[COLUMN_EXTENDS] += io_ops->extends - resets->extends;
> +   row[COLUMN_FSYNCS] += io_ops->fsyncs - resets->fsyncs;
> +   row[COLUMN_WRITES] += io_ops->writes - resets->writes;
> +   row[COLUMN_RESET_TIME] = reset_time;

It'd be clearer if RESET_TIME were set adjacent to BACKEND_TYPE and IO_PATH.

> > Rather than memset(), you could initialize msg like this.
> > PgStat_MsgIOPathOps msg = {0};
>
> though changing the initialization to universal zero initialization
> seems to be the correct way, I do get this compiler warning when I make
> the change
> 
> pgstat.c:3212:29: warning: suggest braces around initialization of subobject [-Wmissing-braces]
> 
> I have seem some comments online that say that this is a spurious
> warning present with some versions of both gcc and clang when using
> -Wmissing-braces to compile code with universal zero initialization, but
> I'm not sure what I should do.

I think gcc is suggesting to write something like {{0}}, and I think the online
commentary you found is saying that the warning is a false positive.
So I think you should ignore my suggestion - it's not worth the bother.
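
For reference, the two spellings look like this (a sketch; the struct name
stands in for any struct whose first member is itself an aggregate):

    PgStat_MsgIOPathOps msg1 = {0};     /* may trip -Wmissing-braces on
                                         * some gcc/clang versions */
    PgStat_MsgIOPathOps msg2 = {{0}};   /* braces the first subobject:
                                         * same zero-init, no warning */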

This message needs to be updated:
errhint("Target must be \"archiver\", \"bgwriter\", or \"wal\".")))

When I query the view, I see reset times as: 1999-12-31 18:00:00-06.
I guess it should be initialized like this one:
globalStats.bgwriter.stat_reset_timestamp = ts

The cfbot shows failures now (I thought it was passing with the previous patch,
but I suppose I'm wrong.)

It looks like running recovery during single user mode hits this assertion.
TRAP: FailedAssertion("beentry", File: "../../../../src/include/utils/backend_status.h", Line: 359, PID: 3499)

-- 
Justin




Re: Data is copied twice when specifying both child and parent table in publication

2021-12-01 Thread Amit Langote
On Wed, Dec 1, 2021 at 8:15 PM Amit Kapila  wrote:
> On Mon, Nov 29, 2021 at 2:21 PM houzj.f...@fujitsu.com wrote:
> >
> > On Thursday, November 11, 2021 2:53 PM houzj.f...@fujitsu.com wrote:
> > > Attach the fix patch.
> > > 0001 fix data double publish(first issue in this thread)
> >
> > In another thread[1], Amit L suggested that it'd be nice to add a
> > testcase in src/test/subscription/. So, attach a new version patch
> > which add a testcase in t/013_partition.pl.
> >
>
> Thanks, your patch looks good to me. I have slightly changed the
> comments and commit message in the attached.

Patch looks good to me too.  I confirmed that the newly added
subscription test fails with HEAD.

> I think we should back-patch this but I am slightly worried that if
> someone is dependent on the view pg_publication_tables to return both
> parent and child tables for publications that have both of those
> tables and published with publish_via_partition_root as true then this
> might break his usage. But OTOH, I don't see why someone would do like
> that and she might face some problems like what we are trying to solve
> here.

Yeah, back-patching may not be such a bad idea.

Thank you.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: pg_get_publication_tables() output duplicate relid

2021-12-01 Thread Amit Langote
On Thu, Dec 2, 2021 at 9:44 houzj.f...@fujitsu.com wrote:
> On Wed, Dec 1, 2021 3:01 PM Amit Kapila  wrote:
> > > > On Mon, Nov 22, 2021 at 12:55 PM Amit Langote wrote:
> > > > > On second thought, I agree that de-duplicating partitions from
> > > > > this view is an improvement.
> > > >
> > > > Fair enough. Hou-San, Can you please submit the updated patch after
> > > > fixing any pending comments including the test case?
> > >
> > > Attach the updated patch which address the comments so far.
> > >
> > > The patch only adds a testcase in publication.sql because the
> > > duplicate output doesn't cause unexpected behavior in pub-sub test.
> >
> > Thanks, the patch looks good to me. I have slightly changed the commit
> > message in the attached. I would like to commit this only in HEAD as
> > there is no user complaint about this and it might not stop any user's
> > service unless it relies on this view's output for the initial table
> > synchronization.

The patch looks good to me too in that it gets the job done.

Reading Alvaro's email at the top again gave me a pause to reconsider
what I had said in reply.  It might indeed have been nice if the
publication DDL itself had prevented the cases where a partition and
its ancestor are added to a publication, though as Hou-san mentioned,
partition ATTACHes make that a bit tricky to enforce at all times,
though of course not impossible.  Maybe it's okay to just de-duplicate
pg_publication_tables output as the patch does, even though it may
appear to be a band-aid solution if one considers Alvaro's point
about consistency.  Sorry about too many second thoughts on these
threads. :(

> > What do you think?
> I agreed that we can commit only in HEAD.

+1 to do this only in HEAD.

Thank you.


--
Amit Langote
EDB: http://www.enterprisedb.com




Re: pg_replslotdata - a tool for displaying replication slot information

2021-12-01 Thread Bharath Rupireddy
On Thu, Dec 2, 2021 at 4:22 AM Andres Freund  wrote:
>
> Hi,
>
> > On 2021-11-30 18:43:23 +0000, Bossart, Nathan wrote:
> > > On 11/30/21, 6:14 AM, "Peter Eisentraut" wrote:
> > > On 23.11.21 06:09, Bharath Rupireddy wrote:
> > >> The replication slots data is stored in binary format on the disk under
> > >> the pg_replslot/<slot_name> directory which isn't human readable. If
> > >> the server is crashed/down (for whatever reasons) and unable to come up,
> > >> currently there's no way for the user/admin/developer to know what were
> > >> all the replication slots available at the time of server crash/down to
> > >> figure out what's the restart lsn, xid, two phase info or types of slots
> > >> etc.
> > >
> > > What do you need that for?  You can't do anything with a replication
> > > slot while the server is down.
>
> Yes, I don't think there's sufficient need for this.

Thanks. The idea of pg_replslotdata emanated from real-world experience
with customer issues and answering some of their questions. Given that
replication slots are used on almost every major production server, and
that they are likely to cause problems, a core tool like pg_replslotdata
that can interpret the replication slot info without contacting the
server will definitely be useful for all the other postgres vendors out
there. Having such a tool in core can avoid duplicated effort.

> > I don't have any other compelling use-cases at the moment, but I will say
> > that it is typically nice from an administrative standpoint to be able to
> > inspect things like this without logging into a running server.
>
> Weighed against the cost of maintaining (including documentation) a separate
> tool this doesn't seem sufficient reason.

IMHO, this shouldn't be a reason not to get something useful (useful to
me and a few others in this thread) into the core. The maintenance burden
of such tools is generally low compared to core server features once
they get reviewed and committed.

Having said that, other hackers may have better thoughts.

Regards,
Bharath Rupireddy.




Re: Data is copied twice when specifying both child and parent table in publication

2021-12-01 Thread Greg Nancarrow
On Thu, Dec 2, 2021 at 1:33 PM Amit Kapila  wrote:
>
> > For publications of partitioned tables with publish_via_partition_root
> > set to true, the partitioned table itself (rather than the individual
> > partitions) is included in the view.
> >
>
> Okay, but I think it is better to add the behavior both when
> publish_via_partition_root is set to true and false. As in the case of
> false, it won't include the partitioned table itself.
>

If you updated my original description to say "(instead of just the
individual partitions)", it would imply the same I think.
But I don't mind if you want to explicitly state both cases to make it clear.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-01 Thread Bharath Rupireddy
On Thu, Dec 2, 2021 at 1:54 AM Bossart, Nathan  wrote:
>
> Hi hackers,
>
> Thanks to 61752af, SyncDataDirectory() can make use of syncfs() to
> avoid individually syncing all database files after a crash.  However,
> as noted earlier this year [0], there are still a number of O(n) tasks
> that affect startup and checkpointing that I'd like to improve.
> Below, I've attempted to summarize each task and to offer ideas for
> improving matters.  I'll likely split each of these into its own
> thread, given there is community interest for such changes.
>
> 1) CheckPointSnapBuild(): This function loops through
>pg_logical/snapshots to remove all snapshots that are no longer
>needed.  If there are many entries in this directory, this can take
>a long time.  The note above this function indicates that this is
>done during checkpoints simply because it is convenient.  IIUC
>there is no requirement that this function actually completes for a
>given checkpoint.  My current idea is to move this to a new
>maintenance worker.
> 2) CheckPointLogicalRewriteHeap(): This function loops through
>pg_logical/mappings to remove old mappings and flush all remaining
>ones.  IIUC there is no requirement that the "remove old mappings"
>part must complete for a given checkpoint, but the "flush all
>remaining" portion allows replay after a checkpoint to only "deal
>with the parts of a mapping that have been written out after the
>checkpoint started."  Therefore, I think we should move the "remove
>old mappings" part to a new maintenance worker (probably the same
>one as for 1), and we should consider using syncfs() for the "flush
>all remaining" part.  (I suspect the main argument against the
>latter will be that it could cause IO spikes.)
> 3) RemovePgTempFiles(): This step can delay startup if there are many
>temporary files to individually remove.  This step is already
>optionally done after a crash via the remove_temp_files_after_crash
>GUC.  I propose that we have startup move the temporary file
>directories aside and create new ones, and then a separate worker
>(probably the same one from 1 and 2) could clean up the old files.
> 4) StartupReorderBuffer(): This step deletes logical slot data that
>has been spilled to disk.  This code appears to be written to avoid
>deleting different types of files in these directories, but AFAICT
>there shouldn't be any other files.  Therefore, I think we could do
>something similar to 3 (i.e., move the directories aside during
>startup and clean them up via a new maintenance worker).
>
> I realize adding a new maintenance worker might be a bit heavy-handed,
> but I think it would be nice to have somewhere to offload tasks that
> really shouldn't impact startup and checkpointing.  I imagine such a
> process would come in handy down the road, too.  WDYT?

+1 for the overall idea of making the checkpoint faster. In fact, we
here at our team have been thinking about this problem for a while. If
there are a lot of files that checkpoint has to loop over and remove,
IMO, that task can be delegated to someone else (maybe a background
worker called background cleaner or bg cleaner, of course, we can have
a GUC to enable or disable it). The checkpoint can just write some
marker files (for instance, it can write snapshot_<cutoff_lsn> files
with the file name itself representing the cutoff lsn so that the new bg
cleaner can remove the snapshot files, similarly it can write marker
files for other file removals). Having said that, a new bg cleaner
deleting the files asynchronously on behalf of checkpoint can look an
overkill until we have some numbers that we could save with this
approach. For this purpose, I did a small experiment to figure out how
long file deletion usually takes [1] on an SSD: for 1 million files,
about 8 seconds; I'm sure it will be much more on an HDD.

The bg cleaner can also be used for RemovePgTempFiles, probably the
postmaster just renaming the pgsql_temp to something
pgsql_temp_delete, then proceeding with the server startup, the bg
cleaner can then delete the files.
Also, we could do something similar for removing/recycling old xlog
files and StartupReorderBuffer.
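
A rough sketch of that rename-aside idea (directory names are illustrative
and error handling is elided; rename() is atomic within a filesystem, so
startup pays one cheap call instead of unlinking every file):

#include <stdio.h>
#include <sys/stat.h>

static void
move_temp_dir_aside(void)
{
    if (rename("base/pgsql_tmp", "base/pgsql_tmp_delete") == 0)
        (void) mkdir("base/pgsql_tmp", S_IRWXU);    /* fresh, empty dir */

    /* a bg cleaner later removes base/pgsql_tmp_delete recursively */
}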

Another idea could be to parallelize the checkpoint, i.e. IIUC, the
tasks that the checkpoint does in CheckPointGuts are independent, and if
we have some counters (like how many snapshot/mapping files the server
generated), those tasks could be split across workers.

[1] on SSD:
deletion of 1000000 files took 7.930380 seconds
deletion of 500000 files took 3.921676 seconds
deletion of 100000 files took 0.768772 seconds
deletion of 50000 files took 0.400623 seconds
deletion of 10000 files took 0.077565 seconds
deletion of 1000 files took 0.006232 seconds

Regards,
Bharath Rupireddy.




Re: Optionally automatically disable logical replication subscriptions on error

2021-12-01 Thread Greg Nancarrow
On Thu, Dec 2, 2021 at 12:05 PM osumi.takami...@fujitsu.com wrote:
>
> Updated the patch to include the notification.
>

For the catalogs.sgml update, I was thinking that the following
wording might sound a bit better:

+   If true, the subscription will be disabled when a subscription
+   worker detects non-transient errors (e.g. duplication error)
+   that require user intervention to resolve.

What do you think?

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Data is copied twice when specifying both child and parent table in publication

2021-12-01 Thread Amit Kapila
On Thu, Dec 2, 2021 at 4:51 AM Greg Nancarrow  wrote:
>
> On Wed, Dec 1, 2021 at 10:15 PM Amit Kapila  wrote:
> >
>
> Also, perhaps the following additional comment (or similar) could be
> added to the pg_publication_tables documentation in catalogs.sgml:
>
> For publications of partitioned tables with publish_via_partition_root
> set to true, the partitioned table itself (rather than the individual
> partitions) is included in the view.
>

Okay, but I think it is better to add the behavior both when
publish_via_partition_root is set to true and false. As in the case of
false, it won't include the partitioned table itself.

-- 
With Regards,
Amit Kapila.




Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-01 Thread Euler Taveira
On Wed, Dec 1, 2021, at 9:19 PM, Bossart, Nathan wrote:
> On 12/1/21, 2:56 PM, "Andres Freund"  wrote:
> > On 2021-12-01 20:24:25 +0000, Bossart, Nathan wrote:
> >> I realize adding a new maintenance worker might be a bit heavy-handed,
> >> but I think it would be nice to have somewhere to offload tasks that
> >> really shouldn't impact startup and checkpointing.  I imagine such a
> >> process would come in handy down the road, too.  WDYT?
> >
> > -1. I think the overhead of an additional worker is disproportional here.
> > And there's simplicity benefits in having a predictable cleanup interlock
> > as well.
> 
> Another idea I had was to put some upper limit on how much time is
> spent on such tasks.  For example, a checkpoint would only spend X
> minutes on CheckPointSnapBuild() before giving up until the next one.
> I think the main downside of that approach is that it could lead to
> unbounded growth, so perhaps we would limit (or even skip) such tasks
> only for end-of-recovery and shutdown checkpoints.  Perhaps the
> startup tasks could be limited in a similar fashion.
Saying that a certain task is O(n) doesn't mean it needs a separate process to
handle it. Did you have a use case or even better numbers (% of checkpoint /
startup time) that makes your proposal worthwhile?

I would try to optimize (1) and (2). However, delayed removal can be a
long-term issue if the new routine cannot keep up with the pace of file
creation (specially if the checkpoints are far apart).

For (3), there is already a GUC that would avoid the slowdown during startup.
Use it if you think the startup time is more important than disk space occupied
by useless files.

For (4), you are forgetting that the on-disk state of replication slots is
stored in the pg_replslot/SLOTNAME/state. It seems you cannot just rename the
replication slot directory and copy the state file. What happens if there is a
crash before copying the state file?

While we are talking about items (1), (2) and (4), we could probably have an
option to create some ephemeral logical decoding files into ramdisk (similar to
statistics directory). I wouldn't like to hijack this thread but this proposal
could alleviate the possible issues that you pointed out. If people are
interested in this proposal, I can start a new thread about it.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: pg_upgrade test for binary compatibility of core data types

2021-12-01 Thread Michael Paquier
On Wed, Dec 01, 2021 at 04:19:44PM +0900, Michael Paquier wrote:
> I'll get that done down to 10 to maximize its influence, then I'll
> move on with the buildfarm code and send a patch to plug this and
> reduce the dependencies between core and the buildfarm code.

Okay, I have checked this one this morning, and applied the split down
to 10, so that we have a way to fix objects from the main regression
test suite.  The buildfarm client gets a bit cleaned up after that (I
have a patch for that, but I am not 100% sure that it is right).

Still, the global picture is larger than that because there is still
nothing done for contrib/ modules included in cross-version checks of
pg_upgrade by the buildfarm.  The core code tests don't do this much,
but if we were to do the same things as the buildfarm, then we would
need to run installcheck-world (roughly) on a deployed instance, then
pg_upgrade it.  That's not going to be cheap, for sure.

One thing that we could do is to use unique names for the databases of
the contrib/ modules when running an installcheck, so as these are
preserved for upgrades (the buildfarm client does that).  This has as
effect to increase the number of databases for an instance
installcheck'ed, so this had better be optional, at least.
--
Michael




RE: Optionally automatically disable logical replication subscriptions on error

2021-12-01 Thread osumi.takami...@fujitsu.com
On Wednesday, December 1, 2021 10:16 PM Amit Kapila wrote:
> On Wed, Dec 1, 2021 at 5:55 PM osumi.takami...@fujitsu.com wrote:
> >
> > On Wednesday, December 1, 2021 3:02 PM vignesh C wrote:
> > > On Tue, Nov 30, 2021 at 5:34 PM osumi.takami...@fujitsu.com wrote:
> >
> > > 3) Can add a line in the commit message saying "Bump catalog version."
> > > as the patch involves changing the catalog.
> > Hmm, let me postpone this fix till the final version.
> > The catalog version gets easily updated by other patch commits and
> > including it in the middle of development can become cause of
> > conflicts of my patch when applied to the PG, which is possible to
> > make other reviewers stop reviewing.
> >
> 
> Vignesh seems to be suggesting just changing the commit message, not the
> actual code. This is sort of a reminder to the committer to change the
> catversion before pushing the patch. So that shouldn't cause any conflicts
> while applying your patch.
Ah, sorry for my misunderstanding.
Updated the patch to include the notification.


Best Regards,
Takamichi Osumi



v10-0001-Optionally-disable-subscriptions-on-error.patch
Description: v10-0001-Optionally-disable-subscriptions-on-error.patch


Re: BUFFERS enabled by default in EXPLAIN (ANALYZE)

2021-12-01 Thread Justin Pryzby
On Mon, Nov 15, 2021 at 01:09:54PM -0600, Justin Pryzby wrote:
> Some time ago, I had a few relevant patches:
> 1) add explain(REGRESS) which is shorthand for (BUFFERS OFF, TIMING OFF, 
> COSTS OFF, SUMMARY OFF)
> 2) add explain(MACHINE) which elides machine-specific output from explain;
>for example, Heap Fetches, sort spaceUsed, hash nbuckets, and tidbitmap 
> stuff.
> 
> https://www.postgresql.org/message-id/flat/20200306213310.gm...@telsasoft.com

The attached patch series now looks like this (some minor patches are not
included in this list):

 1. add GUC explain_regress, which disables TIMING, SUMMARY, COSTS;
 2. enable explain(BUFFERS) by default (but disabled by explain_regress);
 3. Add explain(MACHINE) - which is disabled by explain_regress.
This elides various machine-specific output like Memory and Disk use.
Maybe it should be called something else like "QUIET" or "VERBOSE_MINUS_1"
or ??

The regression tests now automatically run with explain_regress=on, which is
shorthand for TIMING OFF, SUMMARY OFF, COSTS OFF, and then BUFFERS OFF.

There's a further option called explain(MACHINE) which can be disabled to hide
portions of the output that are unstable, like Memory/Disk/Batches/
Heap Fetches/Heap Blocks.  This allows "explain analyze" to be used more easily
in regression tests, and simplifies some existing tests that currently use
plpgsql functions to filter the output.  But it doesn't handle all the
variations from parallel workers.

(3) is optional, but simplifies some regression tests.  The patch series could
be rephrased with (3) first.

Unfortunately, "COSTS OFF" breaks postgres_fdw remote_estimate.  If specifying
"COSTS ON" in postgres_fdw.c is considered to be a poor fix , then I suppose
this patch series could do as suggested and enable buffers by default only when
ANALYZE is specified.  Then postgres_fdw is not affected, and the
explain_regress GUC is optional: instead, we'd need to specify BUFFERS OFF in
~100 regression tests which use EXPLAIN ANALYZE.  (3) still seems useful on its
own.

From 6804bc6945f7c7e0d51beb1930b80cf2ee0e3f82 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 22 Feb 2020 21:17:10 -0600
Subject: [PATCH 01/11] Add GUC: explain_regress

This changes the defaults for explain to: costs off, timing off, summary off.
It'd be reasonable to use this for new regression tests which are not intended
to be backpatched.
---
 contrib/postgres_fdw/postgres_fdw.c |  2 +-
 src/backend/commands/explain.c  | 13 +++--
 src/backend/utils/misc/guc.c| 13 +
 src/include/commands/explain.h  |  2 ++
 src/test/regress/expected/explain.out   |  3 +++
 src/test/regress/expected/inherit.out   |  2 +-
 src/test/regress/expected/stats_ext.out |  2 +-
 src/test/regress/pg_regress.c   |  3 ++-
 src/test/regress/sql/explain.sql|  4 
 src/test/regress/sql/inherit.sql|  2 +-
 src/test/regress/sql/stats_ext.sql  |  2 +-
 11 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index fa9a099f13..5cd90d7404 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3112,7 +3112,7 @@ estimate_path_cost_size(PlannerInfo *root,
 		 * values, so don't request params_list.
 		 */
 		initStringInfo(&sql);
-		appendStringInfoString(&sql, "EXPLAIN ");
+		appendStringInfoString(&sql, "EXPLAIN (COSTS)");
 		deparseSelectStmtForRel(&sql, root, foreignrel, fdw_scan_tlist,
 remote_conds, pathkeys,
 fpextra ? fpextra->has_final_sort : false,
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 09f5253abb..e5362fb560 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -154,6 +154,7 @@ static void ExplainJSONLineEnding(ExplainState *es);
 static void ExplainYAMLLineStarting(ExplainState *es);
 static void escape_yaml(StringInfo buf, const char *str);
 
+bool explain_regress = false; /* GUC */
 
 
 /*
@@ -172,6 +173,7 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	ListCell   *lc;
 	bool		timing_set = false;
 	bool		summary_set = false;
+	bool		costs_set = false;
 
 	/* Parse options list. */
 	foreach(lc, stmt->options)
@@ -183,7 +185,11 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 		else if (strcmp(opt->defname, "verbose") == 0)
 			es->verbose = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "costs") == 0)
+		{
+			/* Need to keep track if it was explicitly set to ON */
+			costs_set = true;
 			es->costs = defGetBoolean(opt);
+		}
 		else if (strcmp(opt->defname, "buffers") == 0)
 			es->buffers = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "wal") == 0)
@@ -227,13 +233,16 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	 parser_errposition(pstate, opt->location)));
 	}
 
+	/* if the costs option was not set explicitly, set default value */
+	es->costs = (costs_set) ? es->costs : !explain_regress;

RE: pg_get_publication_tables() output duplicate relid

2021-12-01 Thread houzj.f...@fujitsu.com
On Wed, Dec 1, 2021 3:01 PM Amit Kapila  wrote:
> On Mon, Nov 29, 2021 at 2:37 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Wed, Nov 24, 2021 4:48 PM Amit Kapila 
> > > On Mon, Nov 22, 2021 at 12:55 PM Amit Langote
> > > 
> > > wrote:
> > > >
> > > > On Fri, Nov 19, 2021 at 2:28 PM Amit Kapila
> > > > 
> > > wrote:
> > > > > On Fri, Nov 19, 2021 at 7:19 AM Amit Langote
> > > > > 
> > > wrote:
> > > > > >  As in,
> > > > > > do we know of any replication (initial/streaming) misbehavior
> > > > > > caused by the duplicate partition OIDs in this case or is the
> > > > > > only problem that pg_publication_tables output looks odd?
> > > > >
> > > > > The latter one but I think either we should document this or
> > > > > change it as we can't assume users will follow what subscriber-side
> > > > > code does.
> > > >
> > > > On second thought, I agree that de-duplicating partitions from
> > > > this view is an improvement.
> > > >
> > >
> > > Fair enough. Hou-San, Can you please submit the updated patch after
> > > fixing any pending comments including the test case?
> >
> > Attach the updated patch which address the comments so far.
> >
> > The patch only adds a testcase in publication.sql because the
> > duplicate output doesn't cause unexpected behavior in pub-sub test.
> >
> 
> Thanks, the patch looks good to me. I have slightly changed the commit
> message in the attached. I would like to commit this only in HEAD as there is
> no user complaint about this and it might not stop any user's service unless
> it relies on this view's output for the initial table synchronization.
> 
> What do you think?
I agreed that we can commit only in HEAD.

Best regards,
Hou zj



Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-01 Thread Bossart, Nathan
On 12/1/21, 2:56 PM, "Andres Freund"  wrote:
> On 2021-12-01 20:24:25 +, Bossart, Nathan wrote:
>> I realize adding a new maintenance worker might be a bit heavy-handed,
>> but I think it would be nice to have somewhere to offload tasks that
>> really shouldn't impact startup and checkpointing.  I imagine such a
>> process would come in handy down the road, too.  WDYT?
>
> -1. I think the overhead of an additional worker is disproportional here. And
> there's simplicity benefits in having a predictable cleanup interlock as well.

Another idea I had was to put some upper limit on how much time is
spent on such tasks.  For example, a checkpoint would only spend X
minutes on CheckPointSnapBuild() before giving up until the next one.
I think the main downside of that approach is that it could lead to
unbounded growth, so perhaps we would limit (or even skip) such tasks
only for end-of-recovery and shutdown checkpoints.  Perhaps the
startup tasks could be limited in a similar fashion.
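
As a strawman, inside the cleanup loop such a cap might look something like
this (the budget constant and the function name are hypothetical, and the
real CheckPointSnapBuild() internals are elided):

    /* Hypothetical per-checkpoint budget, in milliseconds. */
    #define SNAPSHOT_CLEANUP_BUDGET_MS 1000

    static void
    CheckPointSnapBuildBounded(void)
    {
        TimestampTz start = GetCurrentTimestamp();
        DIR        *snap_dir;
        struct dirent *snap_de;

        snap_dir = AllocateDir("pg_logical/snapshots");
        while ((snap_de = ReadDir(snap_dir, "pg_logical/snapshots")) != NULL)
        {
            /* Give up once the budget is spent; a later checkpoint (or a
             * dedicated worker) picks up whatever remains. */
            if (TimestampDifferenceExceeds(start, GetCurrentTimestamp(),
                                           SNAPSHOT_CLEANUP_BUDGET_MS))
                break;

            /* ... existing "is this snapshot still needed?" + unlink ... */
        }
        FreeDir(snap_dir);
    }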

> I think particularly for the snapshot stuff it'd be better to optimize away
> unnecessary snapshot files, rather than making the cleanup more asynchronous.

I can look into this.  Any pointers would be much appreciated.

Nathan



Re: [PATCH] Partial foreign key updates in referential integrity triggers

2021-12-01 Thread Paul Martinez
On Wed, Nov 24, 2021 at 10:59 AM Peter Eisentraut
 wrote:
>
> I was looking through this example to see if it could be adapted for the
> documentation.
>
> The way the users table is defined, it appears that "id" is actually
> unique and the primary key ought to be just (id).  The DELETE command
> you show also just uses the id column to find the user, which would be
> bad if the user id is not unique across tenants.  If the id were unique,
> then the foreign key from posts to users would just use the user id
> column and the whole problem of the ON DELETE SET NULL action would go
> away.  If the primary key of users is indeed supposed to be (tenant_id,
> id), then maybe the definition of the users.id column should not use
> serial, and the DELETE command should also look at the tenant_id column.
> (The same question applies to posts.id.)
>
> Also, you initially wrote that this is a denormalized schema.  I think
> if we keep the keys the way you show, then this isn't denormalized.  But
> if we considered users.id globally unique, then there would be
> normalization concerns.
>
> What do you think?

Regarding that specific example, in a production scenario, yes, the
DELETE command should reference both columns. And if used for
documentation both columns should be referenced for clarity/correctness.

I don't think the exact semantics regarding the uniqueness of the id
column are critical. Configuring separate auto-incrementing ids per
tenant would be fairly complex; practically speaking, a single
database with multi-tenant data will use serial to get auto-incrementing
ids (or else use UUIDs to prevent conflicts). The possibility of
conflicting ids likely won't arise until moving to a distributed
environment, at which point queries should only be routed towards a
single shard (where uniqueness will still hold), either by some higher
level application-level context, or by including the tenant_id as part
of the query.

I think there are three separate motivating use cases for using
(tenant_id, id) as primary keys everywhere in a multi-tenant database:

1) I initially encountered this problem while migrating a database to use
Citus, which requires that primary keys (and any other uniqueness
constraints) include the shard key, which forces the primary key to be
(tenant_id, id). I'm not sure what constraints other sharding solutions
enforce, but I don't feel like this feature is over-fitting to Citus'
specific implementation -- it seems like a pretty
reasonable/generalizable solution when sharding data: prefix all your
indexes with the shard key.

2) As I mentioned in my response to Tom in my original proposal thread,
and as Matthias alluded to, using composite primary keys grants
significantly stronger referential integrity by preventing cross-tenant
references. I think this represents a significant leap in the robustness
and security of a schema, to the point where you could consider it a
design flaw to _not_ use composite keys.

https://www.postgresql.org/message-id/flat/CAF%2B2_SFFCjWMpxo0cj3yaqMavcb3Byd0bSG%2B0UPs7RVb8EF99g%40mail.gmail.com#c0e2b37b223bfbf8ece561f02865286c

3) For performance reasons, indexes on foreign keys will often be
prefixed by the tenant_id to speed up index scans. (I think
algorithmically doing an index lookup on (fk_id) vs. (tenant_id, fk_id)
has the same complexity, but repeated index scans, such as when doing a
join, should in practice be more efficient when including a tenant_id,
because most queries will only reference a single tenant so the looked
up values are more likely to be on the same pages.) If a foreign key
only references the id column, then ON DELETE CASCADE triggers will only
use the id column in their DELETE query. Thus, to ensure that deletes
are still fast, you will need to create an index on (fk_id) in addition
to the (tenant_id, fk_id) index, which would cause _significant_
database bloat. (In practice, the presence of both indexes will also
confuse the query planner and now BOTH indexes will take up precious
space in the database's working memory, so it really creates all sorts
of problems.) Using a composite foreign key will ensure that ON DELETE
CASCADE trigger query will use both columns.
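
To make points 2 and 3 concrete, here is a minimal sketch of such a schema
(all names are illustrative):

    CREATE TABLE users (
        tenant_id bigint NOT NULL,
        id        bigserial,
        PRIMARY KEY (tenant_id, id)
    );

    CREATE TABLE posts (
        tenant_id bigint NOT NULL,
        id        bigserial,
        author_id bigint,
        PRIMARY KEY (tenant_id, id),
        -- The composite FK makes cross-tenant references impossible, and
        -- the internal queries issued by the RI triggers (including ON
        -- DELETE CASCADE) filter on both columns, so the single composite
        -- index below suffices.
        FOREIGN KEY (tenant_id, author_id)
            REFERENCES users (tenant_id, id) ON DELETE CASCADE
    );

    CREATE INDEX ON posts (tenant_id, author_id);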

- Paul




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-01 Thread Justin Pryzby
On Wed, Dec 01, 2021 at 05:00:14PM -0500, Melanie Plageman wrote:
> > Also:
> > src/include/miscadmin.h:#define BACKEND_NUM_TYPES (B_LOGGER + 1)
> >
> > I think it's wrong to say NUM_TYPES = B_LOGGER + 1 (which would suggest 
> > using
> > lessthan-or-equal instead of lessthan as you are).
> >
> > Since the valid backend types start at 1 , the "count" of backend types is
> > currently B_LOGGER (13) - not 14.  I think you should remove the "+1" here.
> > Then NROWS (if it continued to exist at all) wouldn't need to subtract one.
> 
> I think what I currently have is technically correct because I start at
> 1 when I am using it as a loop condition. I do waste a spot in the
> arrays I allocate with BACKEND_NUM_TYPES size.
> 
> I was hesitant to make the value of BACKEND_NUM_TYPES == B_LOGGER
> because it seems kind of weird to have it have the same value as the
> B_LOGGER enum.

I don't mean to say that the code is misbehaving - I mean "num_x" means "the
number of x's" - how many there are.  Since the first, valid backend type is 1,
and they're numbered consecutively and without duplicates, then "the number of
backend types" is the same as the value of the last one (B_LOGGER).  It's
confusing if there's a macro called BACKEND_NUM_TYPES which is greater than the
number of backend types.

Most loops say for (int i=0; i<NUM; i++).
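
In miniature, the two conventions at issue (a standalone sketch; the real
BackendType enum is abbreviated, with B_LOGGER = 13 as noted above):

    enum { B_BACKEND = 1, /* ..., */ B_LOGGER = 13 };

    static void
    example(void)
    {
        /* Patch's convention: size B_LOGGER + 1; slot 0 is wasted, but a
         * BackendType value can index the array directly. */
        int direct[B_LOGGER + 1] = {0};

        for (int t = 1; t < B_LOGGER + 1; t++)
            direct[t]++;

        /* Suggested convention: size B_LOGGER; no wasted slot, but every
         * access must subtract one. */
        int packed[B_LOGGER] = {0};

        for (int t = 1; t <= B_LOGGER; t++)
            packed[t - 1]++;
    }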

Re: Data is copied twice when specifying both child and parent table in publication

2021-12-01 Thread Greg Nancarrow
On Wed, Dec 1, 2021 at 10:15 PM Amit Kapila  wrote:
>
> Thanks, your patch looks good to me. I have slightly changed the
> comments and commit message in the attached.
>

I'd suggest tidying the patch comment a bit:

"We publish the child table's data twice for a publication that has both
child and parent tables and is published with publish_via_partition_root
as true. This happens because subscribers will initiate synchronization
using both parent and child tables, since they get both as separate tables
in the initial table list."

Also, perhaps the following additional comment (or similar) could be
added to the pg_publication_tables documentation in catalogs.sgml:

For publications of partitioned tables with publish_via_partition_root
set to true, the partitioned table itself (rather than the individual
partitions) is included in the view.

> I think we should back-patch this but I am slightly worried ...

I'd be in favor of back-patching this.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-01 Thread Andres Freund
Hi,

On 2021-12-01 20:24:25 +, Bossart, Nathan wrote:
> I realize adding a new maintenance worker might be a bit heavy-handed,
> but I think it would be nice to have somewhere to offload tasks that
> really shouldn't impact startup and checkpointing.  I imagine such a
> process would come in handy down the road, too.  WDYT?

-1. I think the overhead of an additional worker is disproportional here. And
there's simplicity benefits in having a predictable cleanup interlock as well.

I think particularly for the snapshot stuff it'd be better to optimize away
unnecessary snapshot files, rather than making the cleanup more asynchronous.

Greetings,

Andres Freund




Re: pg_replslotdata - a tool for displaying replication slot information

2021-12-01 Thread Andres Freund
Hi,

On 2021-11-30 18:43:23 +, Bossart, Nathan wrote:
> On 11/30/21, 6:14 AM, "Peter Eisentraut"  
> wrote:
> > On 23.11.21 06:09, Bharath Rupireddy wrote:
> >> The replication slots data is stored in binary format on the disk under
> >> the pg_replslot/<> directory which isn't human readable. If
> >> the server is crashed/down (for whatever reasons) and unable to come up,
> >> currently there's no way for the user/admin/developer to know what were
> >> all the replication slots available at the time of server crash/down to
> >> figure out what's the restart lsn, xid, two phase info or types of slots
> >> etc.
> >
> > What do you need that for?  You can't do anything with a replication
> > slot while the server is down.

Yes, I don't think there's sufficient need for this.


> I don't have any other compelling use- cases at the moment, but I will say
> that it is typically nice from an administrative standpoint to be able to
> inspect things like this without logging into a running server.

Weighed against the cost of maintaining (including documentation) a separate
tool this doesn't seem sufficient reason.

Greetings,

Andres Freund




Re: Deprecating the term "super-exclusive lock"

2021-12-01 Thread Andres Freund
Hi,

On 2021-11-30 16:21:25 -0800, Peter Geoghegan wrote:
> In commit 276db875, I made vacuumlazy.c consistently use the term
> "cleanup lock", rather than the term "super-exclusive lock". But on
> further reflection I should have gone further, and removed the term
> "super-exclusive lock" from the tree completely. The actual relevant C
> symbols only use the term cleanup.

+1

Greetings,

Andres Freund




Re: BUG #17302: gist index prevents insertion of some data

2021-12-01 Thread Tom Lane
Alexander Korotkov  writes:
> I think losing precision in the gist penalty is generally OK.  Thus,
> it shouldn't be a problem to round a very small value as zero.

Check.

> Probably, we could even tolerate overflow in the gist penalty.

As long as overflow -> infinity, yeah I think so.  Seems like it
was a mistake to insert the overflow-testing functions in this code
at all, and we should simplify it down to plain C addition/subtraction/
multiplication.
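
A standalone illustration of why plain float arithmetic is safe here under
IEEE 754 semantics (a demo program, not PostgreSQL code):

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
        float big = 3.0e38f;
        float small = 1.0e-30f;

        float over = big * big;      /* overflows quietly to +Infinity */
        float under = small * small; /* underflows quietly to 0.0 */

        printf("over = %f (isinf: %d), under = %g\n",
               over, isinf(over), under);
        return 0;
    }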

regards, tom lane




Re: BUG #17302: gist index prevents insertion of some data

2021-12-01 Thread Alexander Korotkov
On Sun, Nov 28, 2021 at 9:07 PM PG Bug reporting form
 wrote:
> The last statement in the following sequence of queries:
> CREATE TABLE point_tbl (f1 point);
> CREATE INDEX gpointind ON point_tbl USING gist (f1);
> INSERT INTO point_tbl SELECT '(0,0)'::point FROM generate_series(1, 1000)
> g;
> INSERT INTO point_tbl VALUES ('(1e-300,-1e-300)'::point);
> produces:
> ERROR:  value out of range: underflow
> (The error occurs inside gist_box_penalty()->box_penalty()->size_box().)
> But the following sequence:
> CREATE TABLE point_tbl (f1 point);
> INSERT INTO point_tbl SELECT '(0,0)'::point FROM generate_series(1, 1000)
> g;
> INSERT INTO point_tbl VALUES ('(1e-300,-1e-300)'::point);
> executes without an error. Moreover, the same index can be created
> successfully after the insertion. The error also depends on the number of
> points inserted in the first step.

I think losing precision in the gist penalty is generally OK.  Thus,
it shouldn't be a problem to round a very small value as zero.
Probably, we could even tolerate overflow in the gist penalty.  Overflow
should be much worse than underflow, because we might consider a very bad
penalty as very good (or vice versa).  But it still affects only index
quality, not correctness.

Any thoughts?

--
Regards,
Alexander Korotkov




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-01 Thread Melanie Plageman
v16 (also rebased) attached

On Fri, Nov 26, 2021 at 4:16 PM Justin Pryzby  wrote:
>
> On Wed, Nov 24, 2021 at 07:15:59PM -0600, Justin Pryzby wrote:
> > There's extraneous blank lines in these functions:
> >
> > +pgstat_sum_io_path_ops
> > +pgstat_report_live_backend_io_path_ops
> > +pgstat_recv_resetsharedcounter
> > +GetIOPathDesc
> > +StrategyRejectBuffer
>
> + an extra blank line in pgstat_reset_shared_counters.

Fixed

>
> In 0005:
>
> monitoring.sgml says that the columns in pg_stat_buffers are integers, but
> they're actually bigint.

Fixed

>
> +   tupstore = tuplestore_begin_heap(true, false, work_mem);
>
> You're passing a constant randomAccess=true to tuplestore_begin_heap ;)

Fixed

>
> +Datum all_values[NROWS][COLUMN_LENGTH];
>
> If you were to allocate this as an array, I think it could actually be 3-D:
> Datum all_values[BACKEND_NUM_TYPES-1][IOPATH_NUM_TYPES][COLUMN_LENGTH];

I've changed this to a 3D array as you suggested and removed the NROWS
macro.

> But I don't know if this is portable across postgres' supported platforms; I
> haven't seen any place which allocates a multidimensional array on the stack,
> nor passes one to a function:
>
> +static inline Datum *
> +get_pg_stat_buffers_row(Datum all_values[NROWS][COLUMN_LENGTH], BackendType 
> backend_type, IOPath io_path)
>
> Maybe the allocation half is okay (I think it's ~3kB), but it seems easier to
> palloc the required amount than to research compiler behavior.

I think passing it to the function is okay. The parameter type would be
adjusted from an array to a pointer.
I am not sure if the allocation on the stack in the body of
pg_stat_get_buffers is too large. (left as is for now)

> That function is only used as a one-line helper, and doesn't use
> multidimensional array access anyway:
>
> +   return all_values[(backend_type - 1) * IOPATH_NUM_TYPES + io_path];

with your suggested changes to a 3D array, it now does use multidimensional
array access

> I think it'd be better as a macro, like (I think)
> #define ROW(backend_type, io_path) all_values[NROWS*(backend_type-1)+io_path]

If I am understanding the idea of the macro, it would change the call
site from this:

+Datum *values = get_pg_stat_buffers_row(all_values,
beentry->st_backendType, io_path);

+values[COLUMN_ALLOCS] += pg_atomic_read_u64(&io_ops->allocs);
+values[COLUMN_FSYNCS] += pg_atomic_read_u64(&io_ops->fsyncs);

to this:

+Datum *row =  ROW(beentry->st_backendType, io_path);

+row[COLUMN_ALLOCS] += pg_atomic_read_u64(&io_ops->allocs);
+row[COLUMN_FSYNCS] += pg_atomic_read_u64(&io_ops->fsyncs);

I usually prefer functions to macros, but I am fine with changing it.
(I did not change it in this version)
I have changed all the local variables from "values" to "row" which
I think is a bit clearer.

> Maybe it should take the column type as a 3 arg.

If I am understanding this idea, the call site would look like this now:
+CELL(beentry->st_backendType, io_path, COLUMN_FSYNCS) +=
pg_atomic_read_u64(&io_ops->fsyncs);
+CELL(beentry->st_backendType, io_path, COLUMN_ALLOCS) +=
pg_atomic_read_u64(&io_ops->allocs);

I don't like this as much. Since this code is inside of a loop, it kind
of makes sense to me that you get a row at the top of the loop and then
fill in all the cells in the row using that "row" variable.

> The enum with COLUMN_LENGTH should be named.

I only use the values in it, so it didn't need a name.

> Or maybe it should be removed, and the enum names moved to comments, like:
>
> +   /* backend_type */
> +   values[val++] = backend_type_desc;
>
> +   /* io_path */
> +   values[val++] = 
> CStringGetTextDatum(GetIOPathDesc(io_path));
>
> +   /* allocs */
> +   values[val++] += io_ops->allocs - resets->allocs;
> ...

I find it easier to understand with it in code instead of as a comment.

> *Note the use of += and not =.

Thanks for seeing this. I have changed this (to use +=).

> Also:
> src/include/miscadmin.h:#define BACKEND_NUM_TYPES (B_LOGGER + 1)
>
> I think it's wrong to say NUM_TYPES = B_LOGGER + 1 (which would suggest using
> lessthan-or-equal instead of lessthan as you are).
>
> Since the valid backend types start at 1 , the "count" of backend types is
> currently B_LOGGER (13) - not 14.  I think you should remove the "+1" here.
> Then NROWS (if it continued to exist at all) wouldn't need to subtract one.

I think what I currently have is technically correct because I start at
1 when I am using it as a loop condition. I do waste a spot in the
arrays I allocate with BACKEND_NUM_TYPES size.

I was hesitant to make the value of BACKEND_NUM_TYPES == B_LOGGER
because it seems kind of weird to have it have the same value as the
B_LOGGER enum.

I am open to changing it. (I didn't change it in this v16).

- Melanie


v16-0006-Remove-superfluous-bgwriter-stats.patch
Description: Binary data


v16-0004-Add-buffers-to-pgstat_reset_shared_counters.patch

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-01 Thread Melanie Plageman
Thanks for the review!

On Wed, Nov 24, 2021 at 8:16 PM Justin Pryzby  wrote:
> You wrote beentry++ at the start of two loops, but I think that's wrong; it
> should be at the end, as in the rest of the file (or as a loop increment).
> BackendStatusArray[0] is actually used (even though its backend has
> backendId==1, not 0).  "MyBEEntry = &BackendStatusArray[MyBackendId - 1];"

I've fixed this in v16 which I will attach to the next email in the thread.

> You could put *_NUM_TYPES as the last value in these enums, like
> NUM_AUXPROCTYPES, NUM_PMSIGNALS, and NUM_PROCSIGNALS:
>
> +#define IOOP_NUM_TYPES (IOOP_WRITE + 1)
> +#define IOPATH_NUM_TYPES (IOPATH_STRATEGY + 1)
> +#define BACKEND_NUM_TYPES (B_LOGGER + 1)

I originally had it as you describe, but based on this feedback upthread
from Álvaro Herrera:

> (It's weird to have enum values that are there just to indicate what's
> the maximum value.  I think that sort of thing is better done by having
> a "#define LAST_THING" that takes the last valid value from the enum.
> That would free you from having to handle the last value in switch
> blocks, for example.  LAST_OCLASS in dependency.h is a precedent on this.)

So, I changed it to use macros.

> There's extraneous blank lines in these functions:
>
> +pgstat_sum_io_path_ops

Fixed

> +pgstat_report_live_backend_io_path_ops

I didn't see one here

> +pgstat_recv_resetsharedcounter

I didn't see one here

> +GetIOPathDesc

Fixed

> +StrategyRejectBuffer

Fixed

> This function is doubly-indented:
>
> +pgstat_send_buffers_reset

Fixed. Thanks for catching this.
I also ran pgindent and manually picked a few of the formatting fixes
that were relevant to code I added.

>
> As support for C99 is now required by postgres, variables can be declared as
> part of various loops.
>
> +   int io_path;
> +   for (io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++)

Fixed this and all other occurrences in my code.

> Rather than memset(), you could initialize msg like this.
> PgStat_MsgIOPathOps msg = {0};
>
> +pgstat_send_buffers(void)
> +{
> +   PgStat_MsgIOPathOps msg;
> +
> +   PgBackendStatus *beentry = MyBEEntry;
> +
> +   if (!beentry)
> +   return;
> +
> +   memset(&msg, 0, sizeof(msg));
>

Though changing the initialization to universal zero initialization
seems to be the correct way, I do get this compiler warning when I make
the change:

pgstat.c:3212:29: warning: suggest braces around initialization of
subobject [-Wmissing-braces]
PgStat_MsgIOPathOps msg = {0};
   ^
   {}
I have seen some comments online that say that this is a spurious
warning present with some versions of both gcc and clang when using
-Wmissing-braces to compile code with universal zero initialization, but
I'm not sure what I should do.
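
FWIW, the warning fires because the message's first member is itself a
struct, and some gcc/clang versions want one brace level per aggregate.  A
sketch of the options (the member layout here is a stand-in, not the real
definitions):

    typedef struct PgStat_MsgHdr
    {
        int         m_type;
        int         m_size;
    } PgStat_MsgHdr;

    typedef struct PgStat_MsgIOPathOps
    {
        PgStat_MsgHdr m_hdr;        /* first member is itself a struct */
        long          counters[4];
    } PgStat_MsgIOPathOps;

    PgStat_MsgIOPathOps a = {{0}};  /* brace the first member: no warning */
    PgStat_MsgIOPathOps b = {0};    /* valid C, but -Wmissing-braces warns
                                     * on some compiler versions */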

v16 attached in next message

- Melanie




Re: Fix inappropriate uses of PG_GETARG_UINT32()

2021-12-01 Thread Bossart, Nathan
On 12/1/21, 10:29 AM, "Peter Eisentraut"  
wrote:
> The attached patch fixes this by accepting the argument using
> PG_GETARG_INT32(), doing some checks, and then casting it to unsigned
> for the rest of the code.
>
> The patch also fixes another inappropriate use in an example in the
> documentation.  These two were the only inappropriate uses I found,
> after we had fixed a few recently.

LGTM

Nathan



Re: row filtering for logical replication

2021-12-01 Thread Peter Smith
On Tue, Nov 30, 2021 at 9:34 PM vignesh C  wrote:
>
...
> Thanks for the updated patch, few comments:
> 1) Should this be changed to include non IMMUTABLE system functions
> are not allowed:
> +   not-null constraints in the WHERE clause. The
> +   WHERE clause does not allow functions or user-defined
> +   operators.
> +  
>

Updated docs in v44 [1]

> 2) We can remove the #if 0 code if we don't plan to keep it in the final 
> patch.
> --- a/src/backend/parser/parse_agg.c
> +++ b/src/backend/parser/parse_agg.c
> @@ -552,11 +552,12 @@ check_agglevels_and_constraints(ParseState
> *pstate, Node *expr)
>
> break;
> case EXPR_KIND_PUBLICATION_WHERE:
> +#if 0
> if (isAgg)
> err = _("aggregate functions are not
> allowed in publication WHERE expressions");
> else
> err = _("grouping operations are not
> allowed in publication WHERE expressions");
> -
> +#endif
>

Fixed in v44 [1]

> 4) Should this be changed, since we error out if publisher without
> replica identify performs delete or update:
> +   The WHERE clause must contain only columns that are
> +   covered  by REPLICA IDENTITY, or are part of the 
> primary
> +   key (when REPLICA IDENTITY is not set), otherwise
> +   DELETE or UPDATE operations will not
> +   be replicated. That's because old row is used and it only contains primary
> +   key or columns that are part of the REPLICA IDENTITY; 
> the
> +   remaining columns are NULL. For 
> INSERT
>
> to:
> +   The WHERE clause must contain only columns that are
> +   covered  by REPLICA IDENTITY, or are part of the 
> primary
> +   key (when REPLICA IDENTITY is not set), otherwise
> +   DELETE or UPDATE operations will be
> +   disallowed on those tables. That's because old row is used and it
> only contains primary
> +   key or columns that are part of the REPLICA IDENTITY; 
> the
> +   remaining columns are NULL. For 
> INSERT
>

Updated docs in v44 [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjxzedJPbSZyb9pd72%2BUrGEj6HagQQbCdO0YJvr7OyJg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-12-01 Thread Peter Smith
On Tue, Nov 30, 2021 at 2:49 PM tanghy.f...@fujitsu.com
 wrote:
>
> On Thursday, November 25, 2021 11:22 AM Peter Smith  
> wrote:
> >
> > Thanks for all the review comments so far! We are endeavouring to keep
> > pace with them.
> >
> > All feedback is being tracked and we will fix and/or reply to everything 
> > ASAP.
> >
> > Meanwhile, PSA the latest set of v42* patches.
> >
> > This version was mostly a patch restructuring exercise but it also
> > addresses some minor review comments in passing.
> >
>
> Thanks for your patch.
> I have two comments on the document in 0001 patch.
>
> 1.
> +   New row is used and it contains all columns. A NULL 
> value
> +   causes the expression to evaluate to false; avoid using columns without
>
> I don't quite understand this sentence 'A NULL value causes the expression to 
> evaluate to false'.
> The expression contains NULL value can also return true. Could you be more 
> specific?
>
> For example:
>
> postgres=# select null or true;
>  ?column?
> --
>  t
> (1 row)
>

Updated publication docs in v44 [1].

>
> 2.
> +   at all then all other filters become redundant. If the subscriber is a
> +   PostgreSQL version before 15 then any row 
> filtering
> +   is ignored.
>
> If the subscriber is a PostgreSQL version before 15, it seems row filtering 
> will
> be ignored only when copying initial data, the later changes will not be 
> ignored in row
> filtering. Should we make it clear in document?

Updated subscription docs in v44 [1].

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjxzedJPbSZyb9pd72%2BUrGEj6HagQQbCdO0YJvr7OyJg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-12-01 Thread Peter Smith
On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila  wrote:
>
> On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian  wrote:
> >
> > Attaching a new patchset v41 which includes changes by both Peter and 
> > myself.
> >
> > Patches v40-0005 and v40-0006 have been merged to create patch
> > v41-0005 which reduces the patches to 6 again.
> > This patch-set contains changes addressing the following review comments:
> >
> > On Mon, Nov 15, 2021 at 5:48 PM Amit Kapila  wrote:
> > >
> > > What I meant was that with this new code we have regressed the old
> > > behavior. Basically, imagine a case where no filter was given for any
> > > of the tables. Then after the patch, we will remove all the old tables
> > > whereas before the patch it will remove the oldrels only when they are
> > > not specified as part of new rels. If you agree with this, then we can
> > > retain the old behavior and for the new tables, we can always override
> > > the where clause for a SET variant of command.
> >
> > Fixed and modified the behaviour to match with what the schema patch
> > implemented.
> >
>
> +
> + /*
> + * If the new relation or the old relation has a where clause,
> + * we need to remove it so that it can be added afresh later.
> + */
> + if (RelationGetRelid(newpubrel->relation) == oldrelid &&
> + newpubrel->whereClause == NULL && rfisnull)
>
> Can't we use _equalPublicationTable() here? It compares the whereClause as 
> well.
>

Fixed in v44 [1]

> Few more comments:
> =
> 0001
...

.
> 3. In the function header comment of rowfilter_walker, you mentioned
> the simple expressions allowed but we should write why we are doing
> so. It has been discussed in detail in various emails in this thread.
> AFAIR, below are the reasons:
> A. We don't want to allow user-defined functions or operators because
> (a) if the user drops such a function/operator or if there is any
> other error via that function, the walsender won't be able to recover
> from such an error even if we fix the function's problem because it
> uses a historic snapshot to access row-filter; (b) any other table
> could be accessed via a function which won't work because of historic
> snapshots in logical decoding environment.
>
> B. We don't allow anything other than immutable built-in functions as those
> can access database and would lead to the problem (b) mentioned in the
> previous paragraph.
>

Updated comment in v44 [1]

> Don't we need to check for user-defined types similar to user-defined
> functions and operators? If not why?

Fixed in v44 [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjxzedJPbSZyb9pd72%2BUrGEj6HagQQbCdO0YJvr7OyJg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-12-01 Thread Peter Smith
On Wed, Nov 24, 2021 at 3:37 AM vignesh C  wrote:
>
> On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian  wrote:
> >
> > Attaching a new patchset v41 which includes changes by both Peter and 
> > myself.
>
> Few comments on v41-0002 patch:
...
> 2) Tab completion completes with "WHERE (" in case of "alter
> publication pub1 add table t1,":
> +   /* ALTER PUBLICATION  SET TABLE  */
> +   /* ALTER PUBLICATION  ADD TABLE  */
> +   else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD",
> "TABLE", MatchAny))
> +   COMPLETE_WITH("WHERE (");
>
> Should this be changed to:
> +   /* ALTER PUBLICATION  SET TABLE  */
> +   /* ALTER PUBLICATION  ADD TABLE  */
> +   else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET|ADD",
> "TABLE", MatchAny) && (!ends_with(prev_wd, ','))
> +   COMPLETE_WITH("WHERE (");
>

Fixed in v44 [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjxzedJPbSZyb9pd72%2BUrGEj6HagQQbCdO0YJvr7OyJg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-12-01 Thread Peter Smith
On Tue, Nov 23, 2021 at 6:59 PM houzj.f...@fujitsu.com
 wrote:
>
> On Tues, Nov 23, 2021 2:27 PM vignesh C  wrote:
> > On Thu, Nov 18, 2021 at 7:04 AM Peter Smith 
> > wrote:
> > >
> > > PSA new set of v40* patches.
> >
> > Few comments:
...
> Another comment about v40-0001 patch:
>
>
> +   char *relname = pstrdup(RelationGetRelationName(rel));
> +
> table_close(rel, ShareUpdateExclusiveLock);
> +
> +   /* Disallow duplicate tables if there are any with 
> row-filters. */
> +   if (t->whereClause || list_member_oid(relids_with_rf, 
> myrelid))
> +   ereport(ERROR,
> +   
> (errcode(ERRCODE_DUPLICATE_OBJECT),
> +errmsg("conflicting or 
> redundant row-filters for \"%s\"",
> +   relname)));
> +   pfree(relname);
>
> Maybe we can do the error check before table_close(), so that we don't need to
> invoke pstrdup() and pfree().
>

Fixed in v44 [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjxzedJPbSZyb9pd72%2BUrGEj6HagQQbCdO0YJvr7OyJg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-12-01 Thread Peter Smith
On Tue, Nov 23, 2021 at 5:27 PM vignesh C  wrote:
>

> 2) Since the error message is because it publishes delete/update
> operations, it should include publish delete/update in the error
> message. Can we change the error message:
> +   if (!bms_is_member(attnum -
> FirstLowInvalidHeapAttributeNumber, context->bms_replident))
> +   {
> +   const char *colname = get_attname(relid, attnum, 
> false);
> +
> +   ereport(ERROR,
> +
> (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> +   errmsg("cannot add relation
> \"%s\" to publication",
> +
> RelationGetRelationName(context->rel)),
> +   errdetail("Row filter column
> \"%s\" is not part of the REPLICA IDENTITY",
> + colname)));
> +   }
>
> To something like:
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> errmsg("cannot add relation \"%s\" to publication because row filter
> column \"%s\" does not have a replica identity and publishes
> deletes/updates",
>RelationGetRelationName(context->rel), colname),
> errhint("To enable deleting/updating from the table, set REPLICA
> IDENTITY using ALTER TABLE")));
>

The "top-up" patch 0005 (see v43*) is already addressing this now.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-01 Thread SATYANARAYANA NARLAPURAM
+1 to the idea. I don't see a reason why the checkpointer has to do all of
that. Keeping checkpoints to the minimal essential work helps servers
recover faster in the event of a crash.

RemoveOldXlogFiles is also an O(N) operation that can at least be avoided
during the end of recovery (CHECKPOINT_END_OF_RECOVERY) checkpoint. When a
sufficient number of WAL files have accumulated and the previous checkpoint
did not get a chance to clean up, this can increase the unavailability of the
server.

RemoveOldXlogFiles(_logSegNo, RedoRecPtr, recptr);



On Wed, Dec 1, 2021 at 12:24 PM Bossart, Nathan  wrote:

> Hi hackers,
>
> Thanks to 61752af, SyncDataDirectory() can make use of syncfs() to
> avoid individually syncing all database files after a crash.  However,
> as noted earlier this year [0], there are still a number of O(n) tasks
> that affect startup and checkpointing that I'd like to improve.
> Below, I've attempted to summarize each task and to offer ideas for
> improving matters.  I'll likely split each of these into its own
> thread, given there is community interest for such changes.
>
> 1) CheckPointSnapBuild(): This function loops through
>pg_logical/snapshots to remove all snapshots that are no longer
>needed.  If there are many entries in this directory, this can take
>a long time.  The note above this function indicates that this is
>done during checkpoints simply because it is convenient.  IIUC
>there is no requirement that this function actually completes for a
>given checkpoint.  My current idea is to move this to a new
>maintenance worker.
> 2) CheckPointLogicalRewriteHeap(): This function loops through
>pg_logical/mappings to remove old mappings and flush all remaining
>ones.  IIUC there is no requirement that the "remove old mappings"
>part must complete for a given checkpoint, but the "flush all
>remaining" portion allows replay after a checkpoint to only "deal
>with the parts of a mapping that have been written out after the
>checkpoint started."  Therefore, I think we should move the "remove
>old mappings" part to a new maintenance worker (probably the same
>one as for 1), and we should consider using syncfs() for the "flush
>all remaining" part.  (I suspect the main argument against the
>latter will be that it could cause IO spikes.)
> 3) RemovePgTempFiles(): This step can delay startup if there are many
>temporary files to individually remove.  This step is already
>optionally done after a crash via the remove_temp_files_after_crash
>GUC.  I propose that we have startup move the temporary file
>directories aside and create new ones, and then a separate worker
>(probably the same one from 1 and 2) could clean up the old files.
> 4) StartupReorderBuffer(): This step deletes logical slot data that
>has been spilled to disk.  This code appears to be written to avoid
>deleting different types of files in these directories, but AFAICT
>there shouldn't be any other files.  Therefore, I think we could do
>something similar to 3 (i.e., move the directories aside during
>startup and clean them up via a new maintenance worker).
>
> I realize adding a new maintenance worker might be a bit heavy-handed,
> but I think it would be nice to have somewhere to offload tasks that
> really shouldn't impact startup and checkpointing.  I imagine such a
> process would come in handy down the road, too.  WDYT?
>
> Nathan
>
> [0] https://postgr.es/m/32B59582-AA6C-4609-B08F-2256A271F7A5%40amazon.com
>
>


Re: row filtering for logical replication

2021-12-01 Thread Peter Smith
On Thu, Sep 23, 2021 at 10:33 PM Tomas Vondra
 wrote:
>
> 2) create_publication.sgml says:
>
> A NULL value causes the expression to evaluate
> to false; avoid using columns without not-null constraints in the
> WHERE clause.
>
> That's not quite correct, I think - doesn't the expression evaluate to
> NULL (which is not TRUE, so it counts as mismatch)?
>
> I suspect this whole paragraph (talking about NULL in old/new rows)
> might be a bit too detailed / low-level for user docs.
>

Updated docs in v44 [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjxzedJPbSZyb9pd72%2BUrGEj6HagQQbCdO0YJvr7OyJg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Column Filtering in Logical Replication

2021-12-01 Thread Alvaro Herrera
Oh, I just noticed that for some reason the test file was lost in the
rebase, so those tests I thought I was running ... I wasn't.  And of
course if I put it back, it fails.

More later.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Crear es tan difícil como ser libre" (Elsa Triolet)




Re: Lots of memory allocated when reassigning Large Objects

2021-12-01 Thread Guillaume Lelarge
On Wed, Dec 1, 2021 at 7:48 PM Tom Lane  wrote:

> Justin Pryzby  writes:
> > @Guillaume: Even if memory use with the patch isn't constant, I imagine
> it's
> > enough to have avoided OOM.
>
> I think it's good enough in HEAD.  In the back branches, the sinval
> queue growth is bad enough that there's still an issue.  Still,
> this is a useful improvement, so I added some comments and pushed it.
>
>
Thanks.


-- 
Guillaume.


Re: Column Filtering in Logical Replication

2021-12-01 Thread Alvaro Herrera
Hi

I took the latest posted patch, rebased on current sources, fixed the
conflicts, and pgindented.  No further changes.  Here's the result.  All
tests are passing for me.  Some review comments that were posted have
not been addressed yet; I'll look into that soon.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming"  (A. Stepanov)
>From bb01d00a9ee8f19bc1d9c36bde6cd0ca178c859c Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 6 Sep 2021 10:34:29 -0300
Subject: [PATCH v6] Add column filtering to logical replication

Add capability to specify column names while linking
the table to a publication, at the time of CREATE or ALTER
publication. This will allow replicating only the specified
columns. Other columns, if any, on the subscriber will be populated
locally or NULL will be inserted if no value is supplied for the column
by the upstream during INSERT.
This facilitates replication to a table on the subscriber
containing only the subscribed/filtered columns.
If no filter is specified, all the columns are replicated.
REPLICA IDENTITY columns are always replicated.
Thus, prohibit adding a relation to a publication if its column
filter does not include the REPLICA IDENTITY columns.
Add a tap test for the same in src/test/subscription.

Author: Rahila Syed 
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektnrh8nj1jf...@mail.gmail.com
---
 src/backend/access/common/relation.c|  22 +
 src/backend/catalog/pg_publication.c|  58 ++-
 src/backend/commands/publicationcmds.c  |   8 +-
 src/backend/nodes/copyfuncs.c   |   1 +
 src/backend/nodes/equalfuncs.c  |   1 +
 src/backend/parser/gram.y   |  30 +-
 src/backend/replication/logical/proto.c | 103 
 src/backend/replication/logical/tablesync.c | 101 +--
 src/backend/replication/pgoutput/pgoutput.c |  77 ---
 src/include/catalog/pg_publication.h|   1 +
 src/include/catalog/pg_publication_rel.h|   4 +
 src/include/nodes/parsenodes.h  |   1 +
 src/include/replication/logicalproto.h  |   6 +-
 src/include/utils/rel.h |   1 +
 14 files changed, 365 insertions(+), 49 deletions(-)

diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c
index 632d13c1ea..05d6fcba26 100644
--- a/src/backend/access/common/relation.c
+++ b/src/backend/access/common/relation.c
@@ -21,12 +21,14 @@
 #include "postgres.h"
 
 #include "access/relation.h"
+#include "access/sysattr.h"
 #include "access/xact.h"
 #include "catalog/namespace.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "storage/lmgr.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/syscache.h"
 
 
@@ -215,3 +217,23 @@ relation_close(Relation relation, LOCKMODE lockmode)
 	if (lockmode != NoLock)
 		UnlockRelationId(&relid, lockmode);
 }
+
+/*
+ * Return a bitmapset of attributes given the list of column names
+ */
+Bitmapset *
+get_table_columnset(Oid relid, List *columns, Bitmapset *att_map)
+{
+	ListCell   *cell;
+
+	foreach(cell, columns)
+	{
+		const char *attname = lfirst(cell);
+		int			attnum = get_attnum(relid, attname);
+
+		if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, att_map))
+			att_map = bms_add_member(att_map,
+	 attnum - FirstLowInvalidHeapAttributeNumber);
+	}
+	return att_map;
+}
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2f82..de5f3266cd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -50,8 +50,12 @@
  * error if not.
  */
 static void
-check_publication_add_relation(Relation targetrel)
+check_publication_add_relation(Relation targetrel, List *targetcols)
 {
+	bool		replidentfull = (targetrel->rd_rel->relreplident == REPLICA_IDENTITY_FULL);
+	Oid			relid = RelationGetRelid(targetrel);
+	Bitmapset  *idattrs;
+
 	/* Must be a regular or partitioned table */
 	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
 		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
@@ -82,6 +86,36 @@ check_publication_add_relation(Relation targetrel)
  errmsg("cannot add relation \"%s\" to publication",
 		RelationGetRelationName(targetrel)),
  errdetail("This operation is not supported for unlogged tables.")));
+
+	/*
+	 * Cannot specify a column filter when REPLICA IDENTITY is FULL, or if the
+	 * column filter does not contain the REPLICA IDENTITY columns.
+	 */
+	if (targetcols != NIL)
+	{
+		if (replidentfull)
+			ereport(ERROR,
+	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+	 errmsg("cannot add relation \"%s\" to publication",
+			RelationGetRelationName(targetrel)),
+	 errdetail("Cannot have column filter with REPLICA IDENTITY FULL")));
+		else
+		{
+			Bitmapset  *filtermap = NULL;
+
+			idattrs = RelationGetIndexAttrBitmap(targetrel,
+												 INDEX_ATTR_BITMAP_IDENTITY_KEY);

O(n) tasks cause lengthy startups and checkpoints

2021-12-01 Thread Bossart, Nathan
Hi hackers,

Thanks to 61752af, SyncDataDirectory() can make use of syncfs() to
avoid individually syncing all database files after a crash.  However,
as noted earlier this year [0], there are still a number of O(n) tasks
that affect startup and checkpointing that I'd like to improve.
Below, I've attempted to summarize each task and to offer ideas for
improving matters.  I'll likely split each of these into its own
thread, given there is community interest for such changes.

1) CheckPointSnapBuild(): This function loops through
   pg_logical/snapshots to remove all snapshots that are no longer
   needed.  If there are many entries in this directory, this can take
   a long time.  The note above this function indicates that this is
   done during checkpoints simply because it is convenient.  IIUC
   there is no requirement that this function actually completes for a
   given checkpoint.  My current idea is to move this to a new
   maintenance worker.
2) CheckPointLogicalRewriteHeap(): This function loops through
   pg_logical/mappings to remove old mappings and flush all remaining
   ones.  IIUC there is no requirement that the "remove old mappings"
   part must complete for a given checkpoint, but the "flush all
   remaining" portion allows replay after a checkpoint to only "deal
   with the parts of a mapping that have been written out after the
   checkpoint started."  Therefore, I think we should move the "remove
   old mappings" part to a new maintenance worker (probably the same
   one as for 1), and we should consider using syncfs() for the "flush
   all remaining" part.  (I suspect the main argument against the
   latter will be that it could cause IO spikes.)
3) RemovePgTempFiles(): This step can delay startup if there are many
   temporary files to individually remove.  This step is already
   optionally done after a crash via the remove_temp_files_after_crash
   GUC.  I propose that we have startup move the temporary file
   directories aside and create new ones, and then a separate worker
   (probably the same one from 1 and 2) could clean up the old files;
   a sketch of this idea follows the list.
4) StartupReorderBuffer(): This step deletes logical slot data that
   has been spilled to disk.  This code appears to be written to avoid
   deleting different types of files in these directories, but AFAICT
   there shouldn't be any other files.  Therefore, I think we could do
   something similar to 3 (i.e., move the directories aside during
   startup and clean them up via a new maintenance worker).
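
A minimal sketch of the move-aside idea from item 3 (all names here are
hypothetical; none of this exists in the tree):

    /* Called once per temporary-file directory during startup. */
    static void
    StashPgTempDir(const char *tmpdirname)
    {
        char        stash[MAXPGPATH];

        snprintf(stash, sizeof(stash), "%s.stale", tmpdirname);

        /* Startup pays only for a rename plus a mkdir ... */
        if (rename(tmpdirname, stash) != 0 && errno != ENOENT)
            ereport(LOG,
                    (errmsg("could not rename \"%s\" to \"%s\": %m",
                            tmpdirname, stash)));
        (void) MakePGDirectory(tmpdirname);

        /* ... and a maintenance worker later removes "stash" and its
         * contents at leisure. */
    }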

I realize adding a new maintenance worker might be a bit heavy-handed,
but I think it would be nice to have somewhere to offload tasks that
really shouldn't impact startup and checkpointing.  I imagine such a
process would come in handy down the road, too.  WDYT?

Nathan

[0] https://postgr.es/m/32B59582-AA6C-4609-B08F-2256A271F7A5%40amazon.com



Re: SKIP LOCKED assert triggered

2021-12-01 Thread Simon Riggs
On Wed, 1 Dec 2021 at 14:33, Bossart, Nathan  wrote:
>
> On 11/12/21, 8:56 AM, "Simon Riggs"  wrote:
> > The combination of these two statements in a transaction hits an
> > Assert in heapam.c at line 4770 on REL_14_STABLE
>
> I've been unable to reproduce this.  Do you have any tips for how to
> do so?  Does there need to be some sort of concurrent workload?

That path is only ever taken when there are multiple sessions, and as
I said, pgbench finds this reliably. I guess I didn't say "use -c 2"

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: SKIP LOCKED assert triggered

2021-12-01 Thread Bossart, Nathan
On 11/12/21, 8:56 AM, "Simon Riggs"  wrote:
> The combination of these two statements in a transaction hits an
> Assert in heapam.c at line 4770 on REL_14_STABLE

I've been unable to reproduce this.  Do you have any tips for how to
do so?  Does there need to be some sort of concurrent workload?

Nathan



Re: Non-superuser subscription owners

2021-12-01 Thread Mark Dilger



> On Dec 1, 2021, at 5:36 AM, Amit Kapila  wrote:
> 
> On Wed, Dec 1, 2021 at 2:12 AM Jeff Davis  wrote:
>> 
>> On Tue, 2021-11-30 at 17:25 +0530, Amit Kapila wrote:
>>> I think it would be better to do it before we allow subscription
>>> owners to be non-superusers.
>> 
>> There are a couple other things to consider before allowing non-
>> superusers to create subscriptions anyway. For instance, a non-
>> superuser shouldn't be able to use a connection string that reads the
>> certificate file from the server unless they also have
>> pg_read_server_files privs.
>> 
> 
> Isn't allowing to create subscriptions via non-superusers and allowing
> to change the owner two different things? I am under the impression
> that the latter one is more towards allowing the workers to apply
> changes with a non-superuser role.

The short-term goal is to have logical replication workers respect the 
privileges of the role which owns the subscription.

The long-term work probably includes creating a predefined role with permission 
to create subscriptions, and the ability to transfer those subscriptions to 
roles who might be neither superuser nor members of any particular predefined 
role; the idea being that logical replication subscriptions can be established 
without any superuser involvement, and may thereafter run without any special 
privilege.

The more recent patches on this thread are not as ambitious as the earlier 
patch-sets.  We are no longer trying to support transferring subscriptions to 
non-superusers.

Right now, on HEAD, if a subscription owner has superuser revoked, the
subscription can continue to operate as superuser insofar as its replication
actions are concerned.  That seems like a pretty big security hole.

This patch mostly plugs that hole by adding permissions checks, so that a 
subscription owned by a role who has privileges revoked cannot (for the most 
part) continue to act under the old privileges.

There are two problematic edge cases that can occur after transfer of 
ownership.  Remember, the new owner is required to be superuser for the 
transfer of ownership to occur.

1) A subscription is transferred to a new owner, and the new owner then has 
privilege revoked.

2) A subscription is transferred to a new owner, and then the old owner has 
privileges increased.

In both cases, a currently running logical replication worker may finish a 
transaction in progress acting with the current privileges of the old owner.  
The clearest solution is, as you suggest, to refuse transfer of ownership of 
subscriptions that are enabled.

Doing so will create a failure case for REASSIGN OWNED BY.  Will that be ok?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Lots of memory allocated when reassigning Large Objects

2021-12-01 Thread Tom Lane
Justin Pryzby  writes:
> @Guillaume: Even if memory use with the patch isn't constant, I imagine it's
> enough to have avoided OOM.

I think it's good enough in HEAD.  In the back branches, the sinval
queue growth is bad enough that there's still an issue.  Still,
this is a useful improvement, so I added some comments and pushed it.

regards, tom lane




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-01 Thread Bossart, Nathan
On 12/1/21, 8:27 AM, "David Steele"  wrote:
> On 11/30/21 18:31, Bossart, Nathan wrote:
>> Do you think it's still worth trying to make it safe, or do you think
>> we should just remove exclusive mode completely?
>
> My preference would be to remove it completely, but I haven't gotten a
> lot of traction so far.

In this thread, I count 6 people who seem alright with removing it,
and 2 who might be opposed, although I don't think anyone has
explicitly stated they are against it.

Nathan



Fix inappropriate uses of PG_GETARG_UINT32()

2021-12-01 Thread Peter Eisentraut
I noticed that the chr() function uses PG_GETARG_UINT32() to get its 
argument, even though the argument is a (signed) int.  So you get some 
slightly silly behavior like this:


=> select chr(-333);
ERROR:  54000: requested character too large for encoding: -333

The attached patch fixes this by accepting the argument using 
PG_GETARG_INT32(), doing some checks, and then casting it to unsigned 
for the rest of the code.
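
With the patch, the same call should fail with a saner message (expected
behavior per the new checks, not captured output):

    => select chr(-333);
    ERROR:  character number must be positive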


The patch also fixes another inappropriate use in an example in the 
documentation.  These two were the only inappropriate uses I found, 
after we had fixed a few recently.
>From 4b14d698486d6c46c3d91c9bf5c380f8acddb095 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 1 Dec 2021 19:18:08 +0100
Subject: [PATCH] Fix inappropriate uses of PG_GETARG_UINT32()

The chr() function used PG_GETARG_UINT32() even though the argument is
declared as (signed) integer.  As a result, you can pass negative
arguments to this function and it internally interprets them as
positive.  Ultimately ends up being harmless, but it seems wrong, so
fix this and rearrange the internal error checking a bit to
accommodate this.

Another case was in the documentation, where example code used
PG_GETARG_UINT32() with an argument declared as signed integer.
---
 doc/src/sgml/xfunc.sgml   |  2 +-
 src/backend/utils/adt/oracle_compat.c | 33 ---
 2 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 584d389d45..0d55909734 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3193,7 +3193,7 @@ Returning Sets
 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
 /* total number of tuples to be returned */
-funcctx->max_calls = PG_GETARG_UINT32(0);
+funcctx->max_calls = PG_GETARG_INT32(0);
 
 /* Build a tuple descriptor for our result type */
 if (get_call_result_type(fcinfo, NULL, ) != TYPEFUNC_COMPOSITE)
diff --git a/src/backend/utils/adt/oracle_compat.c b/src/backend/utils/adt/oracle_compat.c
index f737aa6fbd..ec99f2b738 100644
--- a/src/backend/utils/adt/oracle_compat.c
+++ b/src/backend/utils/adt/oracle_compat.c
@@ -999,10 +999,26 @@ ascii(PG_FUNCTION_ARGS)
 Datum
 chr(PG_FUNCTION_ARGS)
 {
-	uint32		cvalue = PG_GETARG_UINT32(0);
+	int32		arg = PG_GETARG_INT32(0);
+	uint32		cvalue;
 	text	   *result;
 	int			encoding = GetDatabaseEncoding();
 
+	/*
+	 * Error out on arguments that make no sense or that we can't validly
+	 * represent in the encoding.
+	 */
+	if (arg < 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("character number must be positive")));
+	else if (arg == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+				 errmsg("null character not permitted")));
+
+	cvalue = arg;
+
 	if (encoding == PG_UTF8 && cvalue > 127)
 	{
 		/* for Unicode we treat the argument as a code point */
@@ -1017,7 +1033,7 @@ chr(PG_FUNCTION_ARGS)
 		if (cvalue > 0x0010ffff)
 			ereport(ERROR,
 					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-					 errmsg("requested character too large for encoding: %d",
+					 errmsg("requested character too large for encoding: %u",
 							cvalue)));
 
 		if (cvalue > 0xffff)
@@ -1058,28 +1074,19 @@ chr(PG_FUNCTION_ARGS)
 		if (!pg_utf8_islegal(wch, bytes))
 			ereport(ERROR,
 					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-					 errmsg("requested character not valid for encoding: %d",
+					 errmsg("requested character not valid for encoding: %u",
 							cvalue)));
 	}
 	else
 	{
 		bool		is_mb;
 
-		/*
-		 * Error out on arguments that make no sense or that we can't validly
-		 * represent in the encoding.
-		 */
-		if (cvalue == 0)
-			ereport(ERROR,
-					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-					 errmsg("null character not permitted")));
-
 		is_mb = pg_encoding_max_length(encoding) > 1;
 
 		if ((is_mb && (cvalue > 127)) || (!is_mb && (cvalue > 255)))
 			ereport(ERROR,

Re: Proposal: allow database-specific role memberships

2021-12-01 Thread Kenaniah Cerny
Thank you for the advice!

Attached is a rebased version of the patch that omits catversion.h in order
to avoid conflicts.

On Wed, Nov 17, 2021 at 6:17 AM Daniel Gustafsson  wrote:

> > On 28 Oct 2021, at 21:39, Kenaniah Cerny  wrote:
>
> > Thank you Asif. A rebased patch is attached.
>
> This patch fails to apply yet again, this time due to a collision in
> catversion.h.  I think it's fine to omit the change in catversion.h as it's
> likely to repeatedly cause conflicts, and instead just mention it on the
> thread.  Any committer picking it up will know to perform the change
> anyways,
> so leaving it out can keep the patch from conflicting.
>
> --
> Daniel Gustafsson   https://vmware.com/
>
>


database-role-memberships-v4.patch
Description: Binary data


Re: Reserve prefixes for loaded libraries proposal

2021-12-01 Thread Florin Irion
>
> Committed.
>
> I made two notable changes:  I renamed the function, since it looked
> like EmitWarningsOnPlaceholders() but was doing something not analogous.

> Also, you had in your function
>
>  strncmp(varName, var->name, varLen)
>
> probably copied from EmitWarningsOnPlaceholders(), but unlike there, we
> want to compare the whole string here, and this would potentially do
> something wrong if there were a GUC setting that was a substring of the
> name of another one.
>

Yeah, it makes sense!

Thank you very much!
Thank you for the changes and thank you for committing it!

Cheers,
Florin


-- 
*Florin Irion*

*www.enterprisedb.com *
Cel: +39 328 5904901
Tel: +39 0574 054953
Via Alvise Cadamosto, 47
59100, Prato, PO
Italia


Re: Correct handling of blank/commented lines in PSQL interactive-mode history

2021-12-01 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Nov-29, Tom Lane wrote:
>> After some further hackery, here's a set of patches that I think
>> might be acceptable.  They're actually fairly independent, although
>> they touch different aspects of the same behavior.

> I tried the collection and I find the overall behavior quite convenient.
> I think it works just as I wish it would.

Hearing no further comments, pushed.

regards, tom lane




Re: RecoveryInProgress() has critical side effects

2021-12-01 Thread Alvaro Herrera
There are a couple typos.  "uninitalized," in one place.

Also,

+* If this is either a bootstrap process nor a standalone backend, start

Is "either .. nor" correct?  It looks very wrong to me.  I think you
meant "either .. or".

Overall, this patch seems mostly about documenting some undesirable
existing behavior.  Is part of your plan to get such things fixed/
replaced entirely?

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"We're here to devour each other alive"(Hobbes)




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-01 Thread David Steele

On 11/30/21 18:31, Bossart, Nathan wrote:

On 11/30/21, 2:58 PM, "David Steele"  wrote:

I did figure out how to keep the safe part of exclusive backup (not
having to maintain a connection) while removing the dangerous part
(writing backup_label into PGDATA), but it was a substantial amount of
work and I felt that it had little chance of being committed.


Do you think it's still worth trying to make it safe, or do you think
we should just remove exclusive mode completely?


My preference would be to remove it completely, but I haven't gotten a 
lot of traction so far.



Attaching the thread [1] that I started with a patch to remove exclusive
backup for reference.


Ah, good, some light reading.  :)


Sure, if you say so!

Regards,
-David




Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

2021-12-01 Thread Alvaro Herrera
On 2021-Dec-01, Bharath Rupireddy wrote:

> The active_pid of ReplicationSlot structure, which tells whether a
> replication slot is active or inactive, isn't persisted to the disk
> i.e has no entry in ReplicationSlotPersistentData structure. Isn't it
> better if we add that info to ReplicationSlotPersistentData structure
> and persist to the disk? This will help to know what were the inactive
> replication slots in case the server goes down or crashes for some
> reason. Currently, we don't have a way to interpret the replication
> slot info in the disk but there's a patch for pg_replslotdata tool at
> [1]. This way, one can figure out the reasons for the server
> down/crash and figure out which replication slots to remove to bring
> the server up and running without touching the other replication
> slots.

I think the PIDs are log-worthy for sure, but it's not clear to me that
it is desirable to write them to the persistent state file.  In case of
crashes, the log should serve just fine to aid root cause investigation
-- in fact even better than the persistent file, where the data would be
lost as soon as the next client acquires that slot.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-01 Thread Andrew Dunstan


On 11/30/21 17:26, Tom Lane wrote:
> "Bossart, Nathan"  writes:
>> It looks like the exclusive way has been marked deprecated in all
>> supported versions along with a note that it will eventually be
>> removed.  If it's not going to be removed out of fear of breaking
>> backward compatibility, I think the documentation should be updated to
>> say that.  However, unless there is something that is preventing users
>> from switching to the non-exclusive approach, I think it is reasonable
>> to begin thinking about removing it.
> If we're willing to outright remove it, I don't have any great objection.
> My original two cents was that we shouldn't put effort into improving it;
> but removing it isn't that.
>
>   


+1


Let's just remove it. We already know it's a footgun, and there's been
plenty of warning.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-01 Thread David Steele

On 11/30/21 19:54, Michael Paquier wrote:

On Tue, Nov 30, 2021 at 05:58:15PM -0500, David Steele wrote:

I did figure out how to keep the safe part of exclusive backup (not having
to maintain a connection) while removing the dangerous part (writing
backup_label into PGDATA), but it was a substantial amount of work and I
felt that it had little chance of being committed.


Which was, I guess, done by storing the backup_label contents within a
file different than backup_label, still maintained in the main data
folder to ensure that it gets included in the backup?


That, or emit it from pg_start_backup() so the user can write it 
wherever they please. That would include writing it into PGDATA if they 
really wanted to, but that would be on them and the default behavior 
would be safe. The problem with this is if the user does not 
rename/supply backup_label on restore then they will get corruption and 
not know it.


Here's another idea. Since the contents of pg_wal are not supposed to be 
copied, we could add a file there to indicate that the cluster should 
remove backup_label on restart. Our instructions also say to remove the 
contents of pg_wal on restore if they were originally copied, so 
hopefully one of the two would happen. But, again, if they fail to 
follow the directions it would lead to corruption.


Order would be important here. When starting the backup the proper order 
would be to write pg_wal/backup_in_progress and then backup_label. When 
stopping the backup they would be removed in the reverse order.


On a restart if both are present then delete both in the correct order 
and start crash recovery using the info in pg_control. If only 
backup_label is present then go into recovery using the info from 
backup_label.


It's possible for pg_wal/backup_in_progress to be present by itself if
the server crashes after deleting backup_label but before deleting 
pg_wal/backup_in_progress. In that case the server should simply remove 
it on start and go into crash recovery using the info from pg_control.
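
In rough terms the startup decision would be the following -- a
hypothetical sketch of the proposal, not existing code (BACKUP_LABEL_FILE
and durable_unlink() are real; the marker file name and this function
are made up):

#include "postgres.h"
#include <unistd.h>
#include "access/xlog.h"		/* BACKUP_LABEL_FILE */
#include "storage/fd.h"			/* durable_unlink() */

#define BACKUP_IN_PROGRESS_FILE "pg_wal/backup_in_progress"

static void
decide_startup_mode(void)
{
	bool		have_label = (access(BACKUP_LABEL_FILE, F_OK) == 0);
	bool		have_marker = (access(BACKUP_IN_PROGRESS_FILE, F_OK) == 0);

	if (have_label && have_marker)
	{
		/* crashed mid-backup on this node: remove both, in reverse order */
		durable_unlink(BACKUP_LABEL_FILE, FATAL);
		durable_unlink(BACKUP_IN_PROGRESS_FILE, FATAL);
		/* ... then crash recovery using the info in pg_control ... */
	}
	else if (have_label)
	{
		/* restored from a backup: recover using the info in backup_label */
	}
	else if (have_marker)
	{
		/* crashed between the two unlinks: clean up, then crash recovery */
		durable_unlink(BACKUP_IN_PROGRESS_FILE, FATAL);
	}
	/* else: normal startup */
}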


The advantage of this idea is that it does not change the current 
instructions as far as I can see. If the user is already following them, 
they'll be fine. If they are not, then they'll need to start doing so.


Of course, none of this affects users who are using non-exclusive 
backup, which I do hope covers the majority by now.


Thoughts?

Regards,
-David




Re: increase size of pg_commit_ts buffers

2021-12-01 Thread Alvaro Herrera
On 2021-Nov-30, Simon Riggs wrote:

> On Fri, 12 Nov 2021 at 17:39, Tomas Vondra
>  wrote:
> 
> > So +1 to just get this committed, as it is.
> 
> This is a real issue affecting Postgres users. Please commit this soon.

Uh ouch, I had forgotten that this patch was mine (blush).  Thanks for
the ping, I pushed it yesterday.  I added a comment.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Non-decimal integer literals

2021-12-01 Thread Peter Eisentraut

On 25.11.21 18:51, John Naylor wrote:
If we're going to change the comment anyway, "the parser" sounds more 
natural. Aside from that, 0001 and 0002 can probably be pushed now, if 
you like.


done


--- a/src/interfaces/ecpg/preproc/pgc.l
+++ b/src/interfaces/ecpg/preproc/pgc.l
@@ -365,6 +365,10 @@ real ({integer}|{decimal})[Ee][-+]?{digit}+
  realfail1 ({integer}|{decimal})[Ee]
  realfail2 ({integer}|{decimal})[Ee][-+]

+integer_junk {integer}{ident_start}
+decimal_junk {decimal}{ident_start}
+real_junk {real}{ident_start}

A comment might be good here to explain these are only in ECPG for 
consistency with the other scanners. Not really important, though.


Yeah, it's a bit weird that not all the symbols are used in ecpg.  I'll 
look into explaining this better.



0006

+{hexfail} {
+ yyerror("invalid hexadecimal integer");
+ }
+{octfail} {
+ yyerror("invalid octal integer");
+ }
-{decimal} {
+{binfail} {
+ yyerror("invalid binary integer");
+ }

It seems these could use SET_YYLLOC(), since the error cursor doesn't 
match other failure states:


ok

We might consider some tests for ECPG since lack of coverage has been a 
problem.


right

Also, I'm curious: how does the spec work as far as deciding the year of 
release, or feature-freezing of new items?


The schedule has recently been extended again, so the current plan is 
for SQL:202x with x=3, with feature freeze in mid-2022.


So the feature patches in this thread are in my mind now targeting 
PG15+1.  But the preparation work (up to v5-0005, and some other number 
parsing refactoring that I'm seeing) could be considered for PG15.


I'll move this to the next CF and come back with an updated patch set in 
a little while.





Re: Update stale code comment in CheckpointerMain()

2021-12-01 Thread Robert Haas
On Wed, Dec 1, 2021 at 8:24 AM Daniel Gustafsson  wrote:
> >>> The attached patch updates the code comment which is no longer true
> >>> after commit # 4a92a1c3d1c361ffb031ed05bf65b801241d7cdd
> >> Agreed, but looking at this shouldn't we also tweak the comment on
> >> RecoveryInProgress() as per the attached v2 diff?
> > Yes, we should -- diff looks good to me, thanks.
> Thanks for confirming, I've applied this to master.

Thanks both of you.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Non-decimal integer literals

2021-12-01 Thread Peter Eisentraut

On 25.11.21 16:46, Zhihong Yu wrote:

For patch 3,

+int64
+pg_strtoint64(const char *s)

How about naming the above function pg_scanint64()?
pg_strtoint64xx() can be named pg_strtoint64() - this would align with 
existing function:


pg_strtouint64(const char *str, char **endptr, int base)


That would be one way.  But the existing pg_strtointNN() functions are 
pretty widely used, so I would tend toward finding another name for the 
less used pg_strtouint64(), maybe pg_strtouint64x() ("extended").
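
For reference, the endptr/base variant is used strtol-style; a hedged
usage sketch (my own illustration, not code from the tree):

#include "postgres.h"
#include <errno.h>
#include "utils/builtins.h"		/* pg_strtouint64() */

static uint64
parse_uint64_strict(const char *str)
{
	char	   *endptr;
	uint64		val;

	errno = 0;
	val = pg_strtouint64(str, &endptr, 10);

	/* reject overflow, empty input, and trailing junk */
	if (errno != 0 || endptr == str || *endptr != '\0')
		elog(ERROR, "invalid uint64 input: \"%s\"", str);
	return val;
}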






Re: RecoveryInProgress() has critical side effects

2021-12-01 Thread Robert Haas
On Sat, Nov 20, 2021 at 3:22 PM Heikki Linnakangas  wrote:
> But here's yet another idea: We could initialize RedoRecPtr and
> doPageWrites in InitXLogInsert(). It would seem like a pretty natural
> place for it.

I considered that, but it seemed like an abstraction violation to me,
because that code is part of xloginsert.c, which is mostly concerned
with assembly of write-ahead log records, not the associated
shared-memory state. Also, I don't think there's any guarantee that
the state in shared memory is initialized at the time we call that
function, so we might just be copying uninitialized memory into other
uninitialized memory.

> PS. typo in patch v2: s/prepard/prepared/

Thanks, fixed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v3-0001-Remove-InitXLOGAccess.patch
Description: Binary data


Re: Can I assume relation would not be invalid during from ExecutorRun to ExecutorEnd

2021-12-01 Thread Robert Haas
On Tue, Nov 30, 2021 at 7:50 PM Andy Fan  wrote:
> I think you misunderstand me,  I argued with the answer because after I got 
> the
> answer and I rethink my problem, I found my question description is not 
> accurate
> enough,  so I improved the question and willing discussion again. My 
> exception was
> things will continue with something like this:
> 1. In your new described situation,  your solution still does not work 
> because ...
> 2. In your new described situation,  the solution would work for sure
> 3.  your situation is still not cleared enough.

I mean, it's clear to me that you can make a new table get opened at
any time in the execution of a query. Just use CASE or an IF statement
inside of a stored procedure to do nothing for the first N calls
and then on call N+1 access a new relation. And as soon as that
happens, you can AcceptInvalidationMessages(), which can cause
relcache entries to be destroyed or rebuilt. If the relation is open,
the relcache entry can't be destroyed altogether, but it can be
rebuilt: see RelationClearRelation(). Whether that's a problem for
what you are doing I don't know. But the overall point is that access
to a new relation can happen at any point in the query -- and as soon
as it does, we will accept ALL pending invalidation messages for ALL
relations regardless of what locks anyone holds on anything.

So it's generally a mistake to assume that relcache entries are
"stable" across large stretches of code. They are in fact stable in a
certain sense - if we have the relation open, we hold a reference
count on it, and so the Relation pointer itself will remain valid. But
the data it points to can change in various ways, and different
members of the RelationData struct are handled differently. Separately
from the reference count, the heavyweight lock that we also hold on
the relation as a condition of opening it prevents certain kinds of
changes, so that even if the relation cache entry is rebuilt, certain
particular fields will be unaffected. Which fields are protected in
this way will depend on what kind of lock is held. It's hard to speak
in general terms. The best advice I can give you is (1) look exactly
what RelationClearRelation() is going to do to the fields you care
about if a rebuild happens, (2) err on the side of assuming that
things can change under you, and (3) try running your code under
debug_discard_caches = 1. It will be slow that way, but it's pretty
effective in finding places where you've made unsafe assumptions.
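
As a concrete illustration (hypothetical extension code, not anything
in the tree): copy out the scalar fields you need while the relation is
open, instead of stashing interior pointers like rel->rd_rel for later.

#include "postgres.h"
#include "access/table.h"
#include "storage/lockdefs.h"
#include "utils/rel.h"

static void
remember_relation_shape(Oid relid)
{
	Relation	rel = table_open(relid, AccessShareLock);
	char		relkind;
	int			natts;

	/*
	 * Safe: scalar copies taken while the refcount and lock are held.
	 * Unsafe: keeping rel->rd_rel or rel->rd_att around, since a rebuild
	 * in RelationClearRelation() may free and replace them whenever
	 * invalidation messages are accepted.
	 */
	relkind = rel->rd_rel->relkind;
	natts = RelationGetNumberOfAttributes(rel);

	table_close(rel, AccessShareLock);

	elog(DEBUG1, "relkind %c, %d attributes", relkind, natts);
}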

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: SQL/JSON: functions

2021-12-01 Thread Andrew Dunstan


On 12/1/21 06:13, Himanshu Upadhyaya wrote:
> Hi Andrew,
>
> The latest version (v59) is not applying on head.
> Could you please help to rebase?
>
>


(Please don't top-post on PostgreSQL lists)


The patches apply for me and for the cfbot. I'm not sure what's not
working for you. I apply them using "patch -p 1 < $patchfile"


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Reserve prefixes for loaded libraries proposal

2021-12-01 Thread Peter Eisentraut

On 07.10.21 14:03, Florin Irion wrote:

I adjusted a bit the code and configured my mail client to
send patch attachments appropiately(hopefully). :)

So here is my second version.


Committed.

I made two notable changes:  I renamed the function, since it looked 
like EmitWarningsOnPlaceholders() but was doing something not analogous. 
Also, you had in your function


strncmp(varName, var->name, varLen)

probably copied from EmitWarningsOnPlaceholders(), but unlike there, we 
want to compare the whole string here, and this would potentially do 
something wrong if there were a GUC setting that was a substring of the 
name of another one.






Re: [PATCH] Partial foreign key updates in referential integrity triggers

2021-12-01 Thread Matthias van de Meent
On Wed, 1 Dec 2021 at 11:33, Peter Eisentraut
 wrote:
>
> On 23.11.21 05:44, Paul Martinez wrote:
> > Updated patch attached. Thanks for taking a look so quickly!
>
> This patch looks pretty much okay to me.  As I wrote in another message
> in this thread, I'm having some doubts about the proper use case.  So
> I'm going to push this commit fest entry to the next one, so we can
> continue that discussion.

The use case of the original mail "foreign keys are guaranteed to not
be cross-tenant" seems like a good enough use case to me?

The alternative to the discriminator column approach to separating
tenant data even when following referential integrity checks would be
maintaining a copy of the table for each tenant, but this won't work
as well due to (amongst others) syscache bloat, prepared statements
being significantly less effective, and DDL operations now growing
linearly with the number of tenants in the system.

Kind regards,

Matthias van de Meent




Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-12-01 Thread Daniel Gustafsson
> On 3 Nov 2021, at 12:02, Daniel Gustafsson  wrote:
> 
> This patch doesn't compile on Windows according to Appveyor, seemingly because
> of a syntax error in the new win32ntdll.h file, but the MSVC logs are hard on
> the eye so it might be unrelated.

As the thread has stalled with a patch that doesn't apply, I'm marking this
patch Returned with Feedback.  Please feel free to resubmit when a new patch is
ready.

--
Daniel Gustafsson   https://vmware.com/





Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

2021-12-01 Thread Bharath Rupireddy
Hi,

The active_pid of ReplicationSlot structure, which tells whether a
replication slot is active or inactive, isn't persisted to the disk,
i.e., it has no entry in the ReplicationSlotPersistentData structure. Isn't it
better if we add that info to the ReplicationSlotPersistentData structure
and persist it to the disk? This will help to know which replication
slots were inactive in case the server goes down or crashes for some
reason. Currently, we don't have a way to interpret the replication
slot info in the disk but there's a patch for pg_replslotdata tool at
[1]. This way, one can figure out the reasons for the server
down/crash and figure out which replication slots to remove to bring
the server up and running without touching the other replication
slots.
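
Concretely, the idea is roughly the following (a sketch only; the real
struct lives in replication/slot.h, its existing on-disk fields are
elided here, and the added field's name is made up):

#include "postgres.h"
#include <sys/types.h>

typedef struct ProposedSlotPersistentData
{
	NameData	name;			/* slot name, as in the real struct */
	/* ... the other existing ReplicationSlotPersistentData fields ... */
	pid_t		last_active_pid;	/* proposed: PID of the last acquirer */
} ProposedSlotPersistentData;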

Thoughts?

[1] - 
https://www.postgresql.org/message-id/CALj2ACW0rV5gWK8A3m6_X62qH%2BVfaq5hznC%3Di0R5Wojt5%2Byhyw%40mail.gmail.com

Regards,
Bharath Rupireddy.




Re: should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-01 Thread Bharath Rupireddy
On Wed, Dec 1, 2021 at 6:45 PM Tom Lane  wrote:
>
> Justin Pryzby  writes:
> > +1 to document it, but it seems like the worse problem is allowing the 
> > admin to
> > write a configuration which causes the server to fail to start, without 
> > having
> > issued a warning.
>
> > I think you could fix that with a GUC check hook to emit a warning.
> > I'm not sure what objections people might have to this.  Maybe it's 
> > confusing
> > to execute preliminary verification of the library by calling stat() but 
> > not do
> > stronger verification for other reasons the library might fail to load.  
> > Like
> > it doesn't have the right magic number, or it's built for the wrong server
> > version.  Should factor out the logic from internal_load_library and check
> > those too ?
>
> Considering the vanishingly small number of actual complaints we've
> seen about this, that sounds ridiculously over-engineered.
> A documentation example should be sufficient.

Thanks. Here's the v1 patch adding examples in the documentation.

Regards,
Bharath Rupireddy.


v1-0001-document-examples-of-setting-up-multiple-values-f.patch
Description: Binary data


Re: Non-superuser subscription owners

2021-12-01 Thread Amit Kapila
On Wed, Dec 1, 2021 at 2:12 AM Jeff Davis  wrote:
>
> On Tue, 2021-11-30 at 17:25 +0530, Amit Kapila wrote:
> > I think it would be better to do it before we allow subscription
> > owners to be non-superusers.
>
> There are a couple other things to consider before allowing non-
> superusers to create subscriptions anyway. For instance, a non-
> superuser shouldn't be able to use a connection string that reads the
> certificate file from the server unless they also have
> pg_read_server_files privs.
>

Isn't allowing non-superusers to create subscriptions and allowing
the owner to be changed two different things? I am under the impression
that the latter is more about allowing the workers to apply
changes with a non-superuser role.

-- 
With Regards,
Amit Kapila.




Re: Update stale code comment in CheckpointerMain()

2021-12-01 Thread Daniel Gustafsson
> On 1 Dec 2021, at 07:19, Amul Sul  wrote:
> 
> On Tue, Nov 30, 2021 at 3:09 PM Daniel Gustafsson  wrote:
>> 
>>> On 30 Nov 2021, at 08:00, Amul Sul  wrote:
>> 
>>> The attached patch updates the code comment which is no longer true
>>> after commit # 4a92a1c3d1c361ffb031ed05bf65b801241d7cdd
>> 
>> Agreed, but looking at this shouldn't we also tweak the comment on
>> RecoveryInProgress() as per the attached v2 diff?
> 
> Yes, we should -- diff looks good to me, thanks.

Thanks for confirming, I've applied this to master.

--
Daniel Gustafsson   https://vmware.com/





Re: Optionally automatically disable logical replication subscriptions on error

2021-12-01 Thread Amit Kapila
On Wed, Dec 1, 2021 at 5:55 PM osumi.takami...@fujitsu.com
 wrote:
>
> On Wednesday, December 1, 2021 3:02 PM vignesh C  wrote:
> > On Tue, Nov 30, 2021 at 5:34 PM osumi.takami...@fujitsu.com
> >  wrote:
>
> > 3) Can add a line in the commit message saying "Bump catalog version."
> > as the patch involves changing the catalog.
> Hmm, let me postpone this fix till the final version.
> The catalog version gets easily updated by other patch commits,
> and including it in the middle of development can cause conflicts
> when my patch is applied to PG, which could make other reviewers
> stop reviewing.
>

Vignesh seems to be suggesting just changing the commit message, not
the actual code. This is sort of a reminder to the committer to change
the catversion before pushing the patch. So that shouldn't cause any
conflicts while applying your patch.

-- 
With Regards,
Amit Kapila.




Re: should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-01 Thread Tom Lane
Justin Pryzby  writes:
> +1 to document it, but it seems like the worse problem is allowing the admin to
> write a configuration which causes the server to fail to start, without having
> issued a warning.

> I think you could fix that with a GUC check hook to emit a warning.
> I'm not sure what objections people might have to this.  Maybe it's confusing
> to execute preliminary verification of the library by calling stat() but not do
> stronger verification for other reasons the library might fail to load.  Like
> it doesn't have the right magic number, or it's built for the wrong server
> version.  Should factor out the logic from internal_load_library and check
> those too ?

Considering the vanishingly small number of actual complaints we've
seen about this, that sounds ridiculously over-engineered.
A documentation example should be sufficient.

regards, tom lane




Re: should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-01 Thread Justin Pryzby
On Wed, Dec 01, 2021 at 04:20:52PM +0530, Bharath Rupireddy wrote:
> It seems like users can try different ways to set multiple values for
> shared_preload_libraries GUC even after reading the documentation
> [1], something like:
...
> ALTER SYSTEM SET shared_preload_libraries = 
> "auth_delay,pg_stat_statements,sepgsql"; --> wrong
> 
> The problem with the wrong parameter set command is that the ALTER
> SYSTEM SET will not fail, but the server will not come up in case it
> is restarted. In various locations in the documentation, we have shown
> how a single value can be set, something like:
> shared_preload_libraries = 'auth_delay'
> shared_preload_libraries = 'pg_stat_statements'
> shared_preload_libraries = 'sepgsql'
> 
> Isn't it better we document (in [1]) an example to set multiple values
> to shared_preload_libraries?

+1 to document it, but it seems like the worse problem is allowing the admin to
write a configuration which causes the server to fail to start, without having
issued a warning.

I think you could fix that with a GUC check hook to emit a warning.
I'm not sure what objections people might have to this.  Maybe it's confusing
to execute preliminary verification of the library by calling stat() but not do
stronger verification for other reasons the library might fail to load.  Like
it doesn't have the right magic number, or it's built for the wrong server
version.  Should factor out the logic from internal_load_library and check
those too ?
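
For what it's worth, such a hook might look roughly like this -- a
minimal sketch only (the hook name is made up, and a real version would
reuse the suffix/path expansion that internal_load_library relies on
instead of a bare stat()):

#include "postgres.h"
#include <sys/stat.h>
#include "nodes/pg_list.h"
#include "utils/guc.h"
#include "utils/varlena.h"

static bool
check_preload_libraries(char **newval, void **extra, GucSource source)
{
	char	   *rawstring = pstrdup(*newval);
	List	   *elemlist;
	ListCell   *lc;

	/* same list syntax the server uses when it loads the libraries */
	if (!SplitDirectoriesString(rawstring, ',', &elemlist))
	{
		GUC_check_errdetail("List syntax is invalid.");
		pfree(rawstring);
		list_free_deep(elemlist);
		return false;
	}

	foreach(lc, elemlist)
	{
		char	   *libname = (char *) lfirst(lc);
		struct stat st;

		/* crude existence probe; no suffix or path expansion here */
		if (stat(libname, &st) != 0)
			ereport(WARNING,
					(errmsg("library \"%s\" was not found; the server may fail to start",
							libname)));
	}

	list_free_deep(elemlist);
	pfree(rawstring);
	return true;
}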

-- 
Justin




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-12-01 Thread Dilip Kumar
On Wed, Dec 1, 2021 at 12:07 PM Greg Nancarrow  wrote:
>
> On Thu, Nov 25, 2021 at 10:17 PM Dilip Kumar  wrote:
> >
> > Thanks for the review and many valuable comments, I have fixed all of
> > them except this comment (/* If we got a cancel signal during the copy
> > of the data, quit */) because this looks fine to me.  0007, I have
> > dropped from the patchset for now.  I have also included fixes for
> > comments given by John.
> >
>
> I found the following issue with the patches applied:
>
> A server crash occurs after the following sequence of commands:
>
> create tablespace tbsp1 location '/tbsp1';
> create tablespace tbsp2 location '/tbsp2';
> create database test1 tablespace tbsp1;
> create database test2 template test1 tablespace tbsp2;
> alter database test2 set tablespace tbsp1;
> checkpoint;
>
> The following type of message is seen in the server log:
>
> 2021-12-01 16:48:26.623 AEDT [67423] PANIC:  could not fsync file
> "pg_tblspc/16385/PG_15_202111301/16387/3394": No such file or
> directory

Thanks a lot for testing this. From the error, it seems like some of
the old buffers w.r.t. the previous tablespace are not dropped after the
movedb.  Actually, we are calling DropDatabaseBuffers() after copying
to a new tablespace and dropping all the buffers of this database
w.r.t. the old tablespace.  But it seems something is missing; I will
reproduce this and try to fix it by tomorrow.  I will also fix the
other review comments raised by you in the previous mail.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




RE: Optionally automatically disable logical replication subscriptions on error

2021-12-01 Thread osumi.takami...@fujitsu.com
On Wednesday, December 1, 2021 3:02 PM vignesh C  wrote:
> On Tue, Nov 30, 2021 at 5:34 PM osumi.takami...@fujitsu.com
>  wrote:
> >
> > On Tuesday, November 30, 2021 1:10 PM Greg Nancarrow
>  wrote:
> > > On Sat, Nov 27, 2021 at 1:36 AM osumi.takami...@fujitsu.com
> > >  wrote:
> > > >
> > > > This v7 uses v26 of skip xid patch [1]
> > > This patch no longer applies on the latest source.
> > > Also, the patch is missing an update to doc/src/sgml/catalogs.sgml,
> > > for the new "subdisableonerr" column of pg_subscription.
> > Thanks for your review !
> >
> > Fixed the documentation accordingly. Further, this comment invoked
> > some more refactoring of codes since I wrote some internal codes
> > related to 'disable_on_error' in an inconsistent order.
> > I fixed this by keeping patch's codes
> > after that of 'two_phase' subscription option as much as possible.
> >
> > I also conducted both pgindent and pgperltidy.
> >
> > Now, I'll share the v8 that uses PG
> > whose commit id is after 8d74fc9 (pg_stat_subscription_workers).
> 
> Thanks for the updated patch, few small comments:
I appreciate your check.

> 1) This should be changed:
> +   subdisableonerr bool
> +  
> +  
> +   If true, the subscription will be disabled when subscription
> +   worker detects an error
> +  
> + 
> 
> to:
> +   subdisableonerr bool
> +  
> +  
> +   If true, the subscription will be disabled when subscription
> +   worker detects non-transient errors
> +  
> + 
Fixed. Actually, there's no clear definition of what "non-transient" means
in the documentation. So, I added some words to your suggestion,
which should give users a clearer understanding.

> 2) "Disable On Err" can be changed to "Disable On Error"
> + ",
> subtwophasestate AS \"%s\"\n"
> + ",
> subdisableonerr AS \"%s\"\n",
> +
> gettext_noop("Two phase commit"),
> +
> gettext_noop("Disable On Err"));
Fixed.

> 3) Can add a line in the commit message saying "Bump catalog version."
> as the patch involves changing the catalog.
Hmm, let me postpone this fix till the final version.
The catalog version gets easily updated by other patch commits,
and including it in the middle of development can cause conflicts
when my patch is applied to PG, which could make other reviewers
stop reviewing.

> 4) This prototype is not required, since the function is called after the 
> function
> definition:
>  static inline void set_apply_error_context_xact(TransactionId xid,
> TimestampTz ts);  static inline void reset_apply_error_context_info(void);
> +static bool IsTransientError(ErrorData *edata);
Fixed.

> 5) we could use the new style here:
> +   ereport(LOG,
> +   (errmsg("logical replication subscription \"%s\" will be disabled due to error: %s",
> +   MySubscription->name, edata->message)));
>
> change it to:
> +   ereport(LOG,
> +   errmsg("logical replication subscription \"%s\" will be disabled due to error: %s",
> +   MySubscription->name, edata->message));
> 
> Similarly it can be changed in the other ereports added.
Removed the unnecessary parentheses.

Best Regards,
Takamichi Osumi



v9-0001-Optionally-disable-subscriptions-on-error.patch
Description: v9-0001-Optionally-disable-subscriptions-on-error.patch


Re: Commitfest 2021-11 Patch Triage - Part 1

2021-12-01 Thread Marcos Pegoraro
>
> I think the reason why we can't update a materialized view directly is
> because it is basically a "view" and it should not contain any data
> irrelevant to its definition and underlying tables. If we would have a
> feature to update a materialized view directly, maybe it should behave
> as an updatable view as well as normal (virtual) views, although I am
> not sure.
>

Well, I didn't find any place that details why those tables are not
updatable.
And it would be fine to update them through triggers or cron jobs until IVM is
available.
CheckValidRowMarkRel just raises an exception "cannot lock rows in
materialized view ...", but why?
What are the differences between materialized views and tables?


Re: GUC flags

2021-12-01 Thread Michael Paquier
On Wed, Dec 01, 2021 at 01:59:05AM -0600, Justin Pryzby wrote:
> On Tue, Nov 30, 2021 at 03:36:45PM +0900, Michael Paquier wrote:
>> -gettext_noop("Waits N seconds on connection startup 
>> before authentication."),
>> +gettext_noop("Sets the amount of seconds to wait on 
>> connection "
>> + "startup before 
>> authentication."),
> 
> same

Thanks.  This makes things more consistent.

>>  {
>>  {"checkpoint_warning", PGC_SIGHUP, WAL_CHECKPOINTS,
>> -gettext_noop("Enables warnings if checkpoint segments 
>> are filled more "
>> - "frequently than this."),
>> +gettext_noop("Sets the maximum time before warning if 
>> checkpoints "
>> + "triggered by WAL volume 
>> happen too frequently."),
>>  gettext_noop("Write a message to the server log if 
>> checkpoints "
>> - "caused by the filling of 
>> checkpoint segment files happens more "
>> + "caused by the filling of WAL 
>> segment files happens more "
> 
> It should say "happen" , since it's referring to "checkpoints".
> That was a pre-existing issue.

Indeed.

>>  {"log_parameter_max_length", PGC_SUSET, LOGGING_WHAT,
>> -gettext_noop("When logging statements, limit logged 
>> parameter values to first N bytes."),
>> +gettext_noop("Sets the maximum amount of data logged 
>> for bind "
>> + "parameter values when logging 
>> statements."),
> 
> I think this one should actually say "in bytes" or at least say "maximum
> length".  It seems unlikely that someone is going to specify this in other
> units, and it's confusing to everyone else to refer to "amount of data"
> instead of "length in bytes".

Okay.  Do you like the updated version attached?

>> -gettext_noop("Automatic log file rotation will occur 
>> after N minutes."),
>> +gettext_noop("Sets the maximum amount of time to wait 
>> before "
>> + "forcing log file rotation."),
> 
> Should it say "maximum" ?  Does that mean anything ?

To be consistent with the rest of your suggestions, we could use here:
"Sets the amount of time to wait before forcing log file rotation"

Thanks,
--
Michael
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 25ac4ca85b..003d649ff5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2131,8 +2131,8 @@ static struct config_int ConfigureNamesInt[] =
 {
 	{
 		{"archive_timeout", PGC_SIGHUP, WAL_ARCHIVING,
-			gettext_noop("Forces a switch to the next WAL file if a "
-		 "new file has not been started within N seconds."),
+			gettext_noop("Sets the amount of time to wait before forcing a "
+		 "switch to the next WAL file."),
 			NULL,
 			GUC_UNIT_S
 		},
@@ -2142,7 +2142,8 @@ static struct config_int ConfigureNamesInt[] =
 	},
 	{
 		{"post_auth_delay", PGC_BACKEND, DEVELOPER_OPTIONS,
-			gettext_noop("Waits N seconds on connection startup after authentication."),
+			gettext_noop("Sets the amount of time to wait on connection "
+		 "startup after authentication."),
 			gettext_noop("This allows attaching a debugger to the process."),
 			GUC_NOT_IN_SAMPLE | GUC_UNIT_S
 		},
@@ -2762,7 +2763,8 @@ static struct config_int ConfigureNamesInt[] =
 	{
 		/* Not for general use */
 		{"pre_auth_delay", PGC_SIGHUP, DEVELOPER_OPTIONS,
-			gettext_noop("Waits N seconds on connection startup before authentication."),
+			gettext_noop("Sets the amount of time to wait on connection "
+		 "startup before authentication."),
 			gettext_noop("This allows attaching a debugger to the process."),
 			GUC_NOT_IN_SAMPLE | GUC_UNIT_S
 		},
@@ -2819,10 +2821,10 @@ static struct config_int ConfigureNamesInt[] =
 
 	{
 		{"checkpoint_warning", PGC_SIGHUP, WAL_CHECKPOINTS,
-			gettext_noop("Enables warnings if checkpoint segments are filled more "
-		 "frequently than this."),
+			gettext_noop("Sets the maximum time before warning if checkpoints "
+		 "triggered by WAL volume happen too frequently."),
 			gettext_noop("Write a message to the server log if checkpoints "
-		 "caused by the filling of checkpoint segment files happens more "
+		 "caused by the filling of WAL segment files happen more "
 		 "frequently than this number of seconds. Zero turns off the warning."),
 			GUC_UNIT_S
 		},
@@ -3006,7 +3008,8 @@ static struct config_int ConfigureNamesInt[] =
 
 	{
 		{"log_parameter_max_length", PGC_SUSET, LOGGING_WHAT,
-			gettext_noop("When logging statements, limit logged parameter values to first N bytes."),
+			gettext_noop("Sets the maximum length in bytes of data logged for bind "
+		 

Re: Data is copied twice when specifying both child and parent table in publication

2021-12-01 Thread Amit Kapila
On Mon, Nov 29, 2021 at 2:21 PM houzj.f...@fujitsu.com
 wrote:
>
> On Thursday, November 11, 2021 2:53 PM houzj.f...@fujitsu.com wrote:
> > Attach the fix patch.
> > 0001 fix data double publish(first issue in this thread)
>
> In another thread[1], Amit L suggested that it'd be nice to add a testcase in
> src/test/subscription/. So, attach a new version patch which add a testcase in
> t/013_partition.pl.
>

Thanks, your patch looks good to me. I have slightly changed the
comments and commit message in the attached.

I think we should back-patch this, but I am slightly worried that if
someone depends on the view pg_publication_tables to return both
parent and child tables for publications that have both of those
tables published with publish_via_partition_root as true, then this
might break their usage. But OTOH, I don't see why someone would do
that, and they might face problems like the one we are trying to solve
here.

Thoughts?

-- 
With Regards,
Amit Kapila.


v7-0001-Fix-double-publish-of-child-table-s-data.patch
Description: Binary data


Re: SQL/JSON: functions

2021-12-01 Thread Himanshu Upadhyaya
Hi Andrew,

The latest version (v59) is not applying on head.
Could you please help to rebase?

Thanks,
Himanshu

On Thu, Sep 16, 2021 at 8:23 PM Andrew Dunstan  wrote:

>
> On 9/14/21 8:55 AM, Andrew Dunstan wrote:
> > On 9/2/21 2:50 PM, Andrew Dunstan wrote:
> >> On 5/18/21 3:22 PM, Andrew Dunstan wrote:
> >>> On 5/8/21 2:21 PM, Andrew Dunstan wrote:
>  On 4/28/21 5:55 PM, Andrew Dunstan wrote:
> > On Fri, Mar 26, 2021 at 9:14 PM Nikita Glukhov
> > <n.glu...@postgrespro.ru> wrote:
> >
> > Attached 54th version of the patches rebased onto current master.
> >
> > On 27.03.2021 01:30, Andrew Dunstan wrote:
> >> Specifically, patch 4 (SQL-JSON-query-functions) fails with
> this when
> >> built with LLVM:
> >>
> >>
> >> There is also a bug that results in a warning in gram.y, but fixing it
> >> doesn't affect this issue. Nikita, please look into this ASAP.
> > LLVM issues and gram.y are fixed.
> >
> >
> >
> >
> > It's apparently bitrotted again.
> >
> >
>  This set should remove the bitrot.
> 
> 
> >>> Rebased for removal of serial schedule
> >>>
> >>>
> >> rebased on master and incorporating fixes from Erik Rijkers
> >>
> >>
> > rebased to remove bitrot from the removal of the  Value node type.
> >
> >
>
> Rebased, and fixed a bug (which I had faithfully replicated above) in
> the APP_JUMB code, as reported by Erik Rijkers.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>


Re: Replication protocol doc fix

2021-12-01 Thread Daniel Gustafsson
> On 3 Nov 2021, at 12:14, Daniel Gustafsson  wrote:
> 
> The commitfest CI times out on all platforms and never finishes when running
> make check with this patch, so unless the patch is dropped due to concerns
> already raised then that seems like a good thing to fix.

As the thread has stalled, I'm marking this Returned with Feedback.  Please
feel free to resubmit when/if a new patch is available.

--
Daniel Gustafsson   https://vmware.com/





Re: More time spending with "delete pending"

2021-12-01 Thread Daniel Gustafsson
> On 13 Jul 2021, at 20:00, Alexander Lakhin  wrote:
> 
> Hello Michael,
> 12.07.2021 08:52, Michael Paquier wrote:
>> On Mon, Jul 12, 2021 at 02:09:41PM +0900, Michael Paquier wrote:
>> 
>>> And fairywren, that uses MinGW, is unhappy:
>>> 
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren=2021-07-12%2004%3A47%3A13
>>> 
>>> Looking at it now.
>>> 
>> I am not sure what to do here to cool down MinGW, so the patch has
>> been reverted for now:
>> - Plain stat() is not enough to do a proper detection of the files
>> pending for deletion on MSVC, so reverting only the part with
>> microsoft_native_stat() is not going to help.
>> - We are going to need an evaluation of the stat() implementation in
>> MinGW and check if is it possible to rely on it more.  If yes, we
>> could get away with more tweaks based on __MINGW32__/__MINGW64__.
>> 
> I've managed to adapt open/win32stat for MinGW by preventing stat overriding 
> at the implementation level. Thus, the code that calls stat() will use the 
> overridden struct/function(s), but inside of open/win32_stat we could access 
> original stat and reference to __pgstat64 where we want to use our.
> Tested on MSVC, MinGW64 and MinGW32 — the fixed code compiles everywhere.
> 
> But when using perl from msys (not ActivePerl) while testing I've got the 
> same test failure due to another error condition:
> 
> In this case CreateFile() fails with ERROR_SHARING_VIOLATION (not 
> ERROR_ACCESS_DENIED) and it leads to the same "Permission denied" error. This 
> condition is handled in the pgwin32_open() but not in _pgstat64() yet.
> 
> I think I should develop another test for MSVC environment that will 
> demonstrate a such failure too.
> But as it's not related to the DELETE_PENDING state, I wonder whether this 
> should be fixed in a separate patch.

Michael, have you had a chance to look at the updated patch here?  I'm moving
this patch entry from Waiting on Author to Needs Review.

--
Daniel Gustafsson   https://vmware.com/





should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-01 Thread Bharath Rupireddy
Hi,

It seems like users can try different ways to set multiple values for
shared_preload_libraries GUC even after reading the documentation
[1], something like:
ALTER SYSTEM SET shared_preload_libraries TO
auth_delay,pg_stat_statements,sepgsql; --> correct
ALTER SYSTEM SET shared_preload_libraries =
'auth_delay','pg_stat_statements','sepgsql'; --> correct
ALTER SYSTEM SET shared_preload_libraries TO
'auth_delay','pg_stat_statements','sepgsql'; --> correct
ALTER SYSTEM SET shared_preload_libraries =
auth_delay,pg_stat_statements,sepgsql;  --> wrong
ALTER SYSTEM SET shared_preload_libraries =
'auth_delay,pg_stat_statements,sepgsql';  --> wrong
ALTER SYSTEM SET shared_preload_libraries =
"auth_delay,pg_stat_statements,sepgsql"; --> wrong

The problem with the wrong parameter set command is that the ALTER
SYSTEM SET will not fail, but the server will not come up in case it
is restarted. In various locations in the documentation, we have shown
how a single value can be set, something like:
shared_preload_libraries = 'auth_delay'
shared_preload_libraries = 'pg_stat_statements'
shared_preload_libraries = 'sepgsql'

Isn't it better we document (in [1]) an example to set multiple values
to shared_preload_libraries? If okay, we can provide examples to other
GUCs local_preload_libraries and session_preload_libraries, but I'm
not in favour of it.

Thoughts?

[1] - 
https://www.postgresql.org/docs/devel/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES

Regards,
Bharath Rupireddy.




Re: Add option --drop-cascade for pg_dump/restore

2021-12-01 Thread Daniel Gustafsson
> On 3 Nov 2021, at 20:03, Tom Lane  wrote:
> 
> Wu Haotian  writes:
>> here's the rebased patch.
> 
> Looks like it needs rebasing again, probably as a result of our recent
> renaming of our Perl test modules.

As this patch hasn't been updated, I'm marking this entry Returned with
Feedback.  Please feel free to open a new entry when a rebased patch is
available.

> FWIW, I'd strongly recommend that it's time to pull all that SQL code
> hacking out of RestoreArchive and put it in its own function.

+1

--
Daniel Gustafsson   https://vmware.com/




