Re: Remove useless arguments in ReadCheckpointRecord().

2022-07-20 Thread Kyotaro Horiguchi
I agree with removing the two parameters, and with making
ReadCheckpointRecord unaware of the location source.

At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao  
wrote in 
> Agreed. Attached is the updated version of the patch.
> Thanks for the review!

-   (errmsg("could not locate required checkpoint record"),
+   (errmsg("could not locate a valid checkpoint record in backup_label 
file"),

"in backup_label" there looks *to me* need some verb..  By the way,
this looks like a good chance to remove the (now) extra parens around
errmsg() and friends.

For example:

-   (errmsg("could not locate a valid checkpoint record in backup_label 
file"),
+   errmsg("could not locate checkpoint record spcified in backup_label 
file"),

-   (errmsg("could not locate a valid checkpoint record in control file")));
+   errmsg("could not locate checkpoint record recorded in control file")));


+   (errmsg("invalid checkpoint record")));

Is it useful to show the specified LSN there?

+   (errmsg("invalid resource manager ID in 
checkpoint record")));

We have a similar message "invalid resource manager ID %u at
%X/%X". Since the caller explains that it is a checkpoint record, we
can share the message here.

+   (errmsg("invalid xl_info in checkpoint 
record")));

(It is not an issue with this patch, though.) I don't think this is
appropriate as a user-facing message. Couldn't we say "unexpected
record type: %x" or something like that?

+   (errmsg("invalid length of checkpoint 
record")));

We have "record with invalid length at %X/%X" or "invalid record
length at %X/%X: wanted %u, got %u". Counld we reuse any of them?

> > May be unrelated, IIRC, for the errors like ereport(PANIC,
> > (errmsg("could not locate a valid checkpoint record"))); we wanted to
> > add a hint asking users to consider running pg_resetwal to fix the
> > issue. The error for ReadCheckpointRecord() in backup_label file
> > cases, already gives a hint errhint("If you are restoring from a
> > backup, touch \"%s/recovery.signal\" .. Adding the hint of running
> > pg_resetwal (of course with a caution that it can cause inconsistency
> > in the data and use it as a last resort as described in the docs)
> > helps users and support engineers a lot to mitigate the server down
> > cases quickly.
> 
> +1 to add some hint messages. But I'm not sure if it's good to hint at
> the use of pg_resetwal because, as you're saying, it should be used as
> a last resort and has some big risks like data loss, corruption,
> etc. That is, I'm concerned that some users might quickly run
> pg_resetwal because the hint message says so, without reading the docs
> or understanding those risks.

I don't think we want to recommend pg_resetwal to those who wouldn't
reach it on their own by other means.  I know of a few instances of
people who had the tool (unrecoverably) break their own clusters.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: i.e. and e.g.

2022-07-20 Thread John Naylor
On Thu, Jul 21, 2022 at 11:22 AM Kyotaro Horiguchi 
wrote:
>
> At Thu, 21 Jul 2022 10:20:30 +0700, John Naylor <
john.nay...@enterprisedb.com> wrote in
> > On Wed, Jul 20, 2022 at 3:40 PM Kyotaro Horiguchi <
horikyota@gmail.com>
> > wrote:
> > > By the way, I forgot about back-branches. Don't we need to fix the
> > > same in back-branches?
> >
> > The intent of the messages is pretty clear to me, so I don't really see
a
> > need to change back branches. It does make sense for v15, though, and I
> > just forgot to consider that.
>
> Ok, I'm fine with that. Thanks!

This is done.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Add LZ4 compression in pg_dump

2022-07-20 Thread Michael Paquier
On Tue, Jul 05, 2022 at 01:22:47PM +, gkokola...@pm.me wrote:
> I have updated for "some" of the comments. This is not an unwillingness to
> incorporate those specific comments. Simply this patchset had started to 
> divert
> heavily already based on comments from Mr. Paquier who had already requested 
> for
> the APIs to be refactored to use function pointers. This is happening in 0002 
> of
> the patchset. 0001 of the patchset is using the new compression.h under 
> common.
> 
> This patchset should be considered a late draft, as commentary, documentation,
> and some finer details are not yet finalized; because I am expecting the 
> proposed
> refactor to receive a wealth of comments. It would be helpful to understand if
> the proposed direction is something worth to be worked upon, before moving to 
> the
> finer details.

I have read through the patch set, and I very much like the separation
you are doing here with CompressFileHandle, where a compression method
has to specify a full set of callbacks depending on the actions that
need to be taken.  One advantage, as your patch shows, is that you
reduce each code path's dependency on the compression method, with
#ifdefs and such located mostly in their own file structure, so adding
a new compression method becomes much easier.  These callbacks are
going to require much more documentation to describe what anybody using
them should expect from them, and perhaps they could be renamed in a
more generic way as the current names come from POSIX (say read_char(),
read_string()?), even if this patch has just inherited the names coming
from pg_dump itself, but this can be tuned over and over.
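
For readers following along, here is a minimal sketch of the kind of
callback table being discussed.  The member names below are illustrative
assumptions, not the actual CompressFileHandle definition from the patch:

#include <stddef.h>

/* Illustrative sketch only: a simplified callback table in the spirit of
 * CompressFileHandle.  The real structure and member names live in the
 * patch under discussion, not here. */
typedef struct CompressFileHandle CompressFileHandle;

struct CompressFileHandle
{
	/* open/close the underlying file with the chosen compression method */
	int			(*open_func) (CompressFileHandle *CFH, const char *path,
							  const char *mode);
	int			(*close_func) (CompressFileHandle *CFH);

	/* read/write helpers; names such as read_char()/read_string() were
	 * floated above as more generic alternatives */
	int			(*read_func) (CompressFileHandle *CFH, void *ptr, size_t size);
	int			(*write_func) (CompressFileHandle *CFH, const void *ptr,
							   size_t size);

	void	   *private_data;	/* per-method state (zlib, LZ4, ...) */
};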

The split into three parts, with 0001 to plug the new compression
option set into pg_dump, 0002 to introduce the callbacks and 0003 to
add LZ4 building on the first two parts, makes sense to me.  0001 and
0002 could be done in the reverse order as they are mostly independent,
but this order is fine as-is.

In short, I am fine with the proposed approach.

+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add compressionMethod
+* in header */
Indeed, the dump format needs a version bump for this information.

+static bool
+parse_compression_option(const char *opt,
+pg_compress_specification *compress_spec)
This parsing logic in pg_dump.c looks a lot like what pg_receivewal.c
does with its parse_compress_options() where, for compatibility:
- If only a number is given:
-- Assume no compression if level is 0.
-- Assume gzip with given compression if level > 0.
- If a string is found, assume a full spec, with optionally a level.
So some consolidation could be done between both.
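
As a rough illustration of that compatibility rule, here is a simplified
sketch; it is not the actual parse_compress_options() code, and the type
and helper below are hypothetical:

#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified result type for illustration only */
typedef struct SimpleCompressSpec
{
	char		method[16];		/* "none", "gzip", "lz4", ... */
	int			level;			/* compression level, -1 if unspecified */
} SimpleCompressSpec;

/*
 * Sketch of the backward-compatibility rule described above:
 * - a bare number means "no compression" for 0, gzip at that level otherwise;
 * - anything else is treated as a full specification, optionally followed
 *   by a ":level" suffix.
 */
static void
sketch_parse_compress_option(const char *opt, SimpleCompressSpec *spec)
{
	char	   *endp;
	long		level = strtol(opt, &endp, 10);

	if (*opt != '\0' && *endp == '\0')
	{
		/* only a number was given */
		strcpy(spec->method, level == 0 ? "none" : "gzip");
		spec->level = (int) level;
		return;
	}

	/* otherwise assume a full spec, e.g. "lz4" or "gzip:9" */
	{
		const char *colon = strchr(opt, ':');
		size_t		len = colon ? (size_t) (colon - opt) : strlen(opt);

		if (len >= sizeof(spec->method))
			len = sizeof(spec->method) - 1;
		memcpy(spec->method, opt, len);
		spec->method[len] = '\0';
		spec->level = colon ? atoi(colon + 1) : -1;
	}
}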

By the way, I can see that GZCLOSE(), etc. are still defined in
compress_io.h but they are not used.
--
Michael


signature.asc
Description: PGP signature


Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-20 Thread Michael Paquier
On Thu, Jul 21, 2022 at 09:54:05AM +0530, Amit Kapila wrote:
> Yeah, it is not very clear to me either. I think this won't be
> difficult to change one way or another depending on future needs. At
> this stage, it appeared to me that a bitmask is a better way to
> represent this information, but if you and others feel using an enum is
> a better idea then I am fine with that as well.

Please don't get me wrong :)

I favor a bitmask over an enum here, as you do, with a clean
layer for those flags.
--
Michael


signature.asc
Description: PGP signature


Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-20 Thread Amit Kapila
On Thu, Jul 21, 2022 at 7:10 AM Michael Paquier  wrote:
>
> On Wed, Jul 20, 2022 at 02:32:44PM +0530, Amit Kapila wrote:
> > On Thu, Jul 14, 2022 at 4:34 AM Peter Smith  wrote:
> >> By using a bitmask I think there is an implication that the flags can
> >> be combined...
> >>
> >> Perhaps it is not a problem today, but later you may want more flags. e.g.
> >> #define SUBSCRIPTION_REL_STATE_READY 0x02 /* READY relations */
> >>
> >> then the bitmask idea falls apart because IIUC you have no intentions
> >> to permit things like:
> >> (SUBSCRIPTION_REL_STATE_NOT_READY | SUBSCRIPTION_REL_STATE_READY)
> >
> > I think this will be an invalid combination if caller ever used it.
> > However, one might need to use a combination like
> > (SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE). For such
> > cases, I feel the bitmask idea will be better.
>
> It feels unnatural to me to have a flag saying "not-ready" and one
> saying "ready", while we could have a flag saying "ready" that can be
> combined with a second flag to decide if the contents of srsubstate
> should be matched or *not* matched with the states expected by the
> caller.  This could be extended to more state values, for example.
>
> I am not sure if we actually need this much as I have no idea if
> future features would use it, so please take my suggestion lightly :)
>

Yeah, it is not very clear to me either. I think this won't be
difficult to change one way or another depending on future needs. At
this stage, it appeared to me that a bitmask is a better way to
represent this information, but if you and others feel using an enum is
a better idea then I am fine with that as well.

-- 
With Regards,
Amit Kapila.




Re: making relfilenodes 56 bits

2022-07-20 Thread Thomas Munro
On Wed, Jul 20, 2022 at 11:27 PM Dilip Kumar  wrote:
> [v10 patch set]

Hi Dilip, I'm experimenting with these patches and will hopefully have
more to say soon, but I just wanted to point out that this builds with
warnings and failed on 3/4 of the CI OSes on cfbot's last run.  Maybe
there is the good kind of uninitialised data on Linux, and the bad
kind of uninitialised data on those other pesky systems?




Re: i.e. and e.g.

2022-07-20 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 10:20:30 +0700, John Naylor  
wrote in 
> On Wed, Jul 20, 2022 at 3:40 PM Kyotaro Horiguchi 
> wrote:
> > By the way, I forgot about back-branches. Don't we need to fix the
> > same in back-branches?
> 
> The intent of the messages is pretty clear to me, so I don't really see a
> need to change back branches. It does make sense for v15, though, and I
> just forgot to consider that.

Ok, I'm fine with that. Thanks!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

2022-07-20 Thread Kyotaro Horiguchi
At Wed, 20 Jul 2022 17:25:33 +0530, Bharath Rupireddy 
 wrote in 
> On Wed, Jul 20, 2022 at 12:55 PM Kyotaro Horiguchi
>  wrote:
> PSA v7 patch set.

Thanks. Looks perfect, but (sorry..) in a final check I found
"log archive" in the docs.  If you agree, please merge the
attached (or a refined version) and I'd call it a day.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index c0b89a3c01..e5344eb277 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2659,7 +2659,7 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
  
   
If set to true, the backup will wait until the last required WAL
-   segment has been archived, or emit a warning if log archiving is
+   segment has been archived, or emit a warning if WAL archiving is
not enabled. If false, the backup will neither wait nor warn,
leaving the client responsible for ensuring the required log is
available. The default is true.
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 56ac7b754b..e50f00afa8 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -318,7 +318,7 @@ PostgreSQL documentation
 backup. This will include all write-ahead logs generated during
 the backup. Unless the method none is specified,
 it is possible to start a postmaster in the target
-directory without the need to consult the log archive, thus
+directory without the need to consult the WAL archive, thus
 making the output a completely standalone backup.




Re: GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.

2022-07-20 Thread Amul Sul
Thanks Aleksander and Álvaro for your inputs.

I understand this change does not make any improvement to the current
code. I was a bit concerned about the design and consistency between
the function that exists for a server in recovery and the one for a
server that is not in recovery.  I was trying to write the following
snippet, where I am interested only in the XLogRecPtr:

recPtr = RecoveryInProgress() ? GetStandbyFlushRecPtr(NULL) :
GetFlushRecPtr(NULL);

But I can't write this since I have to pass an argument for
GetStandbyFlushRecPtr() but that is not compulsory for
GetFlushRecPtr().

I agree with rejecting the proposed change since it is not immediately
useful, and there is no rule that the argument must be optional in the
similar-looking function.

Regards,
Amul




Re: Custom tuplesorts for extensions

2022-07-20 Thread John Naylor
On Tue, Jul 12, 2022 at 3:23 PM Alexander Korotkov 
wrote:
> There are some places, which potentially could cause a slowdown.  I'm
> going to make some experiments with that.

I haven't looked at the patches, so I don't know of a specific place to
look for a slowdown, but I thought it might help to perform the same query
tests as my most recent test for evaluating qsort variants (some
description in [1]), and here is the spreadsheet. Overall, the differences
look like noise. A few cases with unabbreviatable text look a bit faster
with the patch. I'm not sure if that's a real difference, but in any case I
don't see a slowdown anywhere.

[1]
https://www.postgresql.org/message-id/CAFBsxsHeTACMP1JVQ%2Bm35-v2NkmEqsJMHLhEfWk4sTB5aw_jkQ%40mail.gmail.com
--
John Naylor
EDB: http://www.enterprisedb.com


master-vs-custom-ext-v2.ods
Description: application/vnd.oasis.opendocument.spreadsheet


Re: i.e. and e.g.

2022-07-20 Thread John Naylor
On Wed, Jul 20, 2022 at 3:40 PM Kyotaro Horiguchi 
wrote:
> By the way, I forgot about back-branches. Don't we need to fix the
> same in back-branches?

The intent of the messages is pretty clear to me, so I don't really see a
need to change back branches. It does make sense for v15, though, and I
just forgot to consider that.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Remove useless arguments in ReadCheckpointRecord().

2022-07-20 Thread Fujii Masao



On 2022/07/21 0:29, Bharath Rupireddy wrote:

How about we transform the following messages into something like below?

(errmsg("could not locate a valid checkpoint record"))); after
ReadCheckpointRecord() for control file cases to "could not locate
valid checkpoint record in control file"
(errmsg("could not locate required checkpoint record"), after
ReadCheckpointRecord() for backup_label case to "could not locate
valid checkpoint record in backup_label file"

The above messages give more meaningful and unique info to the users.


Agreed. Attached is the updated version of the patch.
Thanks for the review!



May be unrelated, IIRC, for the errors like ereport(PANIC,
(errmsg("could not locate a valid checkpoint record"))); we wanted to
add a hint asking users to consider running pg_resetwal to fix the
issue. The error for ReadCheckpointRecord() in backup_label file
cases, already gives a hint errhint("If you are restoring from a
backup, touch \"%s/recovery.signal\" .. Adding the hint of running
pg_resetwal (of course with a caution that it can cause inconsistency
in the data and use it as a last resort as described in the docs)
helps users and support engineers a lot to mitigate the server down
cases quickly.


+1 to add some hint messages. But I'm not sure if it's good to hint at the use 
of pg_resetwal because, as you're saying, it should be used as a last resort 
and has some big risks like data loss, corruption, etc. That is, I'm concerned 
that some users might quickly run pg_resetwal because the hint message says so, 
without reading the docs or understanding those risks.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATIONFrom a5e45acb753e9d93074a25a2fc409c693c46630c Mon Sep 17 00:00:00 2001
From: Fujii Masao 
Date: Wed, 20 Jul 2022 23:06:44 +0900
Subject: [PATCH] Remove useless arguments in ReadCheckpointRecord().

---
 src/backend/access/transam/xlogrecovery.c | 88 +--
 1 file changed, 17 insertions(+), 71 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c 
b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5e46..010f0cd7b2 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -422,8 +422,8 @@ static XLogPageReadResult 
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr,

  XLogRecPtr replayLSN,

  bool nonblocking);
 static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
-static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher, 
XLogRecPtr RecPtr,
-   
int whichChkpt, bool report, TimeLineID replayTLI);
+static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher,
+   
XLogRecPtr RecPtr, TimeLineID replayTLI);
 static bool rescanLatestTimeLine(TimeLineID replayTLI, XLogRecPtr replayLSN);
 static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 XLogSource source, bool 
notfoundOk);
@@ -605,7 +605,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
*wasShutdown_ptr,
 * When a backup_label file is present, we want to roll forward 
from
 * the checkpoint it identifies, rather than using pg_control.
 */
-   record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, 0, 
true,
+   record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
  
CheckPointTLI);
if (record != NULL)
{
@@ -638,7 +638,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
*wasShutdown_ptr,
else
{
ereport(FATAL,
-   (errmsg("could not locate required 
checkpoint record"),
+   (errmsg("could not locate a valid 
checkpoint record in backup_label file"),
 errhint("If you are restoring from a 
backup, touch \"%s/recovery.signal\" and add required recovery options.\n"
 "If you are not 
restoring from a backup, try removing the file \"%s/backup_label\".\n"
 "Be careful: removing 
\"%s/backup_label\" will result in a corrupt cluster if restoring from a 
backup.",
@@ -744,7 +744,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
*wasShutdown_ptr,
CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;

Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-20 Thread Gurjeet Singh
Moving the report from security to -hackers on Noah's advice. Since
the function(s) involved in the crash are not present in any of the
released versions, it is not considered a security issue.

I can confirm that this is reproducible on the latest commit on
master, 3c0bcdbc66. Below is the original analysis, followed by Noah's
analysis.

To be able to reproduce it, please note that perl support is required;
 hence `./configure --with-perl`.

The note about 'security concerns around on_plperl_init parameter',
below, refers to a now-fixed issue, commit 13d8388151.

Best regards,
Gurjeet
http://Gurje.et

Forwarded Conversation
Subject: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS


From: Gurjeet Singh 
Date: Mon, Jul 4, 2022 at 10:24 AM
To: Postgres Security 
Cc: Bossart, Nathan 


While poking at plperl's GUC in an internal discussion, I was able to
induce a crash (or an assertion failure in assert-enabled builds) as
an unprivileged user.

My investigation so far has revealed that the code path for the
following condition has never been tested, and because of this, when a
user tries to override an SUSET param via PGOPTIONS, Postgres tries to
perform a table lookup during process initialization. Because there's
no transaction in progress, and because this table is not in the
primed caches, we end up with code trying to dereference an
uninitialized  CurrentResourceOwner.

The condition:
- User specifies PGOPTIONS="-c custom.param"
- "custom.param" is used by an extension which is specified in
  session_preload_libraries
- The extension uses DefineCustom*Variable("custom.param", PGC_SUSET)
- Inside set_config_option():
    record->context == PGC_SUSET
    context == PGC_BACKEND
    calls pg_parameter_aclcheck() -> eventually leads to an
    assertion failure (or crash, when assertions are disabled)

See below for 1. How to reproduce, 2. Assertion failure stack, and 3.
Crash stack

When the user does not specify PGOPTIONS, the code in
define_custom_variable() returns prematurely, after a failed bsearch()
lookup, and hence avoids this bug.

I think a similar crash can be induced when the custom parameter is of
kind PGC_SU_BACKEND, because the code to handle that also invokes
pg_parameter_aclcheck(). Also, I believe the same condition would
arise if the extension is specified in local_preload_libraries.

I haven't been able to think of an attack vector using this bug, but
it can be used to cause a denial of service by an authenticated user.
I'm sending this report to the security list, instead of -hackers, out
of an abundance of caution; please feel free to move it to -hackers if
it's not considered a security concern.

I discovered this bug a couple of days ago, just before leaving on a
trip. But because of shortage of time over the weekend, I haven't been
able to dig deeper into it. Since I don't think I'll be able to spend
any significant time on it for at least another couple of days, I'm
sending this report without a patch or a proposed fix.

CC: Nathan, whose security concerns around the on_plperl_init parameter
led to this discovery.

[1]: How to reproduce

$ psql -c 'create user test'
CREATE ROLE

$ psql -c "alter system set session_preload_libraries='plperl'"
ALTER SYSTEM

$ # restart server

$ psql -c 'show session_preload_libraries'
 session_preload_libraries
---
 plperl
(1 row)

$ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test
psql: error: connection to server on socket "/tmp/.s.psql.5432"
failed: server closed the connection unexpectedly
┆   This probably means the server terminated abnormally
before or while processing the request.


[2]:  Assertion failure stack

LOG:  database system is ready to accept connections
TRAP: FailedAssertion("IsTransactionState()", File:
"../../../../../../POSTGRES/src/backend/utils/cache/catcache.c", Line:
1209, P
ID: 199868)
postgres: test postgres [local]
startup(ExceptionalCondition+0xd0)[0x55e503a4e6c9]
postgres: test postgres [local] startup(+0x7e069b)[0x55e503a2a69b]
postgres: test postgres [local] startup(SearchCatCache1+0x3a)[0x55e503a2a56b]
postgres: test postgres [local] startup(SearchSysCache1+0xc1)[0x55e503a46fe4]
postgres: test postgres [local]
startup(pg_parameter_aclmask+0x6f)[0x55e50345f098]
postgres: test postgres [local]
startup(pg_parameter_aclcheck+0x2d)[0x55e50346039c]
postgres: test postgres [local] startup(set_config_option+0x450)[0x55e503a70727]
postgres: test postgres [local] startup(+0x829ce8)[0x55e503a73ce8]
postgres: test postgres [local]
startup(DefineCustomStringVariable+0xa4)[0x55e503a74306]
/home/vagrant/dev/POSTGRES_builds/add_tz_param/db/lib/postgresql/plperl.so(_PG_init+0xd7)[0x7fed3d845425]
postgres: test postgres [local] startup(+0x80cc50)[0x55e503a56c50]
postgres: test postgres [local] startup(load_file+0x43)[0x55e503a566d9]
postgres: test postgres [local] startup(+0x81ba89)[0x55e503a65a89]
postgres: test postgres [local]

Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?

2022-07-20 Thread Richard Guo
On Thu, Jul 21, 2022 at 3:47 AM David Rowley  wrote:

> On Wed, 20 Jul 2022 at 21:19, Richard Guo  wrote:
> > So the idea is if the ECs used by the mergeclauses are prefix of the
> > query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why
> > not relax this further that if the ECs in the mergeclauses and the
> > query_pathkeys have common prefix, we use that prefix as pathkeys? So
> > that we can have a plan like below:
>
> I don't think that's a clear-cut win. There is scoring code in there
> to try to arrange the pathkey list in the order of
> most-useful-to-upper-level-joins firsts.  If we were to do as you
> describe we could end up generating worse plans when there is some
> subsequent Merge Join above this one that has join conditions that the
> query_pathkeys are not compatible with.


Yeah, you're right. Although we try different permutations of the
pathkeys in sort_inner_and_outer(), that does not cover every
possible ordering due to cost considerations. So we still need to respect
the heuristics behind the pathkey order returned by this function, which
is the scoring logic trying to list most-useful-to-upper-level-joins
keys earlier.


> Maybe your idea could be made to work in cases where
> bms_equal(joinrel->relids, root->all_baserels). In that case, we
> should not be processing any further joins and don't need to consider
> that as a factor for the scoring.


That should work, as long as this case is common enough to be worth
writing the code for.

Thanks
Richard


Re: Improve description of XLOG_RUNNING_XACTS

2022-07-20 Thread Fujii Masao




On 2022/07/21 10:13, Masahiko Sawada wrote:

Hi,

I realized that standby_desc_running_xacts() in standbydesc.c doesn't
describe subtransaction XIDs. I've attached the patch to improve the
description.


+1

The patch looks good to me.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Allowing REINDEX to have an optional name

2022-07-20 Thread Michael Paquier
On Wed, Jul 20, 2022 at 07:27:07AM -0500, Justin Pryzby wrote:
> Looks fine

Thanks for checking, applied.
--
Michael


signature.asc
Description: PGP signature


Re: Memory leak fix in psql

2022-07-20 Thread Michael Paquier
On Thu, Jul 21, 2022 at 09:10:43AM +0800, Japin Li wrote:
> Yeah, we should take care of the backpatch risk.  However, I think
> it makes sense to backpatch.

We are talking about 256 bytes being leaked in each loop when a
validation pattern or a query fails, so I don't see a strong
argument for manipulating 10~14 more than necessary for this amount of
memory.  The contents of describe.c are the same for v15 though, and
we are still in beta on REL_15_STABLE, so I have applied the patch
down to v15, adding what Alvaro has sent on top of the rest.
--
Michael


signature.asc
Description: PGP signature


Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-20 Thread Michael Paquier
On Wed, Jul 20, 2022 at 02:32:44PM +0530, Amit Kapila wrote:
> On Thu, Jul 14, 2022 at 4:34 AM Peter Smith  wrote:
>> By using a bitmask I think there is an implication that the flags can
>> be combined...
>>
>> Perhaps it is not a problem today, but later you may want more flags. e.g.
>> #define SUBSCRIPTION_REL_STATE_READY 0x02 /* READY relations */
>>
>> then the bitmask idea falls apart because IIUC you have no intentions
>> to permit things like:
>> (SUBSCRIPTION_REL_STATE_NOT_READY | SUBSCRIPTION_REL_STATE_READY)
> 
> I think this will be an invalid combination if caller ever used it.
> However, one might need to use a combination like
> (SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE). For such
> cases, I feel the bitmask idea will be better.

It feels unnatural to me to have a flag saying "not-ready" and one
saying "ready", while we could have a flag saying "ready" that can be
combined with a second flag to decide if the contents of srsubstate
should be matched or *not* matched with the states expected by the
caller.  This could be extended to more state values, for example.

I am not sure if we actually need this much as I have no idea if
future features would use it, so please take my suggestion lightly :)
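
To make the two alternatives concrete, here is a small sketch.  Only
SUBSCRIPTION_REL_STATE_NOT_READY and SUBSCRIPTION_REL_STATE_READY are
taken from the thread; every other name below is an illustrative
assumption, not code from the patch:

#include <stdbool.h>

/* Option A: one bit per state, combined by callers as needed. */
#define SUBSCRIPTION_REL_STATE_NOT_READY	0x01
#define SUBSCRIPTION_REL_STATE_READY		0x02
#define SUBSCRIPTION_REL_STATE_DONE			0x04	/* illustrative */

/*
 * Option B, as suggested above: a "ready" flag plus a modifier saying
 * whether the state should be matched or *not* matched.
 */
#define SUBREL_MATCH_READY		0x01	/* illustrative */
#define SUBREL_MATCH_NEGATE		0x80	/* illustrative */

static bool
state_selected(bool rel_is_ready, int flags)
{
	bool		match = (flags & SUBREL_MATCH_READY) != 0 && rel_is_ready;

	return (flags & SUBREL_MATCH_NEGATE) ? !match : match;
}

/*
 * Usage sketches:
 *   Option A: SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE
 *   Option B: SUBREL_MATCH_READY | SUBREL_MATCH_NEGATE   ("not ready")
 */
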
--
Michael


signature.asc
Description: PGP signature


Re: System catalog documentation chapter

2022-07-20 Thread Bruce Momjian
On Wed, Jul 20, 2022 at 04:32:46PM -0400, Bruce Momjian wrote:
> On Wed, Jul 20, 2022 at 04:23:21PM -0400, Isaac Morland wrote:
> > Will there be a comprehensive list somewhere? Even if it just lists the 
> > views,
> > gives maybe a one-sentence description, and links to the relevant chapter, I
> > would find that helpful sometimes.
> 
> I was not planning on that since we don't do that in any other cases I
> can think of.
> 
> > I ask because I occasionally find myself wanting a comprehensive list of
> > functions, and as far as I can tell it doesn't exist. I'm hoping to avoid 
> > that
> > situation for views.
> 
> Well, then we just leave them where they are and link to them from other
> parts of the documentation, which I assume/hope we already do.

People have mentioned the view documentation doesn't belong in the
Internals part.  Maybe we can just move it to the Server
Administration part and leave it together.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Improve description of XLOG_RUNNING_XACTS

2022-07-20 Thread Masahiko Sawada
Hi,

I realized that standby_desc_running_xacts() in standbydesc.c doesn't
describe subtransaction XIDs. I've attached the patch to improve the
description. Here is an example from pg_waldump:

* HEAD
rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048

* w/ patch
rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048; 1
subxacts: 1049

Please review it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


0001-Improve-description-of-XLOG_RUNNING_XACTS.patch
Description: Binary data


Re: Memory leak fix in psql

2022-07-20 Thread Japin Li


On Wed, 20 Jul 2022 at 21:54, Tom Lane  wrote:
> Japin Li  writes:
>> Thanks for updating the patch.  It looks good.  However, it cannot be
>> applied on 14 stable.  The attached patches are for 10-14.
>
> While I think this is good cleanup, I'm doubtful that any of these
> leaks are probable enough to be worth back-patching into stable
> branches.  The risk of breaking something should not be neglected.
>

Yeah, we should take care of the backpatch risk.  However, I think
it makes sense to backpatch.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Memory leak fix in psql

2022-07-20 Thread Michael Paquier
On Wed, Jul 20, 2022 at 10:05:47AM +0200, Alvaro Herrera wrote:
> More on the same tune.

Thanks.  I have noticed that as well.  I'll include all that in the
set.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Log details for client certificate failures

2022-07-20 Thread Tom Lane
Jacob Champion  writes:
> I'm currently hardcoding an elevel of ERROR on the new guc_strdup()s,
> because that seems to be a common case for the check hooks.

Really?  That's almost certainly NOT okay.  As an example, if you
have a problem with a new value loaded from postgresql.conf during
SIGHUP processing, throwing ERROR will cause the postmaster to exit.

I wouldn't be too surprised if there are isolated cases where people
didn't understand what they were doing and wrote that, but that
needs to be fixed not emulated.

regards, tom lane




Re: [PATCH] Log details for client certificate failures

2022-07-20 Thread Jacob Champion
On Wed, Jul 20, 2022 at 3:15 PM Tom Lane  wrote:
> guc_malloc's behavior varies depending on elevel.  It's *not*
> equivalent to palloc.

Right, sorry -- a better way for me to ask the question:

I'm currently hardcoding an elevel of ERROR on the new guc_strdup()s,
because that seems to be a common case for the check hooks. If that's
okay, is there any reason not to use palloc() semantics for
pg_clean_ascii()? (And if it's not okay, why?)

--Jacob




Re: [PATCH] Log details for client certificate failures

2022-07-20 Thread Tom Lane
Jacob Champion  writes:
> The guc_strdup() approach really reduces the amount of code, so that's
> what I did in v3. I'm not following why we need to return NULL on
> failure, though -- both palloc() and guc_malloc() ERROR on failure, so
> is it okay to keep those semantics the same?

guc_malloc's behavior varies depending on elevel.  It's *not*
equivalent to palloc.

regards, tom lane




Re: [PATCH] Log details for client certificate failures

2022-07-20 Thread Jacob Champion
On Tue, Jul 19, 2022 at 3:38 PM Andres Freund  wrote:
> Or alternatively, perhaps we can just make pg_clean_ascii() return NULL
> if allocation failed and then guc_strdup() the result in guc.c?

The guc_strdup() approach really reduces the amount of code, so that's
what I did in v3. I'm not following why we need to return NULL on
failure, though -- both palloc() and guc_malloc() ERROR on failure, so
is it okay to keep those semantics the same?

> If we end up needing a two phase approach, why use the same function for
> both phases? That seems quite awkward.

Mostly so the byte counting always agrees between the two phases, no
matter how the implementation evolves. But it's hopefully moot now.

--Jacob
From dfd76f4dbcf3834371442d593d315797762bbd11 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Tue, 19 Jul 2022 10:48:58 -0700
Subject: [PATCH v3 1/2] pg_clean_ascii(): escape bytes rather than lose them

Rather than replace each unprintable byte with a '?' character, replace
it with a hex escape instead. The API now allocates a copy rather than
modifying the input in place.
---
 src/backend/postmaster/postmaster.c |  6 +-
 src/backend/utils/misc/guc.c|  4 ++--
 src/common/string.c | 26 +-
 src/include/common/string.h |  2 +-
 4 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 1c25457526..5e8cd770c0 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2284,11 +2284,7 @@ retry1:
  */
 if (strcmp(nameptr, "application_name") == 0)
 {
-	char	   *tmp_app_name = pstrdup(valptr);
-
-	pg_clean_ascii(tmp_app_name);
-
-	port->application_name = tmp_app_name;
+	port->application_name = pg_clean_ascii(valptr);
 }
 			}
 			offset = valoffset + strlen(valptr) + 1;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0328029d43..60400752e5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -12480,7 +12480,7 @@ static bool
 check_application_name(char **newval, void **extra, GucSource source)
 {
 	/* Only allow clean ASCII chars in the application name */
-	pg_clean_ascii(*newval);
+	*newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
 
 	return true;
 }
@@ -12496,7 +12496,7 @@ static bool
 check_cluster_name(char **newval, void **extra, GucSource source)
 {
 	/* Only allow clean ASCII chars in the cluster name */
-	pg_clean_ascii(*newval);
+	*newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
 
 	return true;
 }
diff --git a/src/common/string.c b/src/common/string.c
index 16940d1fa7..db15324c62 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "lib/stringinfo.h"
 
 
 /*
@@ -59,9 +60,9 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 
 
 /*
- * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
+ * pg_clean_ascii -- Replace any non-ASCII chars with a "\xXX" string
  *
- * Modifies the string passed in which must be '\0'-terminated.
+ * Makes a palloc'd copy of the string passed in, which must be '\0'-terminated.
  *
  * This function exists specifically to deal with filtering out
  * non-ASCII characters in a few places where the client can provide an almost
@@ -73,22 +74,29 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
  * In general, this function should NOT be used- instead, consider how to handle
  * the string without needing to filter out the non-ASCII characters.
  *
- * Ultimately, we'd like to improve the situation to not require stripping out
- * all non-ASCII but perform more intelligent filtering which would allow UTF or
+ * Ultimately, we'd like to improve the situation to not require replacing all
+ * non-ASCII but perform more intelligent filtering which would allow UTF or
  * similar, but it's unclear exactly what we should allow, so stick to ASCII only
  * for now.
  */
-void
-pg_clean_ascii(char *str)
+char *
+pg_clean_ascii(const char *str)
 {
-	/* Only allow clean ASCII chars in the string */
-	char	   *p;
+	StringInfoData buf;
+	const char   *p;
+
+	initStringInfo(&buf);
 
 	for (p = str; *p != '\0'; p++)
 	{
+		/* Only allow clean ASCII chars in the string */
 		if (*p < 32 || *p > 126)
-			*p = '?';
+			appendStringInfo(&buf, "\\x%02x", (unsigned char) *p);
+		else
+			appendStringInfoChar(&buf, *p);
 	}
+
+	return buf.data;
 }
 
 
diff --git a/src/include/common/string.h b/src/include/common/string.h
index cf00fb53cd..d10d0c9cbf 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -24,7 +24,7 @@ typedef struct PromptInterruptContext
 extern bool pg_str_endswith(const char *str, const char *end);
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 	 int base);
-extern void pg_clean_ascii(char *str);
+extern char *pg_clean_ascii(const char *str);

Re: System catalog documentation chapter

2022-07-20 Thread Bruce Momjian
On Wed, Jul 20, 2022 at 04:23:21PM -0400, Isaac Morland wrote:
> On Wed, 20 Jul 2022 at 16:08, Bruce Momjian  wrote:
> 
> On Tue, Jul 19, 2022 at 01:41:44PM -0400, Bruce Momjian wrote:
> > I am going to look at moving system views that make sense into the
> > chapters where their contents are mentioned.  I don't think having a
> > central list of views is really helping us because we expect the views
> > to be used in ways the system catalogs would not be.
> 
> I have grouped the views by topic.  What I would like to do next is to
> move these view sections to the end of relevant documentation chapters.
> Is that going to be an improvement?
> 
> 
> Will there be a comprehensive list somewhere? Even if it just lists the views,
> gives maybe a one-sentence description, and links to the relevant chapter, I
> would find that helpful sometimes.

I was not planning on that since we don't do that in any other cases I
can think of.

> I ask because I occasionally find myself wanting a comprehensive list of
> functions, and as far as I can tell it doesn't exist. I'm hoping to avoid that
> situation for views.

Well, then we just leave them where they are and link to them from other
parts of the documentation, which I assume/hope we already do.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: System catalog documentation chapter

2022-07-20 Thread Isaac Morland
On Wed, 20 Jul 2022 at 16:08, Bruce Momjian  wrote:

> On Tue, Jul 19, 2022 at 01:41:44PM -0400, Bruce Momjian wrote:
> > I am going to look at moving system views that make sense into the
> > chapters where their contents are mentioned.  I don't think having a
> > central list of views is really helping us because we expect the views
> > to be used in ways the system catalogs would not be.
>
> I have grouped the views by topic.  What I would like to do next is to
> move these view sections to the end of relevant documentation chapters.
> Is that going to be an improvement?


Will there be a comprehensive list somewhere? Even if it just lists the
views, gives maybe a one-sentence description, and links to the relevant
chapter, I would find that helpful sometimes.

I ask because I occasionally find myself wanting a comprehensive list of
functions, and as far as I can tell it doesn't exist. I'm hoping to avoid
that situation for views.


Re: System catalog documentation chapter

2022-07-20 Thread Bruce Momjian
On Tue, Jul 19, 2022 at 01:41:44PM -0400, Bruce Momjian wrote:
> I am going to look at moving system views that make sense into the
> chapters where their contents are mentioned.  I don't think having a
> central list of views is really helping us because we expect the views
> to be used in ways the system catalogs would not be.

I have grouped the views by topic.  What I would like to do next is to
move these view sections to the end of relevant documentation chapters.
Is that going to be an improvement?

---

pg_available_extensions
pg_available_extension_versions

pg_backend_memory_contexts

pg_config

pg_cursors

pg_file_settings
pg_hba_file_rules
pg_ident_file_mappings
pg_settings

pg_locks

pg_policies

pg_prepared_statements

pg_prepared_xacts

pg_publication_tables
pg_replication_origin_status
pg_replication_slots

pg_group
pg_roles
pg_shadow
pg_user
pg_user_mappings

pg_shmem_allocations

pg_stats
pg_stats_ext
pg_stats_ext_exprs

pg_timezone_abbrevs
pg_timezone_names

pg_indexes
pg_matviews
pg_rules
pg_seclabels
pg_sequences
pg_tables
pg_views

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?

2022-07-20 Thread David Rowley
On Wed, 20 Jul 2022 at 21:19, Richard Guo  wrote:
> So the idea is if the ECs used by the mergeclauses are prefix of the
> query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why
> not relax this further that if the ECs in the mergeclauses and the
> query_pathkeys have common prefix, we use that prefix as pathkeys? So
> that we can have a plan like below:

I don't think that's a clear-cut win. There is scoring code in there
to try to arrange the pathkey list in the order of
most-useful-to-upper-level-joins firsts.  If we were to do as you
describe we could end up generating worse plans when there is some
subsequent Merge Join above this one that has join conditions that the
query_pathkeys are not compatible with.

Maybe your idea could be made to work in cases where
bms_equal(joinrel->relids, root->all_baserels). In that case, we
should not be processing any further joins and don't need to consider
that as a factor for the scoring.

David




Re: pg_auth_members.grantor is bunk

2022-07-20 Thread Robert Haas
On Fri, Jun 24, 2022 at 4:46 PM Robert Haas  wrote:
> On Fri, Jun 24, 2022 at 4:30 PM David G. Johnston
>  wrote:
> >> Upthread, I proposed that "drop role baz" should fail here
> >
> > I concur with this.
> >
> > I think that the grantor owns the grant, and that REASSIGNED OWNED should 
> > be able to move those grants to someone else.
> >
> > By extension, DROP OWNED should remove them.
>
> Interesting. I hadn't thought about changing the behavior of DROP
> OWNED BY and REASSIGN OWNED BY. A quick experiment supports your
> interpretation:

This experiment was insufficiently thorough. I see now that, for other
object types, DROP OWNED BY does work in the way that you propose, but
REASSIGN OWNED BY does not. Here's a better test:

rhaas=# create table foo();
CREATE TABLE
rhaas=# create role bar;
CREATE ROLE
rhaas=# create role baz;
CREATE ROLE
rhaas=# grant select on table foo to bar with grant option;
GRANT
rhaas=# set role bar;
SET
rhaas=> grant select on table foo to baz;
GRANT
rhaas=> reset role;
RESET
rhaas=# drop role bar;
ERROR:  role "bar" cannot be dropped because some objects depend on it
DETAIL:  privileges for table foo
rhaas=# create role quux;
CREATE ROLE
rhaas=# reassign owned by bar to quux;
REASSIGN OWNED
rhaas=# drop role bar;
ERROR:  role "bar" cannot be dropped because some objects depend on it
DETAIL:  privileges for table foo
rhaas=# drop owned by bar;
DROP OWNED
rhaas=# drop role bar;
DROP ROLE

This behavior might look somewhat bizarre, but there's actually a good
reason for it: the system guarantees that whoever is listed as the
grantor of a privilege has the *current* right to grant that
privilege. It can't categorically change the grantor of every
privilege given by bar to quux because quux might not and in fact does
not have the right to grant select on table foo to baz. Now, you might
be thinking, ah, but what if the superuser performed the grant? They
could cease to be the superuser later, and then the rule would be
violated! But actually not, because a grant by the superuser is
imputed to the table owner, who always has the right to grant all
rights on the table, and if the table owner is ever changed, all the
grants imputed to the old table owner are changed to have their
grantor as the new table owner. Similarly, trying to revoke select, or
the grant option on it, from bar would fail. So it looks pretty
intentional, and pretty tightly-enforced, that every role listed as a
grantor must be one which is currently able to grant that privilege.

And that means that REASSIGN OWNED can't just do a blanket change to
the recorded grantor. It could try to do so, I suppose, and just throw
an error if it doesn't work out, but that might make REASSIGN OWNED
fail a lot more often, which could suck. In any event, the implemented
behavior is that REASSIGN OWNED does nothing about permissions, but
DROP OWNED cascades to grantors. This is SORT OF documented, although
the documentation only mentions that DROP OWNED cascades to privileges
granted *to* the target role, and does not mention that it also
cascades to privileges granted *by* the target role.

The previous version of the patch makes both DROP OWNED and REASSIGN
OWNED cascade to grantors, but I now think that, for consistency, I'd
better look into changing it so that only DROP OWNED cascades. I think
perhaps I should be using SHARED_DEPENDENCY_ACL instead of
SHARED_DEPENDENCY_OWNER.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: shared-memory based stats collector - v70

2022-07-20 Thread Andres Freund
Hi, 

On July 20, 2022 8:41:53 PM GMT+02:00, Greg Stark  wrote:
>On Wed, 20 Jul 2022 at 12:08, Tom Lane  wrote:
>>
>> AFAIR, the previous stats collector implementation had no such provision
>> either: it'd just keep adding hashtable entries as it received info about
>> new objects.  The only thing that's changed is that now those entries are
>> in shared memory instead of process-local memory.  We'd be well advised to
>> be sure that memory can be swapped out under pressure, but otherwise I'm
>> not seeing that things have gotten worse.
>
>Just to be clear I'm not looking for ways things have gotten worse.
>Just trying to understand what I'm reading and I guess I came in with
>assumptions that led me astray.
>
>But... adding entries as it received info about new objects isn't the
>same as having info on everything. I didn't really understand how the
>old system worked, but if you had a very large schema and each session
>only worked with a small subset, did the local stats data ever absorb
>info on the objects it never touched?

Each backend only had stats for things it touched. But the stats collector read 
all files at startup into hash tables and absorbed all generated stats into 
those as well.


>All that said -- having all objects loaded in shared memory makes my
>work way easier.

What are you trying to do?

>It actually seems feasible to dump out all the
>objects from shared memory and including objects from other databases
>and if I don't need a consistent snapshot it even seems like it would
>be possible to do that without having a copy of more than one stats
>entry at a time in local memory. I hope that doesn't cause huge
>contention on the shared hash table to be doing that regularly.

The stats accessors now default to not creating a full snapshot of stats data 
at first access (but that's configurable). So yes, that behavior is possible. 
E.g. autovac now uses a single object access like you describe.

Andres


-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: make -C libpq check fails obscurely if tap tests are disabled

2022-07-20 Thread Andrew Dunstan
On 2022-07-20 We 13:23, Justin Pryzby wrote:
> make -C ./src/interfaces/libpq check
> PATH=... && @echo "TAP tests not enabled. Try configuring with 
> --enable-tap-tests"
> /bin/sh: 1: @echo: not found
>
> make is telling the shell to run "@echo", rather than running "echo" 
> silently.
>
> Since:
>
> commit 6b04abdfc5e0653542ac5d586e639185a8c61a39
> Author: Andres Freund 
> Date:   Sat Feb 26 16:51:47 2022 -0800
>
> Run tap tests in src/interfaces/libpq.



Yeah. It's a bit ugly but I think the attached would fix it.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index b5fd72a4ac..50aba04a4f 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -144,10 +144,18 @@ test-build:
 	$(MAKE) -C test all
 
 check: test-build all
+ifeq ($(enable_tap_tests),yes)
 	PATH="$(CURDIR)/test:$$PATH" && $(prove_check)
+else
+	$(prove_check)
+endif
 
 installcheck: test-build all
+ifeq ($(enable_tap_tests),yes)
 	PATH="$(CURDIR)/test:$$PATH" && $(prove_installcheck)
+else
+	$(prove_installcheck)
+endif
 
 installdirs: installdirs-lib
 	$(MKDIR_P) '$(DESTDIR)$(includedir)' '$(DESTDIR)$(includedir_internal)' '$(DESTDIR)$(datadir)'


Re: shared-memory based stats collector - v70

2022-07-20 Thread Greg Stark
On Wed, 20 Jul 2022 at 12:08, Tom Lane  wrote:
>
> AFAIR, the previous stats collector implementation had no such provision
> either: it'd just keep adding hashtable entries as it received info about
> new objects.  The only thing that's changed is that now those entries are
> in shared memory instead of process-local memory.  We'd be well advised to
> be sure that memory can be swapped out under pressure, but otherwise I'm
> not seeing that things have gotten worse.

Just to be clear I'm not looking for ways things have gotten worse.
Just trying to understand what I'm reading and I guess I came in with
assumptions that led me astray.

But... adding entries as it received info about new objects isn't the
same as having info on everything. I didn't really understand how the
old system worked, but if you had a very large schema and each session
only worked with a small subset, did the local stats data ever absorb
info on the objects it never touched?

All that said -- having all objects loaded in shared memory makes my
work way easier. It actually seems feasible to dump out all the
objects from shared memory and including objects from other databases
and if I don't need a consistent snapshot it even seems like it would
be possible to do that without having a copy of more than one stats
entry at a time in local memory. I hope that doesn't cause huge
contention on the shared hash table to be doing that regularly.

-- 
greg




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-07-20 Thread Melanie Plageman
On Wed, Jul 20, 2022 at 12:50 PM Andres Freund  wrote:

>
> On 2022-07-14 18:44:48 -0400, Melanie Plageman wrote:
>
> > @@ -1427,8 +1445,10 @@ pgstat_read_statsfile(void)
> >   FILE   *fpin;
> >   int32   format_id;
> >   boolfound;
> > + PgStat_BackendIOPathOps io_stats;
> >   const char *statfile = PGSTAT_STAT_PERMANENT_FILENAME;
> >   PgStat_ShmemControl *shmem = pgStatLocal.shmem;
> > + PgStatShared_BackendIOPathOps *io_stats_shmem = >io_ops;
> >
> >   /* shouldn't be called from postmaster */
> >   Assert(IsUnderPostmaster || !IsPostmasterEnvironment);
> > @@ -1486,6 +1506,22 @@ pgstat_read_statsfile(void)
> >   if (!read_chunk_s(fpin, >checkpointer.stats))
> >   goto error;
> >
> > + /*
> > +  * Read IO Operations stats struct
> > +  */
> > + if (!read_chunk_s(fpin, _stats))
> > + goto error;
> > +
> > + io_stats_shmem->stat_reset_timestamp =
> io_stats.stat_reset_timestamp;
> > +
> > + for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> > + {
> > + PgStat_IOPathOps *stats = _stats.stats[i];
> > + PgStatShared_IOPathOps *stats_shmem =
> _stats_shmem->stats[i];
> > +
> > + memcpy(stats_shmem->data, stats->data,
> sizeof(stats->data));
> > + }
>
> Why can't the data be read directly into shared memory?
>
>
It is not the same lock. Each PgStatShared_IOPathOps has a lock so that
they can be accessed individually (per BackendType in
PgStatShared_BackendIOPathOps). It is optimized for the more common
operation of flushing, at the expense of the snapshot and reset
operations (which should be less common).


> > +void
> > +pgstat_io_ops_snapshot_cb(void)
> > +{
> > + PgStatShared_BackendIOPathOps *all_backend_stats_shmem =
> >io_ops;
> > + PgStat_BackendIOPathOps *all_backend_stats_snap =
> _ops;
> > +
> > + for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> > + {
> > + PgStatShared_IOPathOps *stats_shmem =
> _backend_stats_shmem->stats[i];
> > + PgStat_IOPathOps *stats_snap =
> _backend_stats_snap->stats[i];
> > +
> > + LWLockAcquire(_shmem->lock, LW_EXCLUSIVE);
>
> Why acquire the same lock repeatedly for each type, rather than once for
> the whole?
>
>
This is also because of having an LWLock in each PgStatShared_IOPathOps.
Because I don't want a lock in the backend-local stats, I have two data
structures, PgStatShared_IOPathOps and PgStat_IOPathOps. I thought it was
odd to write out the lock to the file, so when persisting the stats, I
write out only the relevant data, and when reading it back into shared
memory, I read it into the data member of PgStatShared_IOPathOps.
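
For readers following the thread, here is a minimal sketch of the split
being described.  Apart from the type names, the "data", "stats", and
"lock" members, and BACKEND_NUM_TYPES, which appear in the quoted patch,
everything below (the array sizes, the counter type, the placeholder
LWLock) is an illustrative assumption; the real structs also carry
stat_reset_timestamp fields, omitted here:

/* Stand-in for the real lwlock.h type, so the sketch is self-contained. */
typedef struct LWLock
{
	int			placeholder;
} LWLock;

#define BACKEND_NUM_TYPES	16	/* illustrative */
#define IOPATH_NUM_TYPES	4	/* illustrative: local, shared, strategy, ... */
#define IOOP_NUM_TYPES		4	/* illustrative: alloc, extend, read, write */

/* Backend-local / on-disk form: plain counters, no lock. */
typedef struct PgStat_IOPathOps
{
	long		data[IOPATH_NUM_TYPES][IOOP_NUM_TYPES];
} PgStat_IOPathOps;

/*
 * Shared-memory form: the same counters plus a per-BackendType lock, so
 * flushing stats for one backend type does not block the others.
 */
typedef struct PgStatShared_IOPathOps
{
	LWLock		lock;
	long		data[IOPATH_NUM_TYPES][IOOP_NUM_TYPES];
} PgStatShared_IOPathOps;

typedef struct PgStatShared_BackendIOPathOps
{
	PgStatShared_IOPathOps stats[BACKEND_NUM_TYPES];
} PgStatShared_BackendIOPathOps;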


make -C libpq check fails obscurely if tap tests are disabled

2022-07-20 Thread Justin Pryzby
make -C ./src/interfaces/libpq check
PATH=... && @echo "TAP tests not enabled. Try configuring with 
--enable-tap-tests"
/bin/sh: 1: @echo: not found

make is telling the shell to run "@echo", rather than running "echo" silently.

Since:

commit 6b04abdfc5e0653542ac5d586e639185a8c61a39
Author: Andres Freund 
Date:   Sat Feb 26 16:51:47 2022 -0800

Run tap tests in src/interfaces/libpq.

-- 
Justin




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-07-20 Thread Andres Freund
Hi,

On 2022-07-14 18:44:48 -0400, Melanie Plageman wrote:
> Subject: [PATCH v26 1/4] Add BackendType for standalone backends
> Subject: [PATCH v26 2/4] Remove unneeded call to pgstat_report_wal()

LGTM.


> Subject: [PATCH v26 3/4] Track IO operation statistics

> @@ -978,8 +979,17 @@ ReadBuffer_common(SMgrRelation smgr, char 
> relpersistence, ForkNumber forkNum,
>  
>   bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : 
> BufHdrGetBlock(bufHdr);
>  
> + if (isLocalBuf)
> + io_path = IOPATH_LOCAL;
> + else if (strategy != NULL)
> + io_path = IOPATH_STRATEGY;
> + else
> + io_path = IOPATH_SHARED;

Seems a bit ugly to have an if (isLocalBuf) just after an isLocalBuf ?.


> + /*
> +  * When a strategy is in use, reused buffers from the 
> strategy ring will
> +  * be counted as allocations for the purposes of IO 
> Operation statistics
> +  * tracking.
> +  *
> +  * However, even when a strategy is in use, if a new 
> buffer must be
> +  * allocated from shared buffers and added to the ring, 
> this is counted
> +  * as a IOPATH_SHARED allocation.
> +  */

There's a bit too much duplication between the paragraphs...

> @@ -628,6 +637,9 @@ pgstat_report_stat(bool force)
>   /* flush database / relation / function / ... stats */
>   partial_flush |= pgstat_flush_pending_entries(nowait);
>  
> + /* flush IO Operations stats */
> + partial_flush |= pgstat_flush_io_ops(nowait);

Could you either add a note to the commit message that the stats file
version needs to be increased, or just include that in the patch.




> @@ -1427,8 +1445,10 @@ pgstat_read_statsfile(void)
>   FILE   *fpin;
>   int32   format_id;
>   boolfound;
> + PgStat_BackendIOPathOps io_stats;
>   const char *statfile = PGSTAT_STAT_PERMANENT_FILENAME;
>   PgStat_ShmemControl *shmem = pgStatLocal.shmem;
> + PgStatShared_BackendIOPathOps *io_stats_shmem = >io_ops;
>  
>   /* shouldn't be called from postmaster */
>   Assert(IsUnderPostmaster || !IsPostmasterEnvironment);
> @@ -1486,6 +1506,22 @@ pgstat_read_statsfile(void)
>   if (!read_chunk_s(fpin, >checkpointer.stats))
>   goto error;
>  
> + /*
> +  * Read IO Operations stats struct
> +  */
> + if (!read_chunk_s(fpin, _stats))
> + goto error;
> +
> + io_stats_shmem->stat_reset_timestamp = io_stats.stat_reset_timestamp;
> +
> + for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> + {
> + PgStat_IOPathOps *stats = _stats.stats[i];
> + PgStatShared_IOPathOps *stats_shmem = _stats_shmem->stats[i];
> +
> + memcpy(stats_shmem->data, stats->data, sizeof(stats->data));
> + }

Why can't the data be read directly into shared memory?


>   /*


> +void
> +pgstat_io_ops_snapshot_cb(void)
> +{
> + PgStatShared_BackendIOPathOps *all_backend_stats_shmem = 
> >io_ops;
> + PgStat_BackendIOPathOps *all_backend_stats_snap = 
> _ops;
> +
> + for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> + {
> + PgStatShared_IOPathOps *stats_shmem = 
> _backend_stats_shmem->stats[i];
> + PgStat_IOPathOps *stats_snap = 
> _backend_stats_snap->stats[i];
> +
> + LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);

Why acquire the same lock repeatedly for each type, rather than once for
the whole?


> + /*
> +  * Use the lock in the first BackendType's PgStat_IOPathOps to 
> protect the
> +  * reset timestamp as well.
> +  */
> + if (i == 0)
> + all_backend_stats_snap->stat_reset_timestamp = 
> all_backend_stats_shmem->stat_reset_timestamp;

Which also would make this look a bit less awkward.

Starting to look pretty good...

- Andres




Re: shared-memory based stats collector - v70

2022-07-20 Thread Andres Freund
Hi,

On 2022-07-20 12:08:35 -0400, Tom Lane wrote:
> AFAIR, the previous stats collector implementation had no such provision
> either: it'd just keep adding hashtable entries as it received info about
> new objects.

Yep.


> The only thing that's changed is that now those entries are in shared
> memory instead of process-local memory.  We'd be well advised to be
> sure that memory can be swapped out under pressure, but otherwise I'm
> not seeing that things have gotten worse.

FWIW, I ran a few memory usage benchmarks. Without stats accesses the
memory usage with shared memory stats was sometimes below, sometimes
above the "old" memory usage, depending on the number of objects. As
soon as there's stats access, it's well below (that includes things like
autovac workers).

I think there's quite a bit of memory usage reduction potential around
dsa.c - we occasionally end up with [nearly] unused dsm segments.

Greetings,

Andres Freund




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-20 Thread Alvaro Herrera
On 2022-Jul-20, Alvaro Herrera wrote:

> On 2022-Jul-20, Alvaro Herrera wrote:
> 
> > I also change the use of allow_invalid_pages to
> > allow_in_place_tablespaces.  We could add a
> > separate GUC for this, but it seems overengineering.
> 
> Oh, but allow_in_place_tablespaces doesn't exist in versions 14 and
> older, so this strategy doesn't really work.

... and get_dirent_type is new in 14, so that'll be one more hurdle.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Cuando no hay humildad las personas se degradan" (A. Christie)




Re: Logical insert/update/delete WAL records for custom table AMs

2022-07-20 Thread Jeff Davis
On Mon, 2022-03-21 at 17:43 -0700, Andres Freund wrote:
> Currently this doesn't apply: 
> http://cfbot.cputube.org/patch_37_3394.log

Withdrawn for now. With custom WAL resource managers this is less
important to me.

I still think it has value, and I'm willing to work on it if more use
cases come forward.

Regards,
Jeff Davis






Re: shared-memory based stats collector - v70

2022-07-20 Thread Tom Lane
Melanie Plageman  writes:
> On Wed, Jul 20, 2022 at 11:35 AM Greg Stark  wrote:
>> It seems like if we really think the total number of database objects
>> is reasonably limited to scales that fit in RAM there would be a much
>> simpler database design that would just store the catalog tables in
>> simple in-memory data structures and map them all on startup without
>> doing all the work Postgres does to make relational storage scale.

> I think efforts to do such a thing have gotten caught up in solving
> issues around visibility and managing the relationship between local and
> global caches [1]. It doesn't seem like the primary technical concern
> was memory usage.

AFAIR, the previous stats collector implementation had no such provision
either: it'd just keep adding hashtable entries as it received info about
new objects.  The only thing that's changed is that now those entries are
in shared memory instead of process-local memory.  We'd be well advised to
be sure that memory can be swapped out under pressure, but otherwise I'm
not seeing that things have gotten worse.

regards, tom lane




Re: shared-memory based stats collector - v70

2022-07-20 Thread Andres Freund
Hi,

On 2022-07-20 11:35:13 -0400, Greg Stark wrote:
> Is it true that the shared memory allocation contains the hash table
> entry and body of every object in every database?

Yes. However, note that that was already the case with the old stats
collector - it also kept everything in memory. In addition every read
access to stats loaded a copy of the stats (well of the global stats and
the relevant per-database stats).

It might be worth doing something fancier at some point - the shared
memory stats was already a huge effort, cramming yet another change in
there would pretty much have guaranteed that it'd fail.

Greetings,

Andres Freund




Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Dave Page
On Wed, 20 Jul 2022 at 16:12, Tom Lane  wrote:

> Alvaro Herrera  writes:
> > On 2022-Jul-20, Tom Lane wrote:
> >> I'll try to do some research later today to identify anything else
> >> we need to mark in plpgsql.  I recall doing some work specifically
> >> creating functions for pldebugger's use, but I'll need to dig.
>
> > I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't
> > expose functions directly, but through plpgsql_plugin_ptr.  Maybe that
> > one does need to be made PGDLLEXPORT, since currently it isn't.
>
> After some experimentation, it does not need to be marked: pldebugger
> gets at that via find_rendezvous_variable(), so there is no need for
> any explicit linkage at all between plpgsql.so and plugin_debugger.so.
>
> Along the way, I made a quick hack to get pldebugger to load into
> v15/HEAD.  It lacks #ifdef's which'd be needed so that it'd still
> compile against older branches, but perhaps this'll save someone
> some time.
>

Thanks Tom - I've pushed that patch with the relevant #ifdefs added.

-- 
Dave Page
PostgreSQL Core Team
http://www.postgresql.org/


Re: shared-memory based stats collector - v70

2022-07-20 Thread Melanie Plageman
On Wed, Jul 20, 2022 at 11:35 AM Greg Stark  wrote:

> On the one hand the rest of Postgres seems to be designed on the
> assumption that the number of tables and database objects is limited
> only by disk space. The catalogs are stored in relational storage
> which is read through the buffer cache. On the other hand it's true
> that syscaches don't expire entries (though I think the assumption
> is that no one backend touches very much).
>
> It seems like if we really think the total number of database objects
> is reasonably limited to scales that fit in RAM there would be a much
> simpler database design that would just store the catalog tables in
> simple in-memory data structures and map them all on startup without
> doing all the work Postgres does to make relational storage scale.
>

I think efforts to do such a thing have gotten caught up in solving
issues around visibility and managing the relationship between local and
global caches [1]. It doesn't seem like the primary technical concern
was memory usage.

[1]
https://www.postgresql.org/message-id/flat/4E72940DA2BF16479384A86D54D0988A567B9245%40G01JPEXMBKW04


Re: Allow placeholders in ALTER ROLE w/o superuser

2022-07-20 Thread Tom Lane
Kyotaro Horiguchi  writes:
> At Tue, 19 Jul 2022 09:53:39 -0700, Nathan Bossart  
> wrote in 
>> Hm.  I would expect ALTER ROLE to store the PGC_SUSET context when executed
>> by a superuser or a role with privileges via pg_parameter_acl.  Storing all
>> placeholder GUC settings as PGC_USERSET would make things more restrictive
>> than they are today.  For example, it would no longer be possible to apply
>> any ALTER ROLE settings from superusers for placeholders that later become
>> custom GUCS.

> Returning to the topic, that operation can be allowed in PG15, having
> being granted by superuser using the GRANT SET ON PARMETER command.

I think that 13d838815 has completely changed the terms that this
discussion needs to be conducted under.  It seems clear to me now
that if you want to relax this only-superusers restriction, what
you have to do is store the OID of the role that issued ALTER ROLE/DB SET,
and then apply the same checks that would be used in the ordinary case
where a placeholder is being filled in after being set intra-session.
That is, we'd no longer assume that a value coming from pg_db_role_setting
was set with superuser privileges, but we'd know exactly who did set it.

This might also tie into Nathan's question in another thread about
exactly what permissions should be required to issue ALTER ROLE/DB SET.
In particular I'm wondering if different permissions should be needed to
override an existing entry than if there is no existing entry.  If not,
we could find ourselves downgrading a superuser-set entry to a
non-superuser-set entry, which might have bad consequences later
(eg, by rendering the entry nonfunctional because when we actually
load the extension we find out the GUC is SUSET).

Possibly related to this: I felt while working on 13d838815 that
PGC_SUSET and PGC_SU_BACKEND should be usable as GucContext
values for GUC variables, indicating that the GUC requires special
privileges to be set, but we should no longer use them as passed-in
GucContext values.  That is, we should remove privilege tests from
the call sites, like this:

(void) set_config_option(stmt->name,
 ExtractSetVariableArgs(stmt),
-(superuser() ? PGC_SUSET : PGC_USERSET),
+PGC_USERSET,
 PGC_S_SESSION,
 action, true, 0, false);

and instead put that behavior inside set_config_option_ext, which
would want to look at superuser_arg(srole) instead, and indeed might
not need to do anything because pg_parameter_aclcheck would subsume
the test.  I didn't pursue this further because it wasn't essential
to fixing the bug.  But it seems relevant here, because that line of
thought leads to the conclusion that storing PGC_SUSET vs PGC_USERSET
is entirely the wrong approach.
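Roughly, the idea might look like this inside set_config_option_ext()
(a sketch of the direction only, not tested against any branch):

    if (record->context == PGC_SUSET &&
        !superuser_arg(srole) &&
        pg_parameter_aclcheck(name, srole, ACL_SET) != ACLCHECK_OK)
    {
        ereport(elevel,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("permission denied to set parameter \"%s\"", name)));
        return 0;
    }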

There is a bunch of infrastructure work that has to be done if anyone
wants to make this happen:

* redesign physical representation of pg_db_role_setting

* be sure to clean up if a role mentioned in pg_db_role_setting is dropped

* pg_dump would need to be taught to dump the state of affairs correctly.

regards, tom lane




Re: shared-memory based stats collector - v70

2022-07-20 Thread Greg Stark
So I'm finally wrapping my head around this new code. There is
something I'm surprised by that perhaps I'm misreading or perhaps I
shouldn't be surprised by, not sure.

Is it true that the shared memory allocation contains the hash table
entry and body of every object in every database? I guess I was
assuming I would find some kind of LRU cache which loaded data from
disk on demand. But afaict it loads everything on startup and then
never loads from disk later. The disk is purely for recovering state
after a restart.

On the one hand the rest of Postgres seems to be designed on the
assumption that the number of tables and database objects is limited
only by disk space. The catalogs are stored in relational storage
which is read through the buffer cache. On the other hand it's true
that syscaches don't expire entries (though I think the assumption
is that no one backend touches very much).

It seems like if we really think the total number of database objects
is reasonably limited to scales that fit in RAM there would be a much
simpler database design that would just store the catalog tables in
simple in-memory data structures and map them all on startup without
doing all the work Postgres does to make relational storage scale.




Re: Remove useless arguments in ReadCheckpointRecord().

2022-07-20 Thread Bharath Rupireddy
On Wed, Jul 20, 2022 at 8:21 PM Fujii Masao  wrote:
>
> Hi,
>
> I'd like to propose to remove "whichChkpt" and "report" arguments in 
> ReadCheckpointRecord(). "report" is obviously useless because it's always 
> true, i.e., there are two callers of the function and they always specify 
> true as "report".

Yes, the report parameter is obviously safe to remove; commit 1d919de5eb
removed the only call that passed report as false.

> "whichChkpt" indicates where the specified checkpoint location came from, 
> pg_control or backup_label. This information is used to log different 
> messages as follows.
>
> switch (whichChkpt)
> {
> case 1:
> ereport(LOG,
> (errmsg("invalid primary 
> checkpoint link in control file")));
> break;
> default:
> ereport(LOG,
> (errmsg("invalid checkpoint 
> link in backup_label file")));
> break;
> }
> return NULL;
> ...
> switch (whichChkpt)
> {
> case 1:
> ereport(LOG,
> (errmsg("invalid primary 
> checkpoint record")));
> break;
> default:
> ereport(LOG,
> (errmsg("invalid checkpoint 
> record")));
> break;
> }
> return NULL;
> ...
>
> But the callers of ReadCheckpointRecord() already output different log 
> messages depending on where the invalid checkpoint record came from. So even 
> if ReadCheckpointRecord() doesn't use "whichChkpt", i.e., use the same log 
> message in both pg_control and backup_label cases, users can still identify 
> where the invalid checkpoint record came from, by reading the log message.
>
> Also, when whichChkpt = 1, "primary checkpoint" is used in the log message, 
> which sounds confusing because, if I recall correctly, we removed the 
> concept of primary and secondary checkpoints before.

Yes, using "primary checkpoint" is confusing, as we usually use "primary"
in the context of replication and HA.

>   Therefore I think that it's better to remove "whichChkpt" argument in 
> ReadCheckpointRecord().
>
> Patch attached. Thought?

How about we transform the following messages into something like below?

- after ReadCheckpointRecord() for the control file case:
    errmsg("could not locate a valid checkpoint record")
  becomes
    "could not locate valid checkpoint record in control file"
- after ReadCheckpointRecord() for the backup_label case:
    errmsg("could not locate required checkpoint record")
  becomes
    "could not locate valid checkpoint record in backup_label file"

The above messages give more meaningful and unique info to the users.

May be unrelated, IIRC, for the errors like ereport(PANIC,
(errmsg("could not locate a valid checkpoint record"))); we wanted to
add a hint asking users to consider running pg_resetwal to fix the
issue. The error for ReadCheckpointRecord() in backup_label file
cases, already gives a hint errhint("If you are restoring from a
backup, touch \"%s/recovery.signal\" .. Adding the hint of running
pg_resetwal (of course with a caution that it can cause inconsistency
in the data and use it as a last resort as described in the docs)
helps users and support engineers a lot to mitigate the server down
cases quickly.
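For illustration only, such a message-plus-hint might look roughly like this
(the wording here is hypothetical, not a concrete proposal):

    ereport(PANIC,
            (errmsg("could not locate a valid checkpoint record"),
             errhint("pg_resetwal can force the server to start, but it can "
                     "cause data loss or corruption; use it only as a last "
                     "resort, as described in the documentation.")));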

Regards,
Bharath Rupireddy.




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-20 Thread Alvaro Herrera
On 2022-Jul-20, Alvaro Herrera wrote:

> I also change the use of allow_invalid_pages to
> allow_in_place_tablespaces.  We could add a
> separate GUC for this, but it seems overengineering.

Oh, but allow_in_place_tablespaces doesn't exist in versions 14 and
older, so this strategy doesn't really work.

I see the following alternatives:

1. not backpatch this fix to 14 and older
2. use a different GUC; either allow_invalid_pages as previously
   suggested, or create a new one just for this purpose
3. not provide any overriding mechanism in versions 14 and older

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Always assume the user will do much worse than the stupidest thing
you can imagine."(Julien PUYDT)




Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Jul-20, Tom Lane wrote:
>> I'll try to do some research later today to identify anything else
>> we need to mark in plpgsql.  I recall doing some work specifically
>> creating functions for pldebugger's use, but I'll need to dig.

> I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't
> expose functions directly, but through plpgsql_plugin_ptr.  Maybe that
> one does need to be made PGDLLEXPORT, since currently it isn't.

After some experimentation, it does not need to be marked: pldebugger
gets at that via find_rendezvous_variable(), so there is no need for
any explicit linkage at all between plpgsql.so and plugin_debugger.so.
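For reference, the rendezvous-variable handshake looks roughly like this
(a sketch, not the actual plpgsql/pldebugger code; my_plugin_callbacks is a
made-up name):

    /* in plpgsql's _PG_init() */
    plpgsql_plugin_ptr = (PLpgSQL_plugin **)
        find_rendezvous_variable("PLpgSQL_plugin");

    /* in the debugger plugin's _PG_init() */
    PLpgSQL_plugin **plugin_ptr = (PLpgSQL_plugin **)
        find_rendezvous_variable("PLpgSQL_plugin");

    *plugin_ptr = &my_plugin_callbacks;

Since both sides ask the server for the same named slot, no link-time
reference between plpgsql.so and plugin_debugger.so is needed.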

Along the way, I made a quick hack to get pldebugger to load into
v15/HEAD.  It lacks #ifdef's which'd be needed so that it'd still
compile against older branches, but perhaps this'll save someone
some time.

regards, tom lane

diff --git a/plugin_debugger.c b/plugin_debugger.c
index 6620021..1bd2057 100644
--- a/plugin_debugger.c
+++ b/plugin_debugger.c
@@ -114,6 +114,8 @@ static debugger_language_t *debugger_languages[] = {
 	NULL
 };
 
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+
 /**
  * Function declarations
  **/
@@ -124,6 +126,7 @@ void _PG_init( void );/* initialize this module when we are dynamically load
  * Local (hidden) function prototypes
  **/
 
+static void			 pldebugger_shmem_request( void );
 //static char   ** fetchArgNames( PLpgSQL_function * func, int * nameCount );
 static void* writen( int peer, void * src, size_t len );
 static bool 		 connectAsServer( void );
@@ -154,6 +157,15 @@ void _PG_init( void )
 	for (i = 0; debugger_languages[i] != NULL; i++)
 		debugger_languages[i]->initialize();
 
+	prev_shmem_request_hook = shmem_request_hook;
+	shmem_request_hook = pldebugger_shmem_request;
+}
+
+static void pldebugger_shmem_request( void )
+{
+	if (prev_shmem_request_hook)
+		prev_shmem_request_hook();
+
 	reserveBreakpoints();
 	dbgcomm_reserve();
 }


Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Andres Freund
Hi, 

On July 20, 2022 4:20:04 PM GMT+02:00, Tom Lane  wrote:
>Alvaro Herrera  writes:
>> I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't
>> expose functions directly, but through plpgsql_plugin_ptr.  Maybe that
>> one does need to be made PGDLLEXPORT, since currently it isn't.
>
>Hm.  Not sure if the rules are the same for global variables as
>they are for functions, but if so, yeah ...

They're the same on the export side. On windows the rules for linking to 
variables are stricter (they need declspec dllimport), but that doesn't matter 
for dlsym style stuff.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




[PATCH] Renumber confusing value for GUC_UNIT_BYTE

2022-07-20 Thread Justin Pryzby
The GUC units are currently defined like:

#define GUC_UNIT_KB 0x1000  /* value is in 
kilobytes */
#define GUC_UNIT_BLOCKS 0x2000  /* value is in blocks */
#define GUC_UNIT_XBLOCKS0x3000  /* value is in xlog blocks */
#define GUC_UNIT_MB 0x4000  /* value is in 
megabytes */
#define GUC_UNIT_BYTE   0x8000  /* value is in bytes */
#define GUC_UNIT_MEMORY 0xF000  /* mask for size-related units 
*/

#define GUC_UNIT_MS0x1  /* value is in 
milliseconds */
#define GUC_UNIT_S 0x2  /* value is in seconds 
*/
#define GUC_UNIT_MIN   0x3  /* value is in minutes */
#define GUC_UNIT_TIME  0xF  /* mask for time-related units 
*/

0x3000 and 0x3 seemed wrong, since they're a combination of other flags
rather than being an independent power of two.

But actually, these aren't flags:  they're tested in a "case" statement for
equality, not in a bitwise & test.

So the outlier is actually 0x8000, added at:
|commit 6e7baa322773ff8c79d4d8883c99fdeff5bfa679
|Author: Andres Freund 
|Date:   Tue Sep 12 12:13:12 2017 -0700
|
|Introduce BYTES unit for GUCs.

It looks like that originated here:

https://www.postgresql.org/message-id/CAOG9ApEu8bXVwBxkOO9J7ZpM76TASK_vFMEEiCEjwhMmSLiaqQ%40mail.gmail.com
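To make the equality-vs-bitmask point concrete, here is a small standalone
sketch (not PostgreSQL source; it just mimics how guc.c masks out the unit
bits and then compares for equality), using the renumbered value:

    #include <stdio.h>

    #define GUC_UNIT_KB      0x1000
    #define GUC_UNIT_BLOCKS  0x2000
    #define GUC_UNIT_XBLOCKS 0x3000
    #define GUC_UNIT_MB      0x4000
    #define GUC_UNIT_BYTE    0x5000     /* renumbered value */
    #define GUC_UNIT_MEMORY  0xF000     /* mask for size-related units */

    static const char *
    unit_name(int flags)
    {
        switch (flags & GUC_UNIT_MEMORY)    /* mask, then compare for equality */
        {
            case GUC_UNIT_KB:      return "kB";
            case GUC_UNIT_BLOCKS:  return "blocks";
            case GUC_UNIT_XBLOCKS: return "xlog blocks";
            case GUC_UNIT_MB:      return "MB";
            case GUC_UNIT_BYTE:    return "bytes";
            default:               return "none";
        }
    }

    int
    main(void)
    {
        /* 0x0001 stands in for some unrelated flag bit; prints "bytes" */
        printf("%s\n", unit_name(GUC_UNIT_BYTE | 0x0001));
        return 0;
    }

Since the unit is recovered by masking and then matched exactly, 0x5000 works
just as well as 0x8000, and no separate bit needs to be reserved per unit.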

commit 162e4838103e7957cccfe7868fc28397b55ca1d7
Author: Justin Pryzby 
Date:   Wed Jul 20 09:27:24 2022 -0500

Renumber confusing value for GUC_UNIT_BYTE

It had a power-of-two value, which looks right, and makes the other values
which aren't powers of two look wrong.  But these values are tested for
equality, not with a bitwise test.

See also:
6e7baa322773ff8c79d4d8883c99fdeff5bfa679

https://www.postgresql.org/message-id/CAOG9ApEu8bXVwBxkOO9J7ZpM76TASK_vFMEEiCEjwhMmSLiaqQ%40mail.gmail.com

diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 4d0920c42e2..be928fac881 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -219,11 +219,12 @@ typedef enum
 #define GUC_DISALLOW_IN_AUTO_FILE 0x0800   /* can't set in

 * PG_AUTOCONF_FILENAME */
 
+/* GUC_UNIT_* are not flags - they're tested for equality */
 #define GUC_UNIT_KB0x1000  /* value is in 
kilobytes */
 #define GUC_UNIT_BLOCKS0x2000  /* value is in blocks */
 #define GUC_UNIT_XBLOCKS   0x3000  /* value is in xlog blocks */
 #define GUC_UNIT_MB0x4000  /* value is in 
megabytes */
-#define GUC_UNIT_BYTE  0x8000  /* value is in bytes */
+#define GUC_UNIT_BYTE  0x5000  /* value is in bytes */
 #define GUC_UNIT_MEMORY0xF000  /* mask for 
size-related units */
 
 #define GUC_UNIT_MS   0x1  /* value is in 
milliseconds */




Remove useless arguments in ReadCheckpointRecord().

2022-07-20 Thread Fujii Masao

Hi,

I'd like to propose to remove "whichChkpt" and "report" arguments in ReadCheckpointRecord(). 
"report" is obviously useless because it's always true, i.e., there are two callers of the function and they 
always specify true as "report".

"whichChkpt" indicates where the specified checkpoint location came from, 
pg_control or backup_label. This information is used to log different messages as follows.

switch (whichChkpt)
{
case 1:
ereport(LOG,
(errmsg("invalid primary checkpoint 
link in control file")));
break;
default:
ereport(LOG,
(errmsg("invalid checkpoint link in 
backup_label file")));
break;
}
return NULL;
...
switch (whichChkpt)
{
case 1:
ereport(LOG,
(errmsg("invalid primary checkpoint 
record")));
break;
default:
ereport(LOG,
(errmsg("invalid checkpoint 
record")));
break;
}
return NULL;
...

But the callers of ReadCheckpointRecord() already output different log messages depending 
on where the invalid checkpoint record came from. So even if ReadCheckpointRecord() 
doesn't use "whichChkpt", i.e., use the same log message in both pg_control and 
backup_label cases, users can still identify where the invalid checkpoint record came 
from, by reading the log message.

Also, when whichChkpt = 1, "primary checkpoint" is used in the log message, which 
sounds confusing because, if I recall correctly, we removed the concept of primary 
and secondary checkpoints before.

 Therefore I think that it's better to remove "whichChkpt" argument in 
ReadCheckpointRecord().

Patch attached. Thought?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATIONFrom a33e557a18cf5c45bbcf47f1cf92e26998f1cd67 Mon Sep 17 00:00:00 2001
From: Fujii Masao 
Date: Wed, 20 Jul 2022 23:06:44 +0900
Subject: [PATCH] Remove useless arguments in ReadCheckpointRecord().

---
 src/backend/access/transam/xlogrecovery.c | 84 ---
 1 file changed, 15 insertions(+), 69 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c 
b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5e46..e383c2123a 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -422,8 +422,8 @@ static XLogPageReadResult 
WaitForWALToBecomeAvailable(XLogRecPtr RecPtr,

  XLogRecPtr replayLSN,

  bool nonblocking);
 static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
-static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher, 
XLogRecPtr RecPtr,
-   
int whichChkpt, bool report, TimeLineID replayTLI);
+static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher,
+   
XLogRecPtr RecPtr, TimeLineID replayTLI);
 static bool rescanLatestTimeLine(TimeLineID replayTLI, XLogRecPtr replayLSN);
 static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 XLogSource source, bool 
notfoundOk);
@@ -605,7 +605,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
*wasShutdown_ptr,
 * When a backup_label file is present, we want to roll forward 
from
 * the checkpoint it identifies, rather than using pg_control.
 */
-   record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, 0, 
true,
+   record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
  
CheckPointTLI);
if (record != NULL)
{
@@ -744,7 +744,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
*wasShutdown_ptr,
CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
RedoStartLSN = ControlFile->checkPointCopy.redo;
RedoStartTLI = ControlFile->checkPointCopy.ThisTimeLineID;
-   record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, 1, 
true,
+   record = 

Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2022-07-20 Thread Önder Kalacı
Hi,


> > - Add a new class of invalidation callbacks: Today, if we do ALTER TABLE
> or CREATE INDEX on a table, the CacheRegisterRelcacheCallback helps us to
> re-create the cache entries. In this case, as far as I can see, we need a
> callback that is called when table "ANALYZE"d, because that is when the
> statistics change. That is the time picking a new index makes sense.
> > However, that seems like adding another dimension to this patch, which I
> can try but also see that committing becomes even harder.
> >
>
> This idea sounds worth investigating. I see that this will require
> more work but OTOH, we can't allow the existing system to regress
> especially because depending on workload it might regress badly.


Just to note, in case that is not clear: this patch avoids (or at least aims to
avoid, assuming no bugs) changing the behavior of existing systems with a
PRIMARY KEY or UNIQUE index. In that case, we still use the relevant
indexes.


> We
> can create a patch for this atop the base patch for easier review/test
> but I feel we need some way to address this point.
>
>
Another idea could be to re-calculate the index, say after *N*
updates/deletes on the table. We may consider using a subscription parameter
for getting N, with a good default, or even hard-coding it. I think the cost
of re-calculating should really be pretty small compared to the other things
happening during logical replication. So, a sane default might work?

If you think the above doesn't work, I can try to work on a separate patch
which adds something like "analyze invalidation callback".



> >
> > - Ask users to manually pick the index they want to use: Currently, the
> main complexity of the patch comes with the planner related code. In fact,
> if you look into the logical replication related changes, those are
> relatively modest changes. If we can drop the feature that Postgres picks
> the index, and provide a user interface to set the indexes per table in the
> subscription, we can probably have an easier patch to review & test. For
> example, we could add `ALTER SUBSCRIPTION sub ALTER TABLE t USE INDEX i`
> type of an API. This also needs some coding, but probably much simpler than
> the current code. And, obviously, this pops up the question of can users
> pick the right index?
> >
>
> I think picking the right index is one point and another is what if
> the subscription has many tables (say 10K or more), doing it for
> individual tables per subscription won't be fun. Also, users need to
> identify which tables belong to a particular subscription, now, users
> can find the same via pg_subscription_rel or some other way but doing
> this won't be straightforward for users. So, my inclination would be
> to pick the right index automatically rather than getting the input
> from the user.
>

Yes, all makes sense.


>
> Now, your point related to planner code in the patch bothers me as
> well but I haven't studied the patch in detail to provide any
> alternatives at this stage. Do you have any other ideas to make it
> simpler or solve this problem in some other way?
>
>
One idea I tried earlier was to go over the existing indexes on the table,
get the IndexInfo via BuildIndexInfo(), and then try to find a good heuristic
to pick an index. In the end, I felt like that was doing a sub-optimal job,
requiring a similar amount of code to the current patch, and still using
similar infrastructure.

My conclusion for that route was that I should either use a very simple
heuristic (like picking the index with the most columns) and accept a
suboptimal index pick, OR use a complex heuristic with a reasonable index
pick. And the latter approach converged to the planner code in the patch. Do
you think the former approach is acceptable?
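To make the "former approach" concrete, a minimal sketch of such a heuristic
could look like this (a hypothetical helper, not taken from the posted patch;
error handling and validity checks omitted):

    #include "postgres.h"

    #include "access/genam.h"
    #include "catalog/index.h"
    #include "nodes/pg_list.h"
    #include "storage/lockdefs.h"
    #include "utils/rel.h"
    #include "utils/relcache.h"

    /* Pick the index of "rel" with the most key columns, or InvalidOid. */
    static Oid
    pick_widest_index(Relation rel)
    {
        List       *indexlist = RelationGetIndexList(rel);
        ListCell   *lc;
        Oid         best = InvalidOid;
        int         best_natts = 0;

        foreach(lc, indexlist)
        {
            Oid         indexoid = lfirst_oid(lc);
            Relation    indexRel = index_open(indexoid, AccessShareLock);
            IndexInfo  *ii = BuildIndexInfo(indexRel);

            if (ii->ii_NumIndexKeyAttrs > best_natts)
            {
                best_natts = ii->ii_NumIndexKeyAttrs;
                best = indexoid;
            }
            index_close(indexRel, AccessShareLock);
        }
        list_free(indexlist);
        return best;
    }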

Thanks,
Onder


Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Tom Lane
Alvaro Herrera  writes:
> I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't
> expose functions directly, but through plpgsql_plugin_ptr.  Maybe that
> one does need to be made PGDLLEXPORT, since currently it isn't.

Hm.  Not sure if the rules are the same for global variables as
they are for functions, but if so, yeah ...

regards, tom lane




Re: Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Tom Lane
Andres Freund  writes:
> On July 20, 2022 3:54:03 PM GMT+02:00, Robert Haas  
> wrote:
>> The name PGDLLEXPORT is actually slightly misleading, now, because
>> there's no longer anything about it that is specific to DLLs.

> How so? Right now it's solely used for marking symbols in DLLs as exported?

I suspect Robert is reading "DLL" as meaning only a Windows thing.
You're right, if you read it as a generic term for loadable libraries,
it's more or less applicable everywhere.

regards, tom lane




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2022-07-20 Thread Aleksander Alekseev
Hi Pavel,

> Rebased. PFA v13.
> Your reviews are very much welcome!

I noticed that this patch is in "Needs Review" state and it has been
stuck for some time now, so I decided to take a look.

```
+SELECT bt_index_parent_check('bttest_a_idx', true, true, true);
+SELECT bt_index_parent_check('bttest_b_idx', true, false, true);
```

1. This "true, false, true" sequence is difficult to read. I suggest
we use named arguments here.

2. I believe there are some minor issues with the comments. E.g. instead of:

- First key on next page is same
- Make values 768 and 769 looks equal

I would write:

- The first key on the next page is the same
- Make values 768 and 769 look equal

There are many little errors like these.

```
+# Copyright (c) 2021, PostgreSQL Global Development Group
```

3. Oh no. The copyright has expired!

```
+  true.  When checkunique
+  is true bt_index_check will
```

4. This piece of documentation was copy-pasted between two functions
without changing the function name.

Other than that, to me the patch looks to be in pretty good shape. Here is
v14 where I fixed my own nitpicks, with Pavel's permission given offlist.

If no one sees any other defects I'm going to change the status of the
patch to "Ready for Committer" shortly.

--
Best regards,
Aleksander Alekseev


v14-0001-Add-option-for-amcheck-and-pg_amcheck-to-check-u.patch
Description: Binary data


Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Andres Freund
Hi,

On July 20, 2022 3:54:03 PM GMT+02:00, Robert Haas  
wrote:
>On Wed, Jul 20, 2022 at 9:39 AM Tom Lane  wrote:
>> ISTM that a comment pointing out that the functions marked PGDLLEXPORT
>> are meant to be externally accessible should be sufficient.
>
>The name PGDLLEXPORT is actually slightly misleading, now, because
>there's no longer anything about it that is specific to DLLs.

How so? Right now it's solely used for marking symbols in DLLs as exported?

Andres 
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-20 Thread Amit Langote
On Wed, Jul 20, 2022 at 12:37 AM Andres Freund  wrote:
> On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
> > About that, I was wondering if the blocks in llvm_compile_expr() need
> > to be hand-coded to match what's added in ExecInterpExpr() or if I've
> > missed some tool that can be used instead?
>
> The easiest way is to just call an external function for the implementation of
> the step. But yes, otherwise you need to handcraft it.

Ok, thanks.

So I started updating llvm_compile_expr() to handle the new
ExprEvalSteps that the patch adds to ExecInterpExpr(), but quickly
realized that the code could have been consolidated into less code, or
IOW, into fewer new ExprEvalSteps.  So I refactored things that way
and am now re-adding the code to llvm_compile_expr() based on the
new, better consolidated code.

Here's the updated version, without the llvm pieces, in case you'd
like to look at it even in this state.  I'll post a version with llvm
pieces filled in tomorrow.   (I have merged the different patches into
one for convenience.)

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v3-0001-Some-improvements-to-JsonExpr-evaluation.patch
Description: Binary data


Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jul 20, 2022 at 9:39 AM Tom Lane  wrote:
>> ISTM that a comment pointing out that the functions marked PGDLLEXPORT
>> are meant to be externally accessible should be sufficient.

> The name PGDLLEXPORT is actually slightly misleading, now, because
> there's no longer anything about it that is specific to DLLs.

True, but the mess of changing it seems to outweigh any likely clarity
gain.  As long as there's adequate commentary about what it means,
I'm okay with the existing naming.

regards, tom lane




Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Alvaro Herrera
On 2022-Jul-20, Tom Lane wrote:

> I'll try to do some research later today to identify anything else
> we need to mark in plpgsql.  I recall doing some work specifically
> creating functions for pldebugger's use, but I'll need to dig.

I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't
expose functions directly, but through plpgsql_plugin_ptr.  Maybe that
one does need to be made PGDLLEXPORT, since currently it isn't.

That was also reported by Pavel.  He was concerned about plpgsql_check,
though, not pldebugger.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree.  (Don Knuth)




Re: Memory leak fix in psql

2022-07-20 Thread Tom Lane
Japin Li  writes:
> Thanks for updating the patch.  It looks good.  However, it cannot be
> applied on 14 stable.  The attached patches are for 10-14.

While I think this is good cleanup, I'm doubtful that any of these
leaks are probable enough to be worth back-patching into stable
branches.  The risk of breaking something should not be neglected.

regards, tom lane




Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Robert Haas
On Wed, Jul 20, 2022 at 9:39 AM Tom Lane  wrote:
> ISTM that a comment pointing out that the functions marked PGDLLEXPORT
> are meant to be externally accessible should be sufficient.

The name PGDLLEXPORT is actually slightly misleading, now, because
there's no longer anything about it that is specific to DLLs.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Remove fls(), use pg_bitutils.h facilities instead?

2022-07-20 Thread Tom Lane
Thomas Munro  writes:
> That double eval macro wasn't nice.  This time with a static inline function.

Seems like passing a size_t to pg_leftmost_one_pos32 isn't great.
It was just as wrong before (if the caller-supplied argument is
indeed a size_t), but no time like the present to fix it.

We could have pg_bitutils.h #define pg_leftmost_one_pos_size_t
as the appropriate one of pg_leftmost_one_pos32/64, perhaps.
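As a rough sketch of that idea (not existing pg_bitutils.h code;
SIZEOF_SIZE_T comes from pg_config.h):

    #if SIZEOF_SIZE_T == 8
    #define pg_leftmost_one_pos_size_t(x) pg_leftmost_one_pos64(x)
    #else
    #define pg_leftmost_one_pos_size_t(x) pg_leftmost_one_pos32(x)
    #endif

Callers that currently pass a size_t, such as the dsa.c bin computation,
could then use the size_t variant without caring about the platform's word
size.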

regards, tom lane




Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Tom Lane
Alvaro Herrera  writes:
> No immediate plans for splitting plpgsql.h, so if anyone wants to take a
> stab at that, be my guest.

ISTM that a comment pointing out that the functions marked PGDLLEXPORT
are meant to be externally accessible should be sufficient.

I'll try to do some research later today to identify anything else
we need to mark in plpgsql.  I recall doing some work specifically
creating functions for pldebugger's use, but I'll need to dig.

regards, tom lane




Re: GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.

2022-07-20 Thread Alvaro Herrera
Hello

On 2022-Jul-20, Amul Sul wrote:

> If you look at GetFlushRecPtr() function the OUT parameter for
> TimeLineID is optional and this is not only one, see
> GetWalRcvFlushRecPtr(), GetXLogReplayRecPtr(), etc.
> 
> I think we have missed that for GetStandbyFlushRecPtr(), to be
> inlined, we should change this as follow:

This is something we decide mostly on a case-by-case basis.  There's no
fixed rule that all out params have to be optional.

If anything is improved by this change, let's see what it is.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Remove fls(), use pg_bitutils.h facilities instead?

2022-07-20 Thread Tom Lane
Thomas Munro  writes:
> On Wed, Jul 20, 2022 at 4:52 PM Tom Lane  wrote:
>> I think we could probably just drop fls() entirely.  It doesn't look
>> to me like any of the existing callers expect a zero argument, so they
>> could be converted to use pg_leftmost_one_pos32() pretty trivially.

> That was not true for the case in contiguous_pages_to_segment_bin(),
> in dsa.c.  If it's just one place like that (and, hrrm, curiously
> there is an open issue about binning quality on my to do list...),

How is it sane to ask for a segment bin for zero pages?  Seems like
something should have short-circuited such a case well before here.

regards, tom lane




Re: PATCH: Add Table Access Method option to pgbench

2022-07-20 Thread Alexander Korotkov
Hi!

On Tue, Jul 19, 2022 at 4:47 AM Michael Paquier  wrote:
> On Mon, Jul 18, 2022 at 01:53:21PM +0300, Alexander Korotkov wrote:
> > Looks good to me as well.  I'm going to push this if no objections.
>
> FWIW, I find the extra mention of PGOPTIONS with the specific point of
> table AMs added within the part of the environment variables a bit
> confusing, because we already mention PGOPTIONS for serializable
> transactions a bit down.  Hence, my choice would be the addition of an
> extra paragraph in the "Notes", named "Table Access Methods", just
> before or after "Good Practices".  My 2c.

Thank you.  Pushed applying the suggestion above.

--
Regards,
Alexander Korotkov




Re: Allowing REINDEX to have an optional name

2022-07-20 Thread Justin Pryzby
On Tue, Jul 19, 2022 at 01:13:34PM +0900, Michael Paquier wrote:
> > It looks like you named the table "toast_relfilenodes", but then also store
> > data for non-toast tables in it.
> 
> How about naming that index_relfilenodes?  One difference between what I
> posted previously and 5fb5b6 is the addition of an extra regclass that
> stores the parent table, for reference in the output.

Looks fine

> - 'CREATE TABLE toast_relfilenodes (parent regclass, indname regclass, 
> relfilenode oid);'
> + 'CREATE TABLE index_relfilenodes (parent regclass, indname regclass, 
> relfilenode oid);'

-- 
Justin




Re: Making pg_rewind faster

2022-07-20 Thread Justin Kwan
Hi Michael,

Thank you for your feedback. I've incorporated your suggestions by scanning the 
logs produced by pg_rewind to assert that certain WAL segment files were 
skipped rather than copied over to the target server.

I've also updated the pg_rewind patch file to target the Postgres master branch 
(version 16 to be). Please see attached.

Thanks,
Justin




From: Michael Paquier
Sent: Tuesday, July 19, 2022 1:36 AM
To: Justin Kwan
Cc: Tom Lane; pgsql-hackers; vignesh; jk...@cloudflare.com; vignesh 
ravichandran; hlinn...@iki.fi
Subject: Re: Making pg_rewind faster

On Mon, Jul 18, 2022 at 05:14:00PM +, Justin Kwan wrote:
> Thank you for taking a look at this and that sounds good. I will
> send over a patch compatible with Postgres v16.

+$node_2->psql(
+   'postgres',
+   "SELECT extract(epoch from modification) FROM 
pg_stat_file('pg_wal/00010003');",
+   stdout => \my $last_common_tli1_wal_last_modified_at);
Please note that you should not rely on the FS-level stats for
anything that touches the WAL segments.  A rough guess about what you
could do here to make sure that only the set of WAL segments you are
looking for is being copied over would be to either:
- Scan the logs produced by pg_rewind and see if the segments are
copied or not, depending on the divergence point (aka the last
checkpoint before WAL forked).
- Clean up pg_wal/ in the target node before running pg_rewind,
checking that only the segments you want are available once the
operation completes.
--
Michael


v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patch
Description:  v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patch


Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

2022-07-20 Thread Bharath Rupireddy
On Wed, Jul 20, 2022 at 12:55 PM Kyotaro Horiguchi
 wrote:
>
> At Wed, 20 Jul 2022 10:02:22 +0530, Bharath Rupireddy 
>  wrote in
> > Done. PSA v6 patch set.
>
> Thanks!
>
> -Specifies the minimum size of past log file segments kept in the
> +Specifies the minimum size of past WAL files kept in the
>
> -log file by reducing the value of this parameter. On a system with
> +file by reducing the value of this parameter. On a system with
>
> Looks fine. And postgresq.conf.sample has the following lines:
>
> #archive_library = ''   # library to use to archive a logfile segment
>
> #archive_command = ''   # command to use to archive a logfile segment
>
> #archive_timeout = 0# force a logfile segment switch after this
>
> #restore_command = ''   # command to use to restore an archived 
> logfile segment
>
> Don't they need the same fix?

Indeed. Thanks. Now, they are in sync with their peers in .conf.sample
file as well as description in guc.c.

PSA v7 patch set.

Regards,
Bharath Rupireddy.


v7-0001-Use-WAL-segment-instead-of-log-segment.patch
Description: Binary data


v7-0002-Consistently-use-WAL-file-s-in-docs.patch
Description: Binary data


Re: Windows default locale vs initdb

2022-07-20 Thread Thomas Munro
On Wed, Jul 20, 2022 at 10:27 PM Juan José Santamaría Flecha
 wrote:
> On Tue, Jul 19, 2022 at 4:47 AM Thomas Munro  wrote:
>> As for whether "accordingly" still applies, by the logic of of
>> win32_langinfo()...  Windows still considers WIN1252 to be the default
>> ANSI code page for "en-US", though it'd work with UTF-8 too.  I'm not
>> sure what to make of that.  The goal here was to give Windows users
>> good defaults, but WIN1252 is probably not what most people actually
>> want.  Hmph.
>
>
> Still, WIN1252 is not the wrong answer for what we are asking. Even if you 
> enable UTF-8 support [1], the system will use the current default Windows 
> ANSI code page (ACP) for the locale and UTF-8 for the code page.

I'm still confused about what that means.  Suppose we decided to
insist by adding a ".UTF-8" suffix to the name, as that page says we
can now that we're on Windows 10+, when building the default locale
name (see experimental 0002 patch, attached).  It initially seemed to
have the right effect:

The database cluster will be initialized with locale "en-US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

But then the Turkish i test in contrib/citext/sql/citext_utf8.sql failed[1]:

SELECT 'i'::citext = 'İ'::citext AS t;
 t
 ---
- t
+ f
 (1 row)

About the pg_upgrade problem, maybe it's OK ... existing old format
names should continue to work, but we can still remove the weird code
that does locale name tweaking, right?  pg_upgraded databases should
contain fixed names (ie that were fixed by old initdb so should
continue to work), and new clusters will get BCP 47 names.

I don't really know, I was just playing with rough ideas by sending
patches to CI here...

[1] https://cirrus-ci.com/task/6423238052937728
From b007eb45e575956d5035f4152f72177abddc2762 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 19 Jul 2022 06:31:17 +1200
Subject: [PATCH v3 1/3] Default to BCP 47 locale in initdb on Windows.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Avoid selecting traditional Windows locale names written with English
words, because they are unstable and not recommended for use in
databases.  Since setlocale() returns such names, on Windows use
GetUserDefaultLocaleName() if the user didn't provide an explicit
locale.

Also update the documentation to recommend BCP 47 over the traditional
names when providing explicit values to initdb.

Reviewed-by: Juan José Santamaría Flecha 
Discussion: https://postgr.es/m/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com
---
 doc/src/sgml/charset.sgml | 10 --
 src/bin/initdb/initdb.c   | 31 +--
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 445fd175d8..b656ca489f 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -83,8 +83,14 @@ initdb --locale=sv_SE
 system under what names depends on what was provided by the operating
 system vendor and what was installed.  On most Unix systems, the command
 locale -a will provide a list of available locales.
-Windows uses more verbose locale names, such as German_Germany
-or Swedish_Sweden.1252, but the principles are the same.
+   
+
+   
+Windows uses BCP 47 language tags, like ICU.
+For example, sv-SE represents Swedish as spoken in Sweden.
+Windows also supports more verbose locale names based on English words,
+such as German_Germany or Swedish_Sweden.1252,
+but these are not recommended.

 

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 89b888eaa5..3af08b7b99 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -59,6 +59,10 @@
 #include "sys/mman.h"
 #endif
 
+#ifdef WIN32
+#include 
+#endif
+
 #include "access/xlog_internal.h"
 #include "catalog/pg_authid_d.h"
 #include "catalog/pg_class_d.h" /* pgrminclude ignore */
@@ -2007,6 +2011,7 @@ locale_date_order(const char *locale)
 static void
 check_locale_name(int category, const char *locale, char **canonname)
 {
+	char	   *locale_copy;
 	char	   *save;
 	char	   *res;
 
@@ -2022,10 +2027,30 @@ check_locale_name(int category, const char *locale, char **canonname)
 
 	/* for setlocale() call */
 	if (!locale)
-		locale = "";
+	{
+#ifdef WIN32
+		wchar_t		wide_name[LOCALE_NAME_MAX_LENGTH];
+		char		name[LOCALE_NAME_MAX_LENGTH];
+
+		/* use Windows API to find the default in BCP47 format */
+		if (GetUserDefaultLocaleName(wide_name, LOCALE_NAME_MAX_LENGTH) == 0)
+			pg_fatal("failed to get default locale name: error code %lu",
+	 GetLastError());
+		if (WideCharToMultiByte(CP_ACP, 0, wide_name, -1, name,
+LOCALE_NAME_MAX_LENGTH, NULL, NULL) == 0)
+			pg_fatal("failed to convert locale name: error code %lu",
+	 GetLastError());
+		locale_copy = pg_strdup(name);
+#else
+		/* use 

Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?

2022-07-20 Thread Bharath Rupireddy
Hi,

After the commit [1], is it correct to say errmsg("invalid data in file
\"%s\"", BACKUP_LABEL_FILE))); in do_pg_backup_stop() when we hold the
contents in backend global memory, not actually reading from backup_label
file? However, it is correct to say that in read_backup_label.

IMO, we can either say "invalid backup_label contents found" or we can be
more descriptive and say "invalid "START WAL LOCATION" line found in
backup_label content" and "invalid "BACKUP FROM" line found in
backup_label content" and so on.

Thoughts?

[1]
commit 39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a
Author: Stephen Frost 
Date:   Wed Apr 6 14:41:03 2022 -0400

Remove exclusive backup mode

errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));

Regards,
Bharath Rupireddy.


Re: GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.

2022-07-20 Thread Aleksander Alekseev
Hi Amul,

> -   *tli = replayTLI;
> +   if (tli)
> +   *tli = replayTLI;

I would guess the difference here is that GetStandbyFlushRecPtr is
static. It is used 3 times in walsender.c and in all cases the
argument can't be NULL.

So I'm not certain what we will gain from the proposed check.

-- 
Best regards,
Aleksander Alekseev




GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.

2022-07-20 Thread Amul Sul
Hi,

If you look at GetFlushRecPtr() function the OUT parameter for
TimeLineID is optional and this is not only one, see
GetWalRcvFlushRecPtr(), GetXLogReplayRecPtr(), etc.

I think we have missed that for GetStandbyFlushRecPtr(), to be
inlined, we should change this as follow:

--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -3156,7 +3156,8 @@ GetStandbyFlushRecPtr(TimeLineID *tli)
receivePtr = GetWalRcvFlushRecPtr(NULL, &receiveTLI);
replayPtr = GetXLogReplayRecPtr(&replayTLI);

-   *tli = replayTLI;
+   if (tli)
+   *tli = replayTLI;

Thoughts?
--
Regards,
Amul Sul
EDB: http://www.enterprisedb.com




Re: Collect ObjectAddress for ATTACH DETACH PARTITION to use in event trigger

2022-07-20 Thread Amit Kapila
On Fri, Jul 15, 2022 at 11:39 AM houzj.f...@fujitsu.com
 wrote:
>
> On Friday, July 15, 2022 11:41 AM Michael Paquier  wrote:
>
> Hi,
> >
> > On Fri, Jul 15, 2022 at 03:21:30AM +, kuroda.hay...@fujitsu.com wrote:
> > > Sounds good. I grepped ATExecXXX() functions called in ATExecCmd(),
> > > and I confirmed that all returned values have been collected except them.
> > >
> > > While checking test code test about EVENT TRIGGER, I found there were
> > > no tests related with partitions in that.
> > > How about adding them?
> >
> > Agreed.  It would be good to track down what changes once those
> > ObjectAddresses are collected.
>
> Thanks for having a look. It was a bit difficult to add a test for this,
> because we currently don't have a user-facing function which can return
> these collected ObjectAddresses for ALTER TABLE. And it seems we don't
> have tests for the already collected ObjectAddresses either :(
>
> The collected ObjectAddresses is in
> "currentEventTriggerState->currentCommand->d.alterTable.subcmds.address" while
> the public function pg_event_trigger_ddl_commands doesn't return these
> information. It can only be used in user defined event trigger function (C
> code).
>
> If we want to add some tests for both already existed and newly added
> ObjectAddresses, we might need to add some test codes in test_ddl_deparse.c.
> What do you think ?
>

Right. But, I noticed that get_altertable_subcmdtypes() doesn't handle
AT_AttachPartition or AT_DetachPartition. We can handle those and at
least have a test for those in test_ddl_deparse/sql/alter_table.sql. I
know it is not directly related to your patch but that way we will at
least have some tests for Attach/Detach partition and in the future
when we extend it to test ObjectAddresses of subcommands that would be
handy. Feel free to write a separate patch for the same.

-- 
With Regards,
Amit Kapila.




Re: Make name optional in CREATE STATISTICS

2022-07-20 Thread Matthias van de Meent
On Wed, 13 Jul 2022 at 08:07, Simon Riggs  wrote:
>
> On Thu, 7 Jul 2022 at 11:58, Matthias van de Meent
>  wrote:
> > A more correct version would be
> >
> > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ]
> > [(stat types)]
>
> There you go

Thanks!

I think this is ready for a committer, so I've marked it as such.

Kind regards,

Matthias van de Meent




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-20 Thread Alvaro Herrera
v26 here.  I spent some time fighting the readdir() stuff for
Windows (so that get_dirent_type returns LNK for junction points)
but couldn't make it work and was unable to figure out why.
So I ended up doing what do_pg_backup_start is already doing:
an #ifdef to call pgwin32_is_junction instead.  I remove the
newly added path_is_symlink function, because I realized that
it would mean an extra syscall everywhere other than Windows.

So if somebody wants to fix get_dirent_type() so that it works properly
on Windows, we can change all these places together.

I also change the use of allow_invalid_pages to
allow_in_place_tablespaces.  We could add a
separate GUC for this, but it seems overengineering.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."   (Nathaniel Smith)
>From 26a0be53592a20aa09501e9015f77a4b3c3c3302 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 13 Jul 2022 18:14:18 +0200
Subject: [PATCH v26] Fix replay of create database records on standby

Crash recovery on standby may encounter missing directories when
replaying create database WAL records.  Prior to this patch, the
standby would fail to recover in such a case.  However, the
directories could be legitimately missing.  Consider a sequence of WAL
records as follows:

CREATE DATABASE
DROP DATABASE
DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch allows missing tablespaces to be created during recovery
before reaching consistency.  The tablespaces are created as real
directories that should no longer exist by the time consistency is
reached; CheckRecoveryConsistency is responsible for making sure they
have disappeared.

The problems detected by this new code are reported as PANIC, except
when allow_in_place_tablespaces is set to ON, in which case they are
WARNING.  Apart from making tests possible, this gives users an escape
hatch in case things don't go as planned.

Diagnosed-by: Paul Guo 
Author: Paul Guo 
Author: Kyotaro Horiguchi 
Author: Asim R Praveen 
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27dn5xwj2p9-roh16e4...@mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c   |  54 +++
 src/backend/commands/dbcommands.c   |  77 ++
 src/backend/commands/tablespace.c   |  28 +---
 src/test/recovery/t/033_replay_tsp_drops.pl | 155 
 4 files changed, 287 insertions(+), 27 deletions(-)
 create mode 100644 src/test/recovery/t/033_replay_tsp_drops.pl

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5e46..850ab6d7e6 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -42,6 +42,7 @@
 #include "access/xlogutils.h"
 #include "catalog/pg_control.h"
 #include "commands/tablespace.h"
+#include "common/file_utils.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/bgwriter.h"
@@ -2008,6 +2009,51 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI)
 	}
 }
 
+/*
+ * Verify that, in non-test mode, ./pg_tblspc doesn't contain any real
+ * directories.
+ *
+ * Replay of database creation XLOG records for databases that were later
+ * dropped can create fake directories in pg_tblspc.  By the time consistency
+ * is reached these directories should have been removed; here we verify
+ * that this did indeed happen.  This is to be called at the point where
+ * consistent state is reached.
+ *
+ * allow_in_place_tablespaces turns the PANIC into a WARNING, which is
+ * useful for testing purposes, and also allows for an escape hatch in case
+ * things go south.
+ */
+static void
+CheckTablespaceDirectory(void)
+{
+	DIR		   *dir;
+	struct dirent *de;
+
+	dir = AllocateDir("pg_tblspc");
+	while ((de = ReadDir(dir, "pg_tblspc")) != NULL)
+	{
+		char		path[MAXPGPATH + 10];
+
+		/* Skip entries of non-oid names */
+		if (strspn(de->d_name, "0123456789") != strlen(de->d_name))
+			continue;
+
+		snprintf(path, sizeof(path), "pg_tblspc/%s", de->d_name);
+
+#ifdef WIN32
+		if (!pgwin32_is_junction(path))
+#else
+		if (get_dirent_type(path, de, false, ERROR) != PGFILETYPE_LNK)
+#endif
+			ereport(allow_in_place_tablespaces ? WARNING : PANIC,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("unexpected directory entry \"%s\" found in %s",
+			de->d_name, "pg_tblspc/"),
+	 errdetail("All directory entries in pg_tblspc/ should be symbolic links."),
+	 errhint("Remove those directories, or set allow_in_place_tablespaces to ON transiently to let recovery complete.")));
+	}
+}
+
 /*
  * Checks if recovery has reached a consistent state. When consistency is
  * reached and we have 

Re: Windows default locale vs initdb

2022-07-20 Thread Juan José Santamaría Flecha
On Tue, Jul 19, 2022 at 4:47 AM Thomas Munro  wrote:

> As for whether "accordingly" still applies, by the logic of
> win32_langinfo()...  Windows still considers WIN1252 to be the default
> ANSI code page for "en-US", though it'd work with UTF-8 too.  I'm not
> sure what to make of that.  The goal here was to give Windows users
> good defaults, but WIN1252 is probably not what most people actually
> want.  Hmph.
>

Still, WIN1252 is not the wrong answer for what we are asking. Even if you
enable UTF-8 support [1], the system will use the current default Windows
ANSI code page (ACP) for the locale and UTF-8 for the code page.

[1]
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170

Regards,

Juan José Santamaría Flecha


Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?

2022-07-20 Thread Richard Guo
On Wed, Jul 20, 2022 at 11:03 AM David Rowley  wrote:

> I think we can relax this now that we have incremental sort.  I think
> a better way to limit this is to allow a prefix of the query_pathkeys
> providing that covers *all* of the join pathkeys.  That way, for the
> above query, it leaves it open for the planner to do the Merge Join by
> sorting by a.a DESC then just do an Incremental Sort to get the
> GroupAggregate input sorted by a.b.


So the idea is that if the ECs used by the mergeclauses are a prefix of the
query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why
not relax this further, so that if the ECs in the mergeclauses and the
query_pathkeys have a common prefix, we use that prefix as pathkeys? So
that we can have a plan like below:

# explain (costs off) select * from t1 join t2 on t1.c = t2.c and t1.a =
t2.a order by t1.a DESC, t1.b;
  QUERY PLAN
---
 Incremental Sort
   Sort Key: t1.a DESC, t1.b
   Presorted Key: t1.a
   ->  Merge Join
 Merge Cond: ((t1.a = t2.a) AND (t1.c = t2.c))
 ->  Sort
   Sort Key: t1.a DESC, t1.c
   ->  Seq Scan on t1
 ->  Sort
   Sort Key: t2.a DESC, t2.c
   ->  Seq Scan on t2
(11 rows)

Thanks
Richard


Re: Pluggable toaster

2022-07-20 Thread Nikita Malakhov
Hi hackers!

We really need your feedback on the last patchset update!

On a previous question about the TOAST API overhead - please check the script
attached: we tested INSERT, UPDATE and SELECT operations, and ran it on vanilla
master and on patched master (vanilla with the untouched TOAST implementation,
and patched with the default TOAST implemented via the TOAST API - in this
patch set, with patches up to 0005_bytea_appendable_toaster installed).
Some of the test scripts will be included in the patch set later, as an
additional patch.

Currently I'm working on an update to the default Toaster (some internal
optimizations, not affecting functionality) and on README files explaining
Pluggable TOAST.

An example of using custom Toaster:

Custom Toaster extension definition (developer):
CREATE FUNCTION custom_toaster_handler(internal)
RETURNS toaster_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

CREATE TOASTER custom_toaster HANDLER custom_toaster_handler;

User's POV:
CREATE EXTENSION custom_toaster;
select * from pg_toaster;
  oid  |    tsrname     |        tsrhandler
-------+----------------+--------------------------
  9864 | deftoaster     | default_toaster_handler
 32772 | custom_toaster | custom_toaster_handler


CREATE TABLE tst1 (
c1 text STORAGE plain,
c2 text STORAGE external TOASTER custom_toaster,
id int4
);
ALTER TABLE tst1 ALTER COLUMN c1 SET TOASTER custom_toaster;
=# \d+ tst1
 Column |  Type   | Collation | Nullable | Default | Storage  |    Toaster     |...
--------+---------+-----------+----------+---------+----------+----------------+...
 c1     | text    |           |          |         | plain    | deftoaster     |...
 c2     | text    |           |          |         | external | custom_toaster |...
 id     | integer |           |          |         | plain    |                |...
Access method: heap

Thanks!

Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/

On Wed, Jul 13, 2022 at 10:45 PM Nikita Malakhov  wrote:

> Hi hackers!
> According to previous requests the patch branch was cleaned up from
> garbage, logs, etc. All conflicts' resolutions were merged
> into patch commits where they appear, branch was rebased to present one
> commit for one patch. The branch was actualized,
> and a fresh patch set was generated.
> https://github.com/postgrespro/postgres/tree/toasterapi_clean
>
> What we propose in short:
> We suggest a way to make TOAST pluggable as Storage (in a way like
> Pluggable Access Methods) - detached TOAST
> mechanics from Heap AM, and made it an independent pluggable and
> extensible part with our freshly developed TOAST API.
> With this patch set you will be able to develop and plug in your own TOAST 
> mechanics
> for table columns. Knowing internals
> and/or workflow and workload of data being TOASTed makes Custom Toasters much
> more efficient in performance and storage.
> We keep backwards compatibility, and the default TOAST mechanics work as they
> did previously, silently handling any toastable datatype
> (and TOASTed values and tables from previous versions - no changes there),
> and remain the default Toaster unless stated otherwise,
> but now go through our TOAST API.
>
> We've already presented our work at the HighLoad, PgCon and PgConf
> conferences; you can find the materials here:
> http://www.sai.msu.su/~megera/postgres/talks/
> The testing scripts used in the talks are a bit scarce and involve a lot of
> manual handling, so it is another bit of work to bundle them into the
> patch set; please be patient, I'll try to do that ASAP.
>
> We have ready to plug in extension Toasters
> - bytea appendable toaster for bytea datatype (impressive speedup with
> bytea append operation) is included in this patch set;
> - JSONB toaster for JSONB (very cool performance improvements when
> dealing with TOASTed JSONB) will be provided later;
> - Prototype Toasters (in development) for PostGIS (much faster than
> default with geometric data), large binary objects
> (like pg_largeobject, but much, much larger, and without existing large
> object limitations), and currently we're checking default
> Toaster implementation without using Indexes (direct access by TIDs, up
> to 3 times faster than default on smaller values,
> less storage due to absence of index tree).
>
> Patch set consists of 8 incremental patches:
> 0001_create_table_storage_v5.patch - SQL syntax fix for CREATE TABLE
> clause, processing SET STORAGE... correctly;
> This patch is already discussed in a separate thread;
>
> 0002_toaster_interface_v8.patch - TOAST API interface and SQL syntax
> allowing creation of custom Toaster (CREATE TOASTER ...)
> and setting Toaster to a table column (CREATE TABLE t (data bytea STORAGE
> EXTERNAL TOASTER bytea_toaster);)
>
> 0003_toaster_default_v7.patch - Default TOAST implemented via TOAST API;
>
> 0004_toaster_snapshot_v7.patch - refactoring of Default TOAST and support
> for versioned Toast rows;
>
> 0005_bytea_appendable_toaster_v7.patch - contrib module
> bytea_appendable_toaster - special Toaster 

Re: Handle infinite recursion in logical replication setup

2022-07-20 Thread vignesh C
On Wed, Jul 20, 2022 at 10:38 AM Peter Smith  wrote:
>
> On Tue, Jul 19, 2022 at 11:34 PM Amit Kapila  wrote:
> >
> > On Mon, Jul 18, 2022 at 9:46 PM vignesh C  wrote:
> > >
> > > I have updated the patch to handle the origin value case
> > > insensitively. The attached patch has the changes for the same.
> > >
> >
> > Thanks, the patch looks mostly good to me. I have made a few changes
> > in 0001 patch which are as follows: (a) make a comparison of origin
> > names in maybe_reread_subscription similar to slot names as in future
> > we may support origin names other than 'any' and 'none', (b) made
> > comment changes at few places and minor change in one of the error
> > message, (c) ran pgindent and slightly changed the commit message.
> >
> > I am planning to push this day after tomorrow unless there are any
> > comments/suggestions.
>
> FYI, the function name in the comment is not same as the function name here:
>
> +/*
> + * IsReservedName
> + * True iff name is either "none" or "any".
> + */
> +static bool
> +IsReservedOriginName(const char *name)

Modified. Apart from this, I have run pgperltidy on the perl file and
renamed 032_origin.pl to 030_origin.pl, as currently there are
029_on_error.pl and 031_column_list.pl but no 030_*.pl file.
Thanks for the comment; the attached patch has the changes for the same.

Regards,
Vignesh
From 69f9b55aa28a0325e0d85c8f8bed407c7218e813 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Sat, 9 Jul 2022 13:20:34 +0530
Subject: [PATCH v36] Allow users to skip logical replication of data having
 origin.

This patch adds a new SUBSCRIPTION parameter "origin". It specifies
whether the subscription will request the publisher to only send changes
that don't have an origin or send changes regardless of origin. Setting it
to "none" means that the subscription will request the publisher to only
send changes that have no origin associated. Setting it to "any" means
that the publisher sends changes regardless of their origin. The default
is "any".
Usage:
CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port='
PUBLICATION pub1 WITH (origin = none);

This can be used to avoid loops (infinite replication of the same data)
among replication nodes.

This feature allows filtering only the replication data originating from
WAL; for the initial sync (initial copy of table data) we don't have such a
facility, as we can only distinguish the data based on origin from WAL. As
a follow-up patch, we are planning to forbid the initial sync if we notice
that the publication tables were also replicated from other publishers, to
avoid duplicate data or loops.

Author: Vignesh C, Amit Kapila
Reviewed-By: Peter Smith, Amit Kapila, Dilip Kumar, Ashutosh Bapat, Hayato Kuroda, Shi yu, Alvaro Herrera
Discussion: https://postgr.es/m/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubh...@mail.gmail.com
---
 contrib/test_decoding/expected/replorigin.out |  10 ++
 contrib/test_decoding/sql/replorigin.sql  |   5 +
 doc/src/sgml/catalogs.sgml|  14 ++
 doc/src/sgml/ref/alter_subscription.sgml  |   5 +-
 doc/src/sgml/ref/create_subscription.sgml |  15 ++
 src/backend/catalog/pg_subscription.c |   8 +
 src/backend/catalog/system_views.sql  |   4 +-
 src/backend/commands/subscriptioncmds.c   |  43 -
 .../libpqwalreceiver/libpqwalreceiver.c   |   5 +
 src/backend/replication/logical/origin.c  |  22 ++-
 src/backend/replication/logical/worker.c  |   2 +
 src/backend/replication/pgoutput/pgoutput.c   |  27 ++-
 src/bin/pg_dump/pg_dump.c |  15 +-
 src/bin/pg_dump/pg_dump.h |   1 +
 src/bin/pg_dump/t/002_pg_dump.pl  |  22 +++
 src/bin/psql/describe.c   |   7 +-
 src/bin/psql/tab-complete.c   |   7 +-
 src/include/catalog/pg_subscription.h |  17 ++
 src/include/replication/pgoutput.h|   1 +
 src/include/replication/walreceiver.h |   2 +
 src/test/regress/expected/subscription.out| 142 +---
 src/test/regress/sql/subscription.sql |  10 ++
 src/test/subscription/t/030_origin.pl | 155 ++
 23 files changed, 462 insertions(+), 77 deletions(-)
 create mode 100644 src/test/subscription/t/030_origin.pl

diff --git a/contrib/test_decoding/expected/replorigin.out b/contrib/test_decoding/expected/replorigin.out
index 2e9ef7c823..49ffaeea2d 100644
--- a/contrib/test_decoding/expected/replorigin.out
+++ b/contrib/test_decoding/expected/replorigin.out
@@ -56,6 +56,16 @@ SELECT pg_replication_origin_drop('regress_test_decoding: temp');
 
 SELECT pg_replication_origin_drop('regress_test_decoding: temp');
 ERROR:  replication origin "regress_test_decoding: temp" does not exist
+-- specifying reserved origin names is not supported
+SELECT pg_replication_origin_create('any');
+ERROR:  replication origin name "any" is reserved
+DETAIL:  Origin names "any", "none", and names starting with "pg_" 

Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-20 Thread Amit Kapila
On Thu, Jul 14, 2022 at 4:34 AM Peter Smith  wrote:
>
> On Wed, Jul 13, 2022 at 7:55 PM vignesh C  wrote:
> >
> > On Wed, Jul 13, 2022 at 1:13 PM Michael Paquier  wrote:
> > >
> > > On Wed, Jul 13, 2022 at 12:22:06PM +0530, vignesh C wrote:
> > > > Most of the code is common between GetSubscriptionRelations and
> > > > GetSubscriptionNotReadyRelations. Added a parameter to
> > > > GetSubscriptionRelations which could provide the same functionality as
> > > > the existing GetSubscriptionRelations and
> > > > GetSubscriptionNotReadyRelations. Attached patch has the changes for
> > > > the same. Thoughts?
> > >
> > > Right.  Using all_rels to mean that we'd filter relations that are not
> > > ready is a bit confusing, though.  Perhaps this could use a bitmask as
> > > argument.
> >
> > The attached v2 patch has the modified version which includes the
> > changes to make the argument as bitmask.
> >
>
> By using a bitmask I think there is an implication that the flags can
> be combined...
>
> Perhaps it is not a problem today, but later you may want more flags. e.g.
> #define SUBSCRIPTION_REL_STATE_READY 0x02 /* READY relations */
>
> then the bitmask idea falls apart because IIUC you have no intentions
> to permit things like:
> (SUBSCRIPTION_REL_STATE_NOT_READY | SUBSCRIPTION_REL_STATE_READY)
>

I think this will be an invalid combination if a caller ever used it.
However, one might need to use a combination like
(SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE). For such
cases, I feel the bitmask idea will be better.
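
For readers following along, the two existing functions roughly correspond to
catalog-level queries like the following (a sketch only: the subscription OID
is made up, and 'r' is the "ready" value of srsubstate):

-- GetSubscriptionRelations(): all relations of the subscription
SELECT srrelid, srsubstate FROM pg_subscription_rel WHERE srsubid = 16390;

-- GetSubscriptionNotReadyRelations(): only the not-yet-ready ones
SELECT srrelid, srsubstate FROM pg_subscription_rel
WHERE srsubid = 16390 AND srsubstate <> 'r';

A state mask on the common function would let a caller express a combination
such as READY | DONE above in a single call.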

-- 
With Regards,
Amit Kapila.




Re: pgsql: Default to hidden visibility for extension libraries where possi

2022-07-20 Thread Alvaro Herrera
Hello

On 2022-Jul-19, Andres Freund wrote:

> On 2022-07-19 16:28:07 +0200, Alvaro Herrera wrote:
> > Anyway, the minimal patch that makes plpgsql_check tests pass is
> > attached.
> 
> Do you want to commit that or should I?

Done.

No immediate plans for splitting plpgsql.h, so if anyone wants to take a
stab at that, be my guest.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Aprender sin pensar es inútil; pensar sin aprender, peligroso" (Confucio)




Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-20 Thread Amit Kapila
On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada  wrote:
>
> On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila  wrote:
> >
> > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada  
> > wrote:
> >
> > > Another idea would be to have functions, say
> > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter
> > > does actual work of handling transaction commits and both
> > > SnapBuildCommitTxn() and SnapBuildCommit() call
> > > SnapBuildCommitTxnWithXInfo() with different arguments.
> > >
> >
> > Do you want to say DecodeCommit() instead of SnapBuildCommit() in
> > above para?
>
> I meant that we will call like DecodeCommit() ->
> SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals =
> true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... If
> SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext()
> with has_invals = false and behaves the same as before.
>

Okay, understood. This will work.

> > Yet another idea could be to have another flag
> > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN.
> > Then, we can retrieve it even for each of the subtxn's if and when
> > required.
>
> Do you mean that when checking if the subtransaction has catalog
> changes, we check if its top-level XID has this new flag?
>

Yes.

> Why do we
> need the new flag?
>

This is required if we don't want to introduce a new set of functions
as you proposed above. I am not sure which one is better w.r.t. the
back-patching effort later, but it seems to me the flag approach would make
future back-patches easier if we make any changes in
SnapBuildCommitTxn.

-- 
With Regards,
Amit Kapila.




Re: i.e. and e.g.

2022-07-20 Thread Kyotaro Horiguchi
At Thu, 14 Jul 2022 15:38:37 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Thu, 14 Jul 2022 09:40:25 +0700, John Naylor 
>  wrote in 
> > On Wed, Jul 13, 2022 at 4:13 PM Kyotaro Horiguchi 
> > wrote:
> > >
> > > At Wed, 13 Jul 2022 18:09:43 +0900 (JST), Kyotaro Horiguchi <
> > horikyota@gmail.com> wrote in
> > > > So, "e.g." (for example) in the message sounds like "that is", which I
> > > > think is "i.e.".  It should be fixed if this is correct.  I'm not sure
> > > > whether to keep using Latin-origin acronyms like this, but in the
> > > > attached I used "i.e.".
> > 
> > I did my own quick scan and found one use of i.e. that doesn't really fit,
> > in a sentence that has other grammatical issues:
> > 
> > -Due to the differences how ECPG works compared to Informix's
> > ESQL/C (i.e., which steps
> > +Due to differences in how ECPG works compared to Informix's ESQL/C
> > (namely, which steps
> >  are purely grammar transformations and which steps rely on the
> 
> Oh!
> 
> > I've pushed that in addition to your changes, thanks!
> 
> Thanks!

By the way, I forgot about back-branches. Don't we need to fix the
same in back-branches?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Windows default locale vs initdb

2022-07-20 Thread Juan José Santamaría Flecha
On Tue, Jul 19, 2022 at 12:59 AM Thomas Munro 
wrote:

> Now that museum-grade Windows has been defenestrated, we are free to
> call GetUserDefaultLocaleName().  Here's a patch.
>

This LGTM.

>
> I think we should also convert to POSIX format when making the
> collname in your pg_import_system_collations() proposal, so that
> COLLATE "en_US" works (= a SQL identifier), but that's another
> thread[1].  I don't think we should do it in collcollate or
> datcollate, which is a string for the OS to interpret.
>

That thread has been split [1], but that is how the current version behaves.

>
> With my garbage collector hat on, I would like to rip out all of the
> support for traditional locale names, eventually.  Deleting kludgy
> code is easy and fun -- 0002 is a first swing at that -- but there
> remains an important unanswered question.  How should someone
> pg_upgrade a "English_Canada.1521" cluster if we now reject that name?
>  We'd need to do a conversion to "en-CA", or somehow tell the user to.
> H.
>

Is there a safe way to do that in pg_upgrade or would we be forcing users
to pg_dump into the new cluster?

[1]
https://www.postgresql.org/message-id/flat/0050ec23-34d9-2765-9015-98c04f0e18ac%40postgrespro.ru

Regards,

Juan José Santamaría Flecha


Re: Memory leak fix in psql

2022-07-20 Thread Japin Li

On Wed, 20 Jul 2022 at 14:21, tanghy.f...@fujitsu.com  
wrote:
> On Wednesday, July 20, 2022 12:52 PM, Michael Paquier  
> wrote:
>> What about the argument of upthread where we could use a goto in
>> functions where there are multiple pattern validation checks?  Per se
>> v4 attached.
>
> Thanks for your kind reminder and the modification.
> I checked the v4 patch; it looks good, but I think there can be some minor
> improvements.
> So I deleted some redundant braces around "goto error_return;".
> Also added an error-handling section in validateSQLNamePattern.
>
> Please have a look at the attached v5 patch.
>
> Regards,
> Tang

Thanks for updating the patch.  It looks good.  However, it cannot be
applied to the 14 stable branch.  The attached patches are for branches 10-14.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

>From 1dd828a40d5a45c0ee021f42f8eee354082a72cf Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 20 Jul 2022 12:47:52 +0900
Subject: [PATCH v5] fix the memory leak in psql describe


diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 88d92a08ae..77b0f87e39 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -112,7 +112,10 @@ describeAggregates(const char *pattern, bool verbose, bool showSystem)
 "n.nspname", "p.proname", NULL,
 "pg_catalog.pg_function_is_visible(p.oid)",
 NULL, 3))
+	{
+		termPQExpBuffer(&buf);
 		return false;
+	}
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1, 2, 4;");
 
@@ -182,7 +185,10 @@ describeAccessMethods(const char *pattern, bool verbose)
 NULL, "amname", NULL,
 NULL,
 NULL, 1))
+	{
+		termPQExpBuffer(&buf);
 		return false;
+	}
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
@@ -244,7 +250,10 @@ describeTablespaces(const char *pattern, bool verbose)
 NULL, "spcname", NULL,
 NULL,
 NULL, 1))
+	{
+		termPQExpBuffer(&buf);
 		return false;
+	}
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
@@ -534,7 +543,7 @@ describeFunctions(const char *functypes, const char *func_pattern,
 "n.nspname", "p.proname", NULL,
 "pg_catalog.pg_function_is_visible(p.oid)",
 NULL, 3))
-		return false;
+		goto error_return;
 
 	for (int i = 0; i < num_arg_patterns; i++)
 	{
@@ -561,7 +570,7 @@ describeFunctions(const char *functypes, const char *func_pattern,
 		true, false,
 		nspname, typname, ft, tiv,
 		NULL, 3))
-return false;
+goto error_return;
 		}
 		else
 		{
@@ -599,6 +608,10 @@ describeFunctions(const char *functypes, const char *func_pattern,
 
 	PQclear(res);
 	return true;
+
+error_return:
+	termPQExpBuffer(&buf);
+	return false;
 }
 
 
@@ -682,7 +695,10 @@ describeTypes(const char *pattern, bool verbose, bool showSystem)
 "pg_catalog.format_type(t.oid, NULL)",
 "pg_catalog.pg_type_is_visible(t.oid)",
 NULL, 3))
+	{
+		termPQExpBuffer(&buf);
 		return false;
+	}
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1, 2;");
 
@@ -836,7 +852,7 @@ describeOperators(const char *oper_pattern,
 "n.nspname", "o.oprname", NULL,
 "pg_catalog.pg_operator_is_visible(o.oid)",
 NULL, 3))
-		return false;
+		goto error_return;
 
 	if (num_arg_patterns == 1)
 		appendPQExpBufferStr(, "  AND o.oprleft = 0\n");
@@ -866,7 +882,7 @@ describeOperators(const char *oper_pattern,
 		true, false,
 		nspname, typname, ft, tiv,
 		NULL, 3))
-return false;
+goto error_return;
 		}
 		else
 		{
@@ -890,6 +906,10 @@ describeOperators(const char *oper_pattern,
 
 	PQclear(res);
 	return true;
+
+error_return:
+	termPQExpBuffer(&buf);
+	return false;
 }
 
 
@@ -953,7 +973,10 @@ listAllDbs(const char *pattern, bool verbose)
 		if (!validateSQLNamePattern(&buf, pattern, false, false,
 	NULL, "d.datname", NULL, NULL,
 	NULL, 1))
+		{
+			termPQExpBuffer(&buf);
 			return false;
+		}
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1;");
 	res = PSQLexec(buf.data);
@@ -1106,7 +1129,10 @@ permissionsList(const char *pattern)
 "n.nspname", "c.relname", NULL,
 "n.nspname !~ '^pg_' AND pg_catalog.pg_table_is_visible(c.oid)",
 NULL, 3))
+	{
+		termPQExpBuffer(&buf);
 		return false;
+	}
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1, 2;");
 
@@ -1177,16 +1203,13 @@ listDefaultACLs(const char *pattern)
 "pg_catalog.pg_get_userbyid(d.defaclrole)",
 NULL,
 NULL, 3))
-		return false;
+		goto error_return;
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1, 2, 3;");
 
 	res = PSQLexec(buf.data);
 	if (!res)
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Default access privileges"));
@@ -1200,6 +1223,10 @@ listDefaultACLs(const char *pattern)
 	termPQExpBuffer(&buf);
 	PQclear(res);
 	return true;
+
+error_return:
+	termPQExpBuffer(&buf);
+	return false;
 }
 
 
@@ -1253,7 +1280,7 @@ objectDescription(const char *pattern, bool showSystem)
 false, "n.nspname", "pgc.conname", NULL,
 "pg_catalog.pg_table_is_visible(c.oid)",

Re: Fast COPY FROM based on batch insert

2022-07-20 Thread Etsuro Fujita
On Tue, Jul 19, 2022 at 6:35 PM Andrey Lepikhov
 wrote:
> On 18/7/2022 13:22, Etsuro Fujita wrote:
> > I rewrote the decision logic to something much simpler and much less
> > invasive, which reduces the patch size significantly.  Attached is an
> > updated patch.  What do you think about that?

> maybe you forgot this code:
> /*
>   * If a partition's root parent isn't allowed to use it, neither is the
>   * partition.
> */
> if (rootResultRelInfo->ri_usesMultiInsert)
> leaf_part_rri->ri_usesMultiInsert =
> ExecMultiInsertAllowed(leaf_part_rri);

I think the patch accounts for that.  Consider this bit to determine
whether to use batching for the partition chosen by
ExecFindPartition():

@@ -910,12 +962,14 @@ CopyFrom(CopyFromState cstate)

/*
 * Disable multi-inserts when the partition has BEFORE/INSTEAD
-* OF triggers, or if the partition is a foreign partition.
+* OF triggers, or if the partition is a foreign partition
+* that can't use batching.
 */
leafpart_use_multi_insert = insertMethod == CIM_MULTI_CONDITIONAL &&
!has_before_insert_row_trig &&
!has_instead_insert_row_trig &&
-   resultRelInfo->ri_FdwRoutine == NULL;
+   (resultRelInfo->ri_FdwRoutine == NULL ||
+resultRelInfo->ri_BatchSize > 1);

If the root parent isn't allowed to use batching, then we have
insertMethod=CIM_SINGLE for the parent before we get here.  So in that
case we have leafpart_use_multi_insert=false for the chosen partition,
meaning that the partition isn't allowed to use batching, either.
(The patch just extends the existing decision logic to the
foreign-partition case.)

> Also, maybe we should describe in the documentation that, if the value of
> batch_size is more than 1, the ExecForeignBatchInsert routine has a chance
> to be called?

Yeah, but I think that is the existing behavior, and that the patch
doesn't change the behavior, so I would leave that for another patch.
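
(For reference, with postgres_fdw the existing knob for this is the batch_size
option, which can be set on the foreign server or on an individual foreign
table; a minimal sketch with made-up object names:

ALTER SERVER loopback OPTIONS (ADD batch_size '100');
ALTER FOREIGN TABLE ftable_part1 OPTIONS (ADD batch_size '100');

With the resulting ri_BatchSize greater than 1, ExecForeignBatchInsert can be
used for the inserts, which is what the quoted question is about.)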

Thanks for reviewing!

Best regards,
Etsuro Fujita




Re: Memory leak fix in psql

2022-07-20 Thread Alvaro Herrera
More on the same tune.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php
>From 5e031fadc778365491f9e58da83a4b64f21e3bcd Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 20 Jul 2022 10:04:27 +0200
Subject: [PATCH] More goto error_return in describe.c

---
 src/bin/psql/describe.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 77b0f87e39..46999f26f2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -1129,19 +1129,13 @@ permissionsList(const char *pattern)
 "n.nspname", "c.relname", NULL,
 "n.nspname !~ '^pg_' AND pg_catalog.pg_table_is_visible(c.oid)",
 NULL, 3))
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1, 2;");
 
 	res = PSQLexec(buf.data);
 	if (!res)
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Access privileges"));
@@ -1155,6 +1149,10 @@ permissionsList(const char *pattern)
 	termPQExpBuffer(&buf);
 	PQclear(res);
 	return true;
+
+error_return:
+		termPQExpBuffer(&buf);
+		return false;
 }
 
 
@@ -4983,19 +4981,13 @@ listSchemas(const char *pattern, bool verbose, bool showSystem)
 NULL, "n.nspname", NULL,
 NULL,
 NULL, 2))
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
 	res = PSQLexec(buf.data);
 	if (!res)
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	myopt.nullPrint = NULL;
 	myopt.title = _("List of schemas");
@@ -5016,10 +5008,7 @@ listSchemas(const char *pattern, bool verbose, bool showSystem)
 		  pattern);
 		result = PSQLexec(buf.data);
 		if (!result)
-		{
-			termPQExpBuffer(&buf);
-			return false;
-		}
+			goto error_return;
 		else
 			pub_schema_tuples = PQntuples(result);
 
@@ -5066,6 +5055,10 @@ listSchemas(const char *pattern, bool verbose, bool showSystem)
 	}
 
 	return true;
+
+error_return:
+	termPQExpBuffer(&buf);
+	return false;
 }
 
 
-- 
2.30.2



Re: Memory leak fix in psql

2022-07-20 Thread Junwang Zhao
Thanks for your explanation, this time I know how it works, thanks ;)

On Wed, Jul 20, 2022 at 3:04 PM tanghy.f...@fujitsu.com
 wrote:
>
> > On Wed, Jul 20, 2022 at 12:51:24PM +0800, Junwang Zhao wrote:
> > > Though the patch looks good, I myself think the patch should be edited
> > > and submitted by Tang
> > > It's easy to attach a fixed patch based on the comments of the thread,
> > > but coins should be
> > > given to Tang since he is the first one to find the mem leak.
>
> Hello, Zhao
>
> Thanks for checking this patch.
>
> I appreciate your kind comment, but there may be a misunderstanding here.
> As Michael explained, committers in Postgres will review carefully and
> help to improve contributors' patches. When the patch is finally committed
> by a committer, from what I can see, he or she will try to make sure the
> credit goes to everyone who contributed to the committed patch (such as
> the bug reporter, patch author, tester, reviewer, etc.).
>
> Also, developers and reviewers will try to help improve a proposed patch
> by rebasing it or adding an on-top patch (like Japin Li did in v2).
> This will make the patch better and help it get committed sooner.
>
> Good to see you in the Postgres community.
>
> Regards,
> Tang



-- 
Regards
Junwang Zhao




Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-20 Thread Masahiko Sawada
On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila  wrote:
>
> On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada  wrote:
> >
> > On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila  
> > wrote:
> > >
> > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada  
> > > wrote:
> > >
> > > Why do you think we can't remove
> > > ReorderBufferInitialXactsSetCatalogChanges() from the back branch
> > > patch? I think we don't need to change the existing function
> > > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper
> > > like SnapBuildXidHasCatalogChanges() similar to master branch patch.
> >
> > IIUC we need to change SnapBuildCommitTxn() but it's exposed.
> >
> > Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() ->
> > ReorderBufferXidHasCatalogChanges(). If we have a wrapper function, we
> > call like DecodeCommit() -> SnapBuildCommitTxn() ->
> > SnapBuildXidHasCatalogChanges() ->
> > ReorderBufferXidHasCatalogChanges(). In
> > SnapBuildXidHasCatalogChanges(), we need to check if the transaction
> > has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass
> > either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0)
> > down to SnapBuildXidHasCatalogChanges(). However, since
> > SnapBuildCommitTxn(), between DecodeCommit() and
> > SnapBuildXidHasCatalogChanges(), is exposed we cannot change it.
> >
>
> Agreed.
>
> > Another idea would be to have functions, say
> > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter
> > does actual work of handling transaction commits and both
> > SnapBuildCommitTxn() and SnapBuildCommit() call
> > SnapBuildCommitTxnWithXInfo() with different arguments.
> >
>
> Do you want to say DecodeCommit() instead of SnapBuildCommit() in
> above para?

I meant that we will call like DecodeCommit() ->
SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals =
true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... If
SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext()
with has_invals = false and behaves the same as before.

> Yet another idea could be to have another flag
> RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN.
> Then, we can retrieve it even for each of the subtxn's if and when
> required.

Do you mean that when checking if the subtransaction has catalog
changes, we check if its top-level XID has this new flag? Why do we
need the new flag?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

2022-07-20 Thread Kyotaro Horiguchi
At Wed, 20 Jul 2022 10:02:22 +0530, Bharath Rupireddy 
 wrote in 
> Done. PSA v6 patch set.

Thanks!

-Specifies the minimum size of past log file segments kept in the
+Specifies the minimum size of past WAL files kept in the

-log file by reducing the value of this parameter. On a system with
+file by reducing the value of this parameter. On a system with

Looks fine. And postgresql.conf.sample has the following lines:

#archive_library = ''   # library to use to archive a logfile segment

#archive_command = ''   # command to use to archive a logfile segment

#archive_timeout = 0# force a logfile segment switch after this

#restore_command = ''   # command to use to restore an archived logfile segment

Don't they need the same fix?
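
For illustration only, applying the same replacement to those sample lines
could give something like:

#archive_library = ''   # library to use to archive a WAL file
#archive_command = ''   # command to use to archive a WAL file
#archive_timeout = 0    # force a WAL file switch after this
#restore_command = ''   # command to use to restore an archived WAL file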

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-20 Thread Masahiko Sawada
On Wed, Jul 20, 2022 at 4:16 PM Kyotaro Horiguchi
 wrote:
>
> At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada  
> wrote in
> > On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi
> >  wrote:
> > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always
> > > allocate the array with a fixed size. SnapBuildAddCommittedTxn still
> >  > assumes builder->committed.xip is non-NULL.  SnapBuildRestore *kept*
> > > ondisk.builder.commited.xip populated with a valid array pointer. But
> > > the patch allows committed.xip be NULL, thus in that case,
> > > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion
> > > failure.
> >
> > IIUC the patch doesn't allow committed.xip to be NULL since we don't
> > overwrite it if builder->committed.xcnt is 0 (i.e.,
> > ondisk.builder.committed.xip is NULL):
>
> I meant that ondisk.builder.committed.xip can be NULL.. But looking
> again that cannot be.  I don't understand what I was looking at at
> that time.
>
> So, sorry for the noise.

No problem. Thank you for your review and comments!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-20 Thread Kyotaro Horiguchi
At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada  
wrote in 
> On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi
>  wrote:
> > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always
> > allocate the array with a fixed size. SnapBuildAddCommittedTxn still
>  > assumes builder->committed.xip is non-NULL.  SnapBuildRestore *kept*
> > ondisk.builder.commited.xip populated with a valid array pointer. But
> > the patch allows committed.xip be NULL, thus in that case,
> > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion
> > failure.
> 
> IIUC the patch doesn't allow committed.xip to be NULL since we don't
> overwrite it if builder->committed.xcnt is 0 (i.e.,
> ondisk.builder.committed.xip is NULL):

I meant that ondisk.builder.committed.xip can be NULL.. But looking
again that cannot be.  I don't understand what I was looking at at
that time.

So, sorry for the noise.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2022-07-20 Thread Marco Slot
On Mon, Jul 18, 2022 at 8:29 AM Amit Kapila  wrote:
> IIUC, this proposal is to optimize cases where users can't have a
> unique/primary key for a relation on the subscriber and those
> relations receive lots of updates or deletes?

I think this patch optimizes for all non-trivial cases of update/delete
replication (e.g. >1000 rows in the table, >1000 rows per hour updated)
without a primary key. For instance, it's quite common to have a large
append-mostly events table without a primary key (e.g. because of
partitioning, or insertion speed), which will still have occasional batch
updates/deletes.

Imagine an update of a table or partition with 1 million rows and a typical
scan speed of 1M rows/sec. An update on the whole table takes maybe 1-2
seconds. Replicating that update with a sequential scan per row means roughly
one second per replicated row, i.e. on the order of 1M seconds ≈ ~12 days.

The current implementation makes using REPLICA IDENTITY FULL a huge
liability, and impractical, for scenarios where you want to replicate an
arbitrary set of user-defined tables, such as upgrades, migrations, and shard
moves. We generally recommend that users tolerate update/delete errors in
such scenarios.

If the apply worker can use an index, the data migration tool can
tactically create one on a high cardinality column, which would practically
always be better than doing a sequential scan for non-trivial workloads.
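
As a concrete sketch of that tactic (table, column, and index names are made
up for illustration):

-- publisher: no usable primary key, so fall back to full-row identity
ALTER TABLE events REPLICA IDENTITY FULL;

-- subscriber: a tactical index on a high-cardinality column that the patched
-- apply worker could use instead of a per-row sequential scan
CREATE INDEX CONCURRENTLY events_event_id_idx ON events (event_id);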

cheers,
Marco


RE: Memory leak fix in psql

2022-07-20 Thread tanghy.f...@fujitsu.com
> On Wed, Jul 20, 2022 at 12:51:24PM +0800, Junwang Zhao wrote:
> > Though the patch looks good, I myself think the patch should be edited
> > and submitted by Tang
> > It's easy to attach a fixed patch based on the comments of the thread,
> > but coins should be
> > given to Tang since he is the first one to find the mem leak.

Hello, Zhao

Thanks for checking this patch.

I appreciate your kind comment, but there may be a misunderstanding here.
As Michael explained, committers in Postgres will review carefully and
help to improve contributors' patches. When the patch is finally committed
by a committer, from what I can see, he or she will try to make sure the
credit goes to everyone who contributed to the committed patch (such as
the bug reporter, patch author, tester, reviewer, etc.).

Also, developers and reviewers will try to help improve a proposed patch
by rebasing it or adding an on-top patch (like Japin Li did in v2).
This will make the patch better and help it get committed sooner.

Good to see you in the Postgres community.

Regards,
Tang


Re: replacing role-level NOINHERIT with a grant-level option

2022-07-20 Thread tushar

On 7/19/22 12:56 AM, Robert Haas wrote:

Another good catch. Here is v5 with a fix for that problem.

Thanks, the issue is fixed now.

--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company





Re: Backup command and functions can cause assertion failure and segmentation fault

2022-07-20 Thread Michael Paquier
On Wed, Jul 20, 2022 at 02:00:00PM +0900, Fujii Masao wrote:
> I reported two trouble cases; they are the cases where BASE_BACKUP
> is canceled and terminated, respectively. But you added the test
> only for one of them. Is this intentional?

Nope.  The one I have implemented was the fanciest case among the
two, so I just focused on it.

Adding an extra test to cover the second scenario is easier.  So I
have added one as of the attached, addressing your other comments
while on it.  I have also decided to add the tests at the bottom of
001_stream_rep.pl, as these are quicker than a node initialization.
--
Michael
From 2aa841a3dfb643a14d28e6d595703f14e98ad919 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 20 Jul 2022 15:48:27 +0900
Subject: [PATCH v2] Add more TAP tests with BASE_BACKUP

---
 src/test/recovery/t/001_stream_rep.pl | 55 +++
 1 file changed, 55 insertions(+)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 86864098f9..b15dd6b29a 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -531,4 +531,59 @@ my $primary_data = $node_primary->data_dir;
 ok(!-f "$primary_data/pg_wal/$segment_removed",
 	"WAL segment $segment_removed recycled after physical slot advancing");
 
+note "testing pg_backup_start() followed by BASE_BACKUP";
+my $connstr = $node_primary->connstr('postgres') . " replication=database";
+
+# This test requires a replication connection with a database, as it mixes
+# a replication command and a SQL command.
+$node_primary->command_fails_like(
+	[
+		'psql', '-c', "SELECT pg_backup_start('backup', true)",
+		'-c',   'BASE_BACKUP', '-d', $connstr
+	],
+	qr/a backup is already in progress in this session/,
+	'BASE_BACKUP cannot run in session already running backup');
+
+note "testing BASE_BACKUP cancellation";
+
+my $sigchld_bb_timeout =
+  IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+
+# This test requires a replication connection with a database, as it mixes
+# a replication command and a SQL command.  The first BASE_BACKUP is throttled
+# to give enough room for the cancellation running below.  The second command
+# for pg_backup_stop() should fail.
+my ($sigchld_bb_stdin, $sigchld_bb_stdout, $sigchld_bb_stderr) = ('', '', '');
+my $sigchld_bb = IPC::Run::start(
+	[
+		'psql', '-X', '-c', "BASE_BACKUP (CHECKPOINT 'fast', MAX_RATE 32);",
+		'-c',   'SELECT pg_backup_stop()',
+		'-d',   $connstr
+	],
+	'<',
+	\$sigchld_bb_stdin,
+	'>',
+	\$sigchld_bb_stdout,
+	'2>',
+	\$sigchld_bb_stderr,
+	$sigchld_bb_timeout);
+
+# The cancellation is issued once the database files are streamed and
+# the checkpoint issued at backup start completes.
+is( $node_primary->poll_query_until(
+		'postgres',
+		"SELECT pg_cancel_backend(a.pid) FROM "
+		  . "pg_stat_activity a, pg_stat_progress_basebackup b WHERE "
+		  . "a.pid = b.pid AND a.query ~ 'BASE_BACKUP' AND "
+		  . "b.phase = 'streaming database files';"),
+	"1",
+	"WAL sender sending base backup killed");
+
+# The psql command should fail on pg_backup_stop().
+ok( pump_until(
+		$sigchld_bb, $sigchld_bb_timeout,
+		\$sigchld_bb_stderr, qr/backup is not in progress/),
+	'base backup cleanly cancelled');
+$sigchld_bb->finish();
+
 done_testing();
-- 
2.36.1



signature.asc
Description: PGP signature


Re: Allow placeholders in ALTER ROLE w/o superuser

2022-07-20 Thread Kyotaro Horiguchi
At Tue, 19 Jul 2022 09:53:39 -0700, Nathan Bossart  
wrote in 
> On Tue, Jul 19, 2022 at 12:55:14AM -0500, Steve Chavez wrote:
> > Taking your options into consideration, for me the correct behaviour should
> > be:
> > 
> > - The ALTER ROLE placeholder should always be stored with a PGC_USERSET
> > GucContext. It's a placeholder anyway, so it should be the less restrictive
> > one. If the user wants to define it as PGC_SUSET or other this should be
> > done through a custom extension.
> > - When an extension claims the placeholder, we should check the
> > DefineCustomXXXVariable GucContext with PGC_USERSET. If there's a match,
> > then the value gets applied, otherwise WARN or ERR.
> >   The role GUCs get applied at login time right? So at this point we can
> > WARN or ERR about the defined role GUCs.
> > 
> > What do you think?
> 
> Hm.  I would expect ALTER ROLE to store the PGC_SUSET context when executed
> by a superuser or a role with privileges via pg_parameter_acl.  Storing all
> placeholder GUC settings as PGC_USERSET would make things more restrictive
> than they are today.  For example, it would no longer be possible to apply
> any ALTER ROLE settings from superusers for placeholders that later become
> custom GUCS.

Currently, placeholders are always created as PGC_USERSET, so a
non-superuser can set them.  But if the loaded module defines the custom
variable as PGC_SUSET, the value set by the user is refused; then the
value from ALTER ROLE ... SET, or otherwise the default value from
DefineCustom*Variable, is used. If the module defines it as
PGC_USERSET, the last value set is accepted.

If placeholders were created as PGC_SUSET, non-superusers could not set them
in-session. But that behavior is not needed, since loadable modules
reject such PGC_USERSET-sourced values as described above.
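
A minimal sketch of that behavior, assuming an extension my_ext whose
_PG_init() defines my_ext.secret as PGC_SUSET (all names hypothetical):

-- before the module is loaded, my_ext.secret is only a placeholder (USERSET):
SET my_ext.secret = 'on';    -- accepted for any role
-- once the module is loaded and defines the variable as PGC_SUSET, a value
-- that was set by a non-superuser is discarded, and the ALTER ROLE ... SET
-- value (or the built-in default) takes effect instead.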


Returning to the topic, that operation can be allowed in PG15, once it has
been granted by a superuser using the GRANT SET ON PARAMETER command.

=# GRANT SET ON PARAMETER my.username TO r1;

r1=> ALTER ROLE r1 SET my.username = 'hoge_user_x';

r1=> \c
r1=> show my.username;
 my.username 
-
 hoge_user_x
(1 row)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




RE: Memory leak fix in psql

2022-07-20 Thread tanghy.f...@fujitsu.com
On Wednesday, July 20, 2022 12:52 PM, Michael Paquier  
wrote:
> What about the argument of upthread where we could use a goto in
> functions where there are multiple pattern validation checks?  Per se
> v4 attached.

Thanks for your kind reminder and the modification.
I checked the v4 patch; it looks good, but I think there can be some minor
improvements.
So I deleted some redundant braces around "goto error_return;".
Also added an error-handling section in validateSQLNamePattern.

Please have a look at the attached v5 patch.

Regards,
Tang


v5-0001-fix-the-memory-leak-in-psql-describe.patch
Description: v5-0001-fix-the-memory-leak-in-psql-describe.patch

