Re: Remove useless arguments in ReadCheckpointRecord().
I agree to removing the two parameters, and I agree to letting ReadCheckpointRecord not be conscious of the location source. At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao wrote in > Agreed. Attached is the updated version of the patch. > Thanks for the review! - (errmsg("could not locate required checkpoint record"), + (errmsg("could not locate a valid checkpoint record in backup_label file"), The "in backup_label" part looks *to me* to need some verb. By the way, this looks like a good chance to remove the (now) extra parens around errmsg() and friends. For example: - (errmsg("could not locate a valid checkpoint record in backup_label file"), + errmsg("could not locate checkpoint record specified in backup_label file"), - (errmsg("could not locate a valid checkpoint record in control file"))); + errmsg("could not locate checkpoint record recorded in control file"))); + (errmsg("invalid checkpoint record"))); Is it useful to show the specified LSN there? + (errmsg("invalid resource manager ID in checkpoint record"))); We have a similar message "invalid resource manager ID %u at %X/%X". Since the caller explains that it is a checkpoint record, we can share the message here. + (errmsg("invalid xl_info in checkpoint record"))); (It is not an issue of this patch, though.) I don't think this is appropriate for a user-facing message. Couldn't we say "unexpected record type: %x" or something like that? + (errmsg("invalid length of checkpoint record"))); We have "record with invalid length at %X/%X" or "invalid record length at %X/%X: wanted %u, got %u". Could we reuse either of them? > > May be unrelated, IIRC, for the errors like ereport(PANIC, > > (errmsg("could not locate a valid checkpoint record"))); we wanted to > > add a hint asking users to consider running pg_resetwal to fix the > > issue. The error for ReadCheckpointRecord() in backup_label file > > cases, already gives a hint errhint("If you are restoring from a > > backup, touch \"%s/recovery.signal\" ..
Adding the hint of running > > pg_resetwal (of course with a caution that it can cause inconsistency > > in the data and use it as a last resort as described in the docs) > > helps users and support engineers a lot to mitigate the server down > > cases quickly. > > +1 to add some hint messages. But I'm not sure if it's good to hint > the use of pg_resetwal because, as you're saying, it should be used as > a last resort and has some big risks like data loss, corruption, > etc. That is, I'm concerned that some users might quickly run > pg_resetwal because the hint message says so, without reading the docs > or understanding those risks. I don't think we want to recommend pg_resetwal to those who wouldn't reach it by themselves through other means. I know of a few instances of people who had the tool (unrecoverably) break their own clusters. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: i.e. and e.g.
On Thu, Jul 21, 2022 at 11:22 AM Kyotaro Horiguchi wrote: > > At Thu, 21 Jul 2022 10:20:30 +0700, John Naylor < john.nay...@enterprisedb.com> wrote in > > On Wed, Jul 20, 2022 at 3:40 PM Kyotaro Horiguchi < horikyota@gmail.com> > > wrote: > > > By the way, I forgot about back-branches. Don't we need to fix the > > > same in back-branches? > > > > The intent of the messages is pretty clear to me, so I don't really see a > > need to change back branches. It does make sense for v15, though, and I > > just forgot to consider that. > > Ok, I'm fine with that. Thanks! This is done. -- John Naylor EDB: http://www.enterprisedb.com
Re: Add LZ4 compression in pg_dump
On Tue, Jul 05, 2022 at 01:22:47PM +, gkokola...@pm.me wrote: > I have updated for "some" of the comments. This is not an unwillingness to > incorporate those specific comments. Simply this patchset had started to > diverge > heavily already based on comments from Mr. Paquier who had already requested > for > the APIs to be refactored to use function pointers. This is happening in 0002 > of > the patchset. 0001 of the patchset is using the new compression.h under > common. > > This patchset should be considered a late draft, as commentary, documentation, > and some finer details are not yet finalized; because I am expecting the > proposed > refactor to receive a wealth of comments. It would be helpful to understand if > the proposed direction is something worth working on, before moving to > the > finer details. I have read through the patch set, and I very much like the separation you are doing here with CompressFileHandle, where a compression method has to specify a full set of callbacks depending on the actions that need to be taken. One advantage, as your patch shows, is that you reduce the dependency of each code path on the compression method, with #ifdefs and such located mostly in their own file structure, so that adding a new compression method becomes much easier. These callbacks are going to require much more documentation to describe what anybody using them should expect from them, and perhaps they could be renamed in a more generic way as the current names come from POSIX (say read_char(), read_string()?), even if this patch has just inherited the names coming from pg_dump itself, but this can be tuned over and over. The split into three parts as of 0001 to plug into pg_dump the new compression option set, 0002 to introduce the callbacks and 0003 to add LZ4, building on the first two parts, makes sense to me. 0001 and 0002 could be done in a reversed order as they are mostly independent, but this order is fine as-is.
In short, I am fine with the proposed approach. +#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add compressionMethod +* in header */ Indeed, the dump format needs a version bump for this information. +static bool +parse_compression_option(const char *opt, +pg_compress_specification *compress_spec) This parsing logic in pg_dump.c looks a lot like what pg_receivewal.c does with its parse_compress_options() where, for compatibility:
- If only a number is given:
  -- Assume no compression if the level is 0.
  -- Assume gzip with the given compression level if the level > 0.
- If a string is found, assume a full spec, optionally with a level.
So some consolidation could be done between the two. By the way, I can see that GZCLOSE(), etc. are still defined in compress_io.h but they are not used. -- Michael signature.asc Description: PGP signature
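The callback-based layout being discussed can be sketched outside pg_dump. The names below (DemoFileHandle, none_write, and so on) are illustrative assumptions, not the patch's actual API; a trivial "none" method backed by an in-memory buffer stands in for a real compression method:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of a CompressFileHandle-style vtable: each compression method
 * supplies its own callbacks, so callers never branch on the method. */
typedef struct DemoFileHandle DemoFileHandle;
struct DemoFileHandle
{
	size_t		(*write_func) (DemoFileHandle *fh, const void *buf, size_t len);
	size_t		(*read_func) (DemoFileHandle *fh, void *buf, size_t len);
	void	   *private_data;	/* per-method state (e.g. a z_stream for gzip) */
};

/* A trivial "none" method backed by an in-memory buffer. */
typedef struct
{
	char		buf[256];
	size_t		len;			/* bytes written so far */
	size_t		pos;			/* read cursor */
} NoneState;

static size_t
none_write(DemoFileHandle *fh, const void *buf, size_t len)
{
	NoneState  *st = fh->private_data;

	memcpy(st->buf + st->len, buf, len);
	st->len += len;
	return len;
}

static size_t
none_read(DemoFileHandle *fh, void *buf, size_t len)
{
	NoneState  *st = fh->private_data;
	size_t		avail = st->len - st->pos;
	size_t		n = len < avail ? len : avail;

	memcpy(buf, st->buf + st->pos, n);
	st->pos += n;
	return n;
}
```

A gzip or LZ4 handle would plug in its own callbacks and keep its stream state in private_data, so code calling write_func()/read_func() stays method-agnostic.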
Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.
On Thu, Jul 21, 2022 at 09:54:05AM +0530, Amit Kapila wrote: > Yeah, it is not very clear to me either. I think this won't be > difficult to change one way or another depending on future needs. At > this stage, it appeared to me that a bitmask is a better way to > represent this information, but if you and others feel using an enum is a > better idea then I am fine with that as well. Please don't get me wrong :) I favor a bitmask over an enum here, as you do, with a clean layer for those flags. -- Michael
Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.
On Thu, Jul 21, 2022 at 7:10 AM Michael Paquier wrote: > > On Wed, Jul 20, 2022 at 02:32:44PM +0530, Amit Kapila wrote: > > On Thu, Jul 14, 2022 at 4:34 AM Peter Smith wrote: > >> By using a bitmask I think there is an implication that the flags can > >> be combined... > >> > >> Perhaps it is not a problem today, but later you may want more flags. e.g. > >> #define SUBSCRIPTION_REL_STATE_READY 0x02 /* READY relations */ > >> > >> then the bitmask idea falls apart because IIUC you have no intentions > >> to permit things like: > >> (SUBSCRIPTION_REL_STATE_NOT_READY | SUBSCRIPTION_REL_STATE_READY) > > > > I think this will be an invalid combination if caller ever used it. > > However, one might need to use a combination like > > (SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE). For such > > cases, I feel the bitmask idea will be better. > > It feels unnatural to me to have a flag saying "not-ready" and one > saying "ready", while we could have a flag saying "ready" that can be > combined with a second flag to decide if the contents of srsubstate > should be matched or *not* matched with the states expected by the > caller. This could be extended to more state values, for example. > > I am not sure if we actually need this much as I have no idea if > future features would use it, so please take my suggestion lightly :) > Yeah, it is not very clear to me either. I think this won't be difficult to change one way or another depending on future needs. At this stage, it appeared to me that a bitmask is a better way to represent this information, but if you and others feel using an enum is a better idea then I am fine with that as well. -- With Regards, Amit Kapila.
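For illustration, the bitmask idea can be sketched standalone. The flag names follow Peter's example, but the state-to-flag mapping here is a hypothetical stand-in for pg_subscription_rel's actual srsubstate values, not the patch's code:

```c
#include <assert.h>

/* Hypothetical flag names: a caller ORs together the states it wants. */
#define SUBSCRIPTION_REL_STATE_NOT_READY 0x01
#define SUBSCRIPTION_REL_STATE_READY     0x02
#define SUBSCRIPTION_REL_STATE_DONE      0x04

/* Map a srsubstate character to its flag ('r' = ready, 's' = sync done;
 * everything else is treated as not ready for this sketch). */
static int
state_flag(char srsubstate)
{
	switch (srsubstate)
	{
		case 'r':
			return SUBSCRIPTION_REL_STATE_READY;
		case 's':
			return SUBSCRIPTION_REL_STATE_DONE;
		default:
			return SUBSCRIPTION_REL_STATE_NOT_READY;
	}
}

/* True if the relation's state is one of the states the caller asked for. */
static int
state_matches(char srsubstate, int wanted)
{
	return (wanted & state_flag(srsubstate)) != 0;
}
```

This shows the combination Amit mentions: passing (SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE) matches both 'r' and 's' relations in one call, which a plain enum parameter could not express.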
Re: making relfilenodes 56 bits
On Wed, Jul 20, 2022 at 11:27 PM Dilip Kumar wrote: > [v10 patch set] Hi Dilip, I'm experimenting with these patches and will hopefully have more to say soon, but I just wanted to point out that this builds with warnings and failed on 3/4 of the CI OSes on cfbot's last run. Maybe there is the good kind of uninitialised data on Linux, and the bad kind of uninitialised data on those other pesky systems?
Re: i.e. and e.g.
At Thu, 21 Jul 2022 10:20:30 +0700, John Naylor wrote in > On Wed, Jul 20, 2022 at 3:40 PM Kyotaro Horiguchi > wrote: > > By the way, I forgot about back-branches. Don't we need to fix the > > same in back-branches? > > The intent of the messages is pretty clear to me, so I don't really see a > need to change back branches. It does make sense for v15, though, and I > just forgot to consider that. Ok, I'm fine with that. Thanks! regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages
At Wed, 20 Jul 2022 17:25:33 +0530, Bharath Rupireddy wrote in > On Wed, Jul 20, 2022 at 12:55 PM Kyotaro Horiguchi > wrote: > PSA v7 patch set. Thanks. Looks perfect, but (sorry..) in the final checking, I found "log archive" in the doc. If you agree to it please merge the attached (or refined one) and I'd call it a day. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index c0b89a3c01..e5344eb277 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -2659,7 +2659,7 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;" If set to true, the backup will wait until the last required WAL - segment has been archived, or emit a warning if log archiving is + segment has been archived, or emit a warning if WAL archiving is not enabled. If false, the backup will neither wait nor warn, leaving the client responsible for ensuring the required log is available. The default is true. diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml index 56ac7b754b..e50f00afa8 100644 --- a/doc/src/sgml/ref/pg_basebackup.sgml +++ b/doc/src/sgml/ref/pg_basebackup.sgml @@ -318,7 +318,7 @@ PostgreSQL documentation backup. This will include all write-ahead logs generated during the backup. Unless the method none is specified, it is possible to start a postmaster in the target -directory without the need to consult the log archive, thus +directory without the need to consult the WAL archive, thus making the output a completely standalone backup.
Re: GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.
Thanks Aleksander and Álvaro for your inputs. I understand this change does not make any improvement to the current code. I was a bit concerned about the design consistency between the function that exists for a server in recovery and the one for a server not in recovery. I was trying to write the following snippet, where I am interested only in the XLogRecPtr: recPtr = RecoveryInProgress() ? GetStandbyFlushRecPtr(NULL) : GetFlushRecPtr(NULL); But I can't write this, since I have to pass an argument to GetStandbyFlushRecPtr() while that is not compulsory for GetFlushRecPtr(). I agree to rejecting the proposed changes, since they are not immediately useful and there is no rule requiring an optional argument for similar-looking functions. Regards, Amul
Re: Custom tuplesorts for extensions
On Tue, Jul 12, 2022 at 3:23 PM Alexander Korotkov wrote: > There are some places, which potentially could cause a slowdown. I'm > going to make some experiments with that. I haven't looked at the patches, so I don't know of a specific place to look for a slowdown, but I thought it might help to perform the same query tests as my most recent test for evaluating qsort variants (some description in [1]), and here is the spreadsheet. Overall, the differences look like noise. A few cases with unabbreviatable text look a bit faster with the patch. I'm not sure if that's a real difference, but in any case I don't see a slowdown anywhere. [1] https://www.postgresql.org/message-id/CAFBsxsHeTACMP1JVQ%2Bm35-v2NkmEqsJMHLhEfWk4sTB5aw_jkQ%40mail.gmail.com -- John Naylor EDB: http://www.enterprisedb.com master-vs-custom-ext-v2.ods Description: application/vnd.oasis.opendocument.spreadsheet
Re: i.e. and e.g.
On Wed, Jul 20, 2022 at 3:40 PM Kyotaro Horiguchi wrote: > By the way, I forgot about back-branches. Don't we need to fix the > same in back-branches? The intent of the messages is pretty clear to me, so I don't really see a need to change back branches. It does make sense for v15, though, and I just forgot to consider that. -- John Naylor EDB: http://www.enterprisedb.com
Re: Remove useless arguments in ReadCheckpointRecord().
On 2022/07/21 0:29, Bharath Rupireddy wrote: How about we transform the following messages into something like below? (errmsg("could not locate a valid checkpoint record"))); after ReadCheckpointRecord() for control file cases to "could not locate valid checkpoint record in control file" (errmsg("could not locate required checkpoint record"), after ReadCheckpointRecord() for backup_label case to "could not locate valid checkpoint record in backup_label file" The above messages give more meaningful and unique info to the users. Agreed. Attached is the updated version of the patch. Thanks for the review! May be unrelated, IIRC, for the errors like ereport(PANIC, (errmsg("could not locate a valid checkpoint record"))); we wanted to add a hint asking users to consider running pg_resetwal to fix the issue. The error for ReadCheckpointRecord() in backup_label file cases already gives a hint errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" .. Adding the hint of running pg_resetwal (of course with a caution that it can cause inconsistency in the data and use it as a last resort as described in the docs) helps users and support engineers a lot to mitigate the server down cases quickly. +1 to add some hint messages. But I'm not sure if it's good to hint the use of pg_resetwal because, as you're saying, it should be used as a last resort and has some big risks like data loss, corruption, etc. That is, I'm concerned that some users might quickly run pg_resetwal because the hint message says so, without reading the docs or understanding those risks. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
From a5e45acb753e9d93074a25a2fc409c693c46630c Mon Sep 17 00:00:00 2001 From: Fujii Masao Date: Wed, 20 Jul 2022 23:06:44 +0900 Subject: [PATCH] Remove useless arguments in ReadCheckpointRecord().
--- src/backend/access/transam/xlogrecovery.c | 88 +-- 1 file changed, 17 insertions(+), 71 deletions(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 5d6f1b5e46..010f0cd7b2 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -422,8 +422,8 @@ static XLogPageReadResult WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, XLogRecPtr replayLSN, bool nonblocking); static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr); -static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher, XLogRecPtr RecPtr, - int whichChkpt, bool report, TimeLineID replayTLI); +static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher, + XLogRecPtr RecPtr, TimeLineID replayTLI); static bool rescanLatestTimeLine(TimeLineID replayTLI, XLogRecPtr replayLSN); static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli, XLogSource source, bool notfoundOk); @@ -605,7 +605,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, * When a backup_label file is present, we want to roll forward from * the checkpoint it identifies, rather than using pg_control. 
*/ - record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, 0, true, + record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, CheckPointTLI); if (record != NULL) { @@ -638,7 +638,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, else { ereport(FATAL, - (errmsg("could not locate required checkpoint record"), + (errmsg("could not locate a valid checkpoint record in backup_label file"), errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery options.\n" "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n" "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.", @@ -744,7 +744,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
Moving the report from security to -hackers on Noah's advice. Since the function(s) involved in the crash are not present in any of the released versions, it is not considered a security issue. I can confirm that this is reproducible on the latest commit on master, 3c0bcdbc66. Below is the original analysis, followed by Noah's analysis. To be able to reproduce it, please note that perl support is required; hence `./configure --with-perl`. The note about 'security concerns around on_plperl_init parameter', below, refers to a now-fixed issue, at commit 13d8388151. Best regards, Gurjeet http://Gurje.et Forwarded Conversation Subject: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS From: Gurjeet Singh Date: Mon, Jul 4, 2022 at 10:24 AM To: Postgres Security Cc: Bossart, Nathan While poking at plperl's GUC in an internal discussion, I was able to induce a crash (or an assertion failure in assert-enabled builds) as an unprivileged user. My investigation so far has revealed that the code path for the following condition has never been tested, and because of this, when a user tries to override an SUSET param via PGOPTIONS, Postgres tries to perform a table lookup during process initialization. Because there's no transaction in progress, and because this table is not in the primed caches, we end up with code trying to dereference an uninitialized CurrentResourceOwner. The condition:
- The user specifies PGOPTIONS="-c custom.param".
- "custom.param" is used by an extension which is specified in session_preload_libraries.
- The extension uses DefineCustom*Variable("custom.param", PGC_SUSET).
- Inside set_config_option(), record->context == PGC_SUSET and context == PGC_BACKEND, which calls pg_parameter_aclcheck() and eventually leads to an assertion failure (or a crash, when assertions are disabled).
See below for 1. How to reproduce, 2. Assertion failure stack, and 3.
Crash stack When the user does not specify PGOPTIONS, the code in define_custom_variable() returns prematurely, after a failed bsearch() lookup, and hence avoids this bug. I think a similar crash can be induced when the custom parameter is of kind PGC_SU_BACKEND, because the code to handle that also invokes pg_parameter_aclcheck(). Also, I believe the same condition would arise if the extension is specified in local_preload_libraries. I haven't been able to think of an attack vector using this bug, but it can be used to cause a denial of service by an authenticated user. I'm sending this report to the security list, instead of -hackers, out of an abundance of caution; please feel free to move it to -hackers, if it's not considered a security concern. I discovered this bug a couple of days ago, just before leaving on a trip. But because of a shortage of time over the weekend, I haven't been able to dig deeper into it. Since I don't think I'll be able to spend any significant time on it for at least another couple of days, I'm sending this report without a patch or a proposed fix. CC: Nathan, whose security concerns around the on_plperl_init parameter led to this discovery. [1]: How to reproduce
$ psql -c 'create user test'
CREATE ROLE
$ psql -c "alter system set session_preload_libraries='plperl'"
ALTER SYSTEM
$ # restart server
$ psql -c 'show session_preload_libraries'
 session_preload_libraries
---
 plperl
(1 row)
$ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
[2]: Assertion failure stack LOG: database system is ready to accept connections TRAP: FailedAssertion("IsTransactionState()", File: "../../../../../../POSTGRES/src/backend/utils/cache/catcache.c", Line: 1209, P ID: 199868) postgres: test postgres [local] startup(ExceptionalCondition+0xd0)[0x55e503a4e6c9] postgres: test postgres [local] startup(+0x7e069b)[0x55e503a2a69b] postgres: test postgres [local] startup(SearchCatCache1+0x3a)[0x55e503a2a56b] postgres: test postgres [local] startup(SearchSysCache1+0xc1)[0x55e503a46fe4] postgres: test postgres [local] startup(pg_parameter_aclmask+0x6f)[0x55e50345f098] postgres: test postgres [local] startup(pg_parameter_aclcheck+0x2d)[0x55e50346039c] postgres: test postgres [local] startup(set_config_option+0x450)[0x55e503a70727] postgres: test postgres [local] startup(+0x829ce8)[0x55e503a73ce8] postgres: test postgres [local] startup(DefineCustomStringVariable+0xa4)[0x55e503a74306] /home/vagrant/dev/POSTGRES_builds/add_tz_param/db/lib/postgresql/plperl.so(_PG_init+0xd7)[0x7fed3d845425] postgres: test postgres [local] startup(+0x80cc50)[0x55e503a56c50] postgres: test postgres [local] startup(load_file+0x43)[0x55e503a566d9] postgres: test postgres [local] startup(+0x81ba89)[0x55e503a65a89] postgres: test postgres [local]
Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?
On Thu, Jul 21, 2022 at 3:47 AM David Rowley wrote: > On Wed, 20 Jul 2022 at 21:19, Richard Guo wrote: > > So the idea is if the ECs used by the mergeclauses are prefix of the > > query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why > > not relax this further that if the ECs in the mergeclauses and the > > query_pathkeys have common prefix, we use that prefix as pathkeys? So > > that we can have a plan like below: > > I don't think that's a clear-cut win. There is scoring code in there > to try to arrange the pathkey list in the order of > most-useful-to-upper-level-joins firsts. If we were to do as you > describe we could end up generating worse plans when there is some > subsequent Merge Join above this one that has join conditions that the > query_pathkeys are not compatible with. Yeah, you're right. Although we would try different permutations of the pathkeys in sort_inner_and_outer(), that does not cover every possible ordering due to cost considerations. So we still need to respect the heuristics behind the pathkey order returned by this function, which is the scoring logic trying to list most-useful-to-upper-level-joins keys earlier. > Maybe your idea could be made to work in cases where > bms_equal(joinrel->relids, root->all_baserels). In that case, we > should not be processing any further joins and don't need to consider > that as a factor for the scoring. That should work, as long as this case is common enough to be worth writing the code. Thanks Richard
Re: Improve description of XLOG_RUNNING_XACTS
On 2022/07/21 10:13, Masahiko Sawada wrote: Hi, I realized that standby_desc_running_xacts() in standbydesc.c doesn't describe subtransaction XIDs. I've attached the patch to improve the description. +1 The patch looks good to me. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Allowing REINDEX to have an optional name
On Wed, Jul 20, 2022 at 07:27:07AM -0500, Justin Pryzby wrote: > Looks fine Thanks for checking, applied. -- Michael
Re: Memory leak fix in psql
On Thu, Jul 21, 2022 at 09:10:43AM +0800, Japin Li wrote: > Yeah, we should take care of the backpatch risk. However, I think > it makes sense to backpatch. We are talking about 256 bytes being leaked in each loop when a validation pattern fails or when a query fails, so I don't see a strong argument for manipulating 10~14 more than necessary for this amount of memory. The contents of describe.c are the same for v15 though, and we are still in beta on REL_15_STABLE, so I have applied the patch down to v15, adding what Alvaro has sent on top of the rest. -- Michael
Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.
On Wed, Jul 20, 2022 at 02:32:44PM +0530, Amit Kapila wrote: > On Thu, Jul 14, 2022 at 4:34 AM Peter Smith wrote: >> By using a bitmask I think there is an implication that the flags can >> be combined... >> >> Perhaps it is not a problem today, but later you may want more flags. e.g. >> #define SUBSCRIPTION_REL_STATE_READY 0x02 /* READY relations */ >> >> then the bitmask idea falls apart because IIUC you have no intentions >> to permit things like: >> (SUBSCRIPTION_REL_STATE_NOT_READY | SUBSCRIPTION_REL_STATE_READY) > > I think this will be an invalid combination if caller ever used it. > However, one might need to use a combination like > (SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE). For such > cases, I feel the bitmask idea will be better. It feels unnatural to me to have a flag saying "not-ready" and one saying "ready", while we could have a flag saying "ready" that can be combined with a second flag to decide if the contents of srsubstate should be matched or *not* matched with the states expected by the caller. This could be extended to more state values, for example. I am not sure if we actually need this much as I have no idea if future features would use it, so please take my suggestion lightly :) -- Michael signature.asc Description: PGP signature
Re: System catalog documentation chapter
On Wed, Jul 20, 2022 at 04:32:46PM -0400, Bruce Momjian wrote: > On Wed, Jul 20, 2022 at 04:23:21PM -0400, Isaac Morland wrote: > > Will there be a comprehensive list somewhere? Even if it just lists the > > views, > > gives maybe a one-sentence description, and links to the relevant chapter, I > > would find that helpful sometimes. > > I was not planning on that since we don't do that in any other cases I > can think of. > > > I ask because I occasionally find myself wanting a comprehensive list of > > functions, and as far as I can tell it doesn't exist. I'm hoping to avoid > > that > > situation for views. > > Well, then we just leave them where they are and link to them from other > parts of the documentation, which I assume/hope we already do. People have mentioned the view documentation doesn't belong in the Internals part. Maybe we can just move it to the Server Administration part and leave it together. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Improve description of XLOG_RUNNING_XACTS
Hi, I realized that standby_desc_running_xacts() in standbydesc.c doesn't describe subtransaction XIDs. I've attached the patch to improve the description. Here is an example from pg_waldump: * HEAD rmgr: Standby len (rec/tot): 58/58, tx: 0, lsn: 0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050 latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048 * w/ patch rmgr: Standby len (rec/tot): 58/58, tx: 0, lsn: 0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050 latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048; 1 subxacts: 1049 Please review it. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ 0001-Improve-description-of-XLOG_RUNNING_XACTS.patch Description: Binary data
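The improved description can be sketched as a standalone helper. The function name and format string below approximate the pg_waldump output shown above; they are illustrative, not the committed code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int TransactionId;

/* Sketch of the improved behavior: after listing the top-level XIDs,
 * also list the subtransaction XIDs, mirroring the
 * "N xacts: ...; M subxacts: ..." output shown in the example. */
static void
desc_running_xacts(char *out, size_t outsz,
				   const TransactionId *xacts, int nxacts,
				   const TransactionId *subxacts, int nsubxacts)
{
	size_t		off = 0;

	off += snprintf(out + off, outsz - off, "%d xacts:", nxacts);
	for (int i = 0; i < nxacts; i++)
		off += snprintf(out + off, outsz - off, " %u", xacts[i]);

	/* the new part: describe subtransaction XIDs when present */
	if (nsubxacts > 0)
	{
		off += snprintf(out + off, outsz - off, "; %d subxacts:", nsubxacts);
		for (int i = 0; i < nsubxacts; i++)
			off += snprintf(out + off, outsz - off, " %u", subxacts[i]);
	}
}
```

With the example record's XIDs (one top-level xact 1048 with subxact 1049), this produces the "1 xacts: 1048; 1 subxacts: 1049" tail from the "w/ patch" output.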
Re: Memory leak fix in psql
On Wed, 20 Jul 2022 at 21:54, Tom Lane wrote: > Japin Li writes: >> Thanks for updating the patch. It looks good. However, it cannot be >> applied on 14 stable. The attached patches are for 10-14. > > While I think this is good cleanup, I'm doubtful that any of these > leaks are probable enough to be worth back-patching into stable > branches. The risk of breaking something should not be neglected. > Yeah, we should take care of the backpatch risk. However, I think it makes sense to backpatch. -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Memory leak fix in psql
On Wed, Jul 20, 2022 at 10:05:47AM +0200, Alvaro Herrera wrote: > More on the same tune. Thanks. I have noticed that as well. I'll include all that in the set. -- Michael
Re: [PATCH] Log details for client certificate failures
Jacob Champion writes: > I'm currently hardcoding an elevel of ERROR on the new guc_strdup()s, > because that seems to be a common case for the check hooks. Really? That's almost certainly NOT okay. As an example, if you have a problem with a new value loaded from postgresql.conf during SIGHUP processing, throwing ERROR will cause the postmaster to exit. I wouldn't be too surprised if there are isolated cases where people didn't understand what they were doing and wrote that, but that needs to be fixed not emulated. regards, tom lane
Re: [PATCH] Log details for client certificate failures
On Wed, Jul 20, 2022 at 3:15 PM Tom Lane wrote: > guc_malloc's behavior varies depending on elevel. It's *not* > equivalent to palloc. Right, sorry -- a better way for me to ask the question: I'm currently hardcoding an elevel of ERROR on the new guc_strdup()s, because that seems to be a common case for the check hooks. If that's okay, is there any reason not to use palloc() semantics for pg_clean_ascii()? (And if it's not okay, why?) --Jacob
Re: [PATCH] Log details for client certificate failures
Jacob Champion writes: > The guc_strdup() approach really reduces the amount of code, so that's > what I did in v3. I'm not following why we need to return NULL on > failure, though -- both palloc() and guc_malloc() ERROR on failure, so > is it okay to keep those semantics the same? guc_malloc's behavior varies depending on elevel. It's *not* equivalent to palloc. regards, tom lane
Re: [PATCH] Log details for client certificate failures
On Tue, Jul 19, 2022 at 3:38 PM Andres Freund wrote: > Or alternatively, perhaps we can just make pg_clean_ascii() return NULL > if allocation failed and then guc_strdup() the result in guc.c? The guc_strdup() approach really reduces the amount of code, so that's what I did in v3. I'm not following why we need to return NULL on failure, though -- both palloc() and guc_malloc() ERROR on failure, so is it okay to keep those semantics the same? > If we end up needing a two phase approach, why use the same function for > both phases? That seems quite awkward. Mostly so the byte counting always agrees between the two phases, no matter how the implementation evolves. But it's hopefully moot now. --Jacob From dfd76f4dbcf3834371442d593d315797762bbd11 Mon Sep 17 00:00:00 2001 From: Jacob Champion Date: Tue, 19 Jul 2022 10:48:58 -0700 Subject: [PATCH v3 1/2] pg_clean_ascii(): escape bytes rather than lose them Rather than replace each unprintable byte with a '?' character, replace it with a hex escape instead. The API now allocates a copy rather than modifying the input in place. 
---
 src/backend/postmaster/postmaster.c |  6 +-
 src/backend/utils/misc/guc.c        |  4 ++--
 src/common/string.c                 | 26 +-
 src/include/common/string.h         |  2 +-
 4 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 1c25457526..5e8cd770c0 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2284,11 +2284,7 @@ retry1:
 			 */
 			if (strcmp(nameptr, "application_name") == 0)
 			{
-				char	   *tmp_app_name = pstrdup(valptr);
-
-				pg_clean_ascii(tmp_app_name);
-
-				port->application_name = tmp_app_name;
+				port->application_name = pg_clean_ascii(valptr);
 			}
 		}
 		offset = valoffset + strlen(valptr) + 1;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0328029d43..60400752e5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -12480,7 +12480,7 @@ static bool
 check_application_name(char **newval, void **extra, GucSource source)
 {
 	/* Only allow clean ASCII chars in the application name */
-	pg_clean_ascii(*newval);
+	*newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
 
 	return true;
 }
@@ -12496,7 +12496,7 @@ static bool
 check_cluster_name(char **newval, void **extra, GucSource source)
 {
 	/* Only allow clean ASCII chars in the cluster name */
-	pg_clean_ascii(*newval);
+	*newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
 
 	return true;
 }
diff --git a/src/common/string.c b/src/common/string.c
index 16940d1fa7..db15324c62 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "lib/stringinfo.h"
 
 /*
@@ -59,9 +60,9 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 
 /*
- * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
+ * pg_clean_ascii -- Replace any non-ASCII chars with a "\xXX" string
  *
- * Modifies the string passed in which must be '\0'-terminated.
+ * Makes a palloc'd copy of the string passed in, which must be '\0'-terminated.
  *
  * This function exists specifically to deal with filtering out
  * non-ASCII characters in a few places where the client can provide an almost
@@ -73,22 +74,29 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
  * In general, this function should NOT be used- instead, consider how to handle
  * the string without needing to filter out the non-ASCII characters.
  *
- * Ultimately, we'd like to improve the situation to not require stripping out
- * all non-ASCII but perform more intelligent filtering which would allow UTF or
+ * Ultimately, we'd like to improve the situation to not require replacing all
+ * non-ASCII but perform more intelligent filtering which would allow UTF or
  * similar, but it's unclear exactly what we should allow, so stick to ASCII only
  * for now.
  */
-void
-pg_clean_ascii(char *str)
+char *
+pg_clean_ascii(const char *str)
 {
-	/* Only allow clean ASCII chars in the string */
-	char	   *p;
+	StringInfoData buf;
+	const char *p;
+
+	initStringInfo(&buf);
 
 	for (p = str; *p != '\0'; p++)
 	{
+		/* Only allow clean ASCII chars in the string */
 		if (*p < 32 || *p > 126)
-			*p = '?';
+			appendStringInfo(&buf, "\\x%02x", (unsigned char) *p);
+		else
+			appendStringInfoChar(&buf, *p);
 	}
+
+	return buf.data;
 }
diff --git a/src/include/common/string.h b/src/include/common/string.h
index cf00fb53cd..d10d0c9cbf 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -24,7 +24,7 @@ typedef struct PromptInterruptContext
 extern bool pg_str_endswith(const char *str, const char *end);
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base);
-extern void pg_clean_ascii(char *str);
+extern char *pg_clean_ascii(const char *str);
Re: System catalog documentation chapter
On Wed, Jul 20, 2022 at 04:23:21PM -0400, Isaac Morland wrote: > On Wed, 20 Jul 2022 at 16:08, Bruce Momjian wrote: > > On Tue, Jul 19, 2022 at 01:41:44PM -0400, Bruce Momjian wrote: > > I am going to look at moving system views that make sense into the > > chapters where their contents are mentioned. I don't think having a > > central list of views is really helping us because we expect the views > > to be used in ways the system catalogs would not be. > > I have grouped the views by topic. What I would like to do next is to > move these view sections to the end of relevant documentation chapters. > Is that going to be an improvement? > > > Will there be a comprehensive list somewhere? Even if it just lists the views, > gives maybe a one-sentence description, and links to the relevant chapter, I > would find that helpful sometimes. I was not planning on that since we don't do that in any other cases I can think of. > I ask because I occasionally find myself wanting a comprehensive list of > functions, and as far as I can tell it doesn't exist. I'm hoping to avoid that > situation for views. Well, then we just leave them where they are and link to them from other parts of the documentation, which I assume/hope we already do. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: System catalog documentation chapter
On Wed, 20 Jul 2022 at 16:08, Bruce Momjian wrote: > On Tue, Jul 19, 2022 at 01:41:44PM -0400, Bruce Momjian wrote: > > I am going to look at moving system views that make sense into the > > chapters where their contents are mentioned. I don't think having a > > central list of views is really helping us because we expect the views > > to be used in ways the system catalogs would not be. > > I have grouped the views by topic. What I would like to do next is to > move these view sections to the end of relevant documentation chapters. > Is that going to be an improvement? Will there be a comprehensive list somewhere? Even if it just lists the views, gives maybe a one-sentence description, and links to the relevant chapter, I would find that helpful sometimes. I ask because I occasionally find myself wanting a comprehensive list of functions, and as far as I can tell it doesn't exist. I'm hoping to avoid that situation for views.
Re: System catalog documentation chapter
On Tue, Jul 19, 2022 at 01:41:44PM -0400, Bruce Momjian wrote: > I am going to look at moving system views that make sense into the > chapters where their contents are mentioned. I don't think having a > central list of views is really helping us because we expect the views > to be used in ways the system catalogs would not be. I have grouped the views by topic. What I would like to do next is to move these view sections to the end of relevant documentation chapters. Is that going to be an improvement? --- pg_available_extensions pg_available_extension_versions pg_backend_memory_contexts pg_config pg_cursors pg_file_settings pg_hba_file_rules pg_ident_file_mappings pg_settings pg_locks pg_policies pg_prepared_statements pg_prepared_xacts pg_publication_tables pg_replication_origin_status pg_replication_slots pg_group pg_roles pg_shadow pg_user pg_user_mappings pg_shmem_allocations pg_stats pg_stats_ext pg_stats_ext_exprs pg_timezone_abbrevs pg_timezone_names pg_indexes pg_matviews pg_rules pg_seclabels pg_sequences pg_tables pg_views -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?
On Wed, 20 Jul 2022 at 21:19, Richard Guo wrote: > So the idea is if the ECs used by the mergeclauses are a prefix of the > query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why > not relax this further that if the ECs in the mergeclauses and the > query_pathkeys have a common prefix, we use that prefix as pathkeys? So > that we can have a plan like below: I don't think that's a clear-cut win. There is scoring code in there to try to arrange the pathkey list in the order of most-useful-to-upper-level-joins first. If we were to do as you describe we could end up generating worse plans when there is some subsequent Merge Join above this one that has join conditions that the query_pathkeys are not compatible with. Maybe your idea could be made to work in cases where bms_equal(joinrel->relids, root->all_baserels). In that case, we should not be processing any further joins and don't need to consider that as a factor for the scoring. David
Re: pg_auth_members.grantor is bunk
On Fri, Jun 24, 2022 at 4:46 PM Robert Haas wrote: > On Fri, Jun 24, 2022 at 4:30 PM David G. Johnston > wrote: > >> Upthread, I proposed that "drop role baz" should fail here > > > > I concur with this. > > > > I think that the grantor owns the grant, and that REASSIGNED OWNED should > > be able to move those grants to someone else. > > > > By extension, DROP OWNED should remove them. > > Interesting. I hadn't thought about changing the behavior of DROP > OWNED BY and REASSIGN OWNED BY. A quick experiment supports your > interpretation: This experiment was insufficiently thorough. I see now that, for other object types, DROP OWNED BY does work in the way that you propose, but REASSIGN OWNED BY does not. Here's a better test: rhaas=# create table foo(); CREATE TABLE rhaas=# create role bar; CREATE ROLE rhaas=# create role baz; CREATE ROLE rhaas=# grant select on table foo to bar with grant option; GRANT rhaas=# set role bar; SET rhaas=> grant select on table foo to baz; GRANT rhaas=> reset role; RESET rhaas=# drop role bar; ERROR: role "bar" cannot be dropped because some objects depend on it DETAIL: privileges for table foo rhaas=# create role quux; CREATE ROLE rhaas=# reassign owned by bar to quux; REASSIGN OWNED rhaas=# drop role bar; ERROR: role "bar" cannot be dropped because some objects depend on it DETAIL: privileges for table foo rhaas=# drop owned by bar; DROP OWNED rhaas=# drop role bar; DROP ROLE This behavior might look somewhat bizarre, but there's actually a good reason for it: the system guarantees that whoever is listed as the grantor of a privilege has the *current* right to grant that privilege. It can't categorically change the grantor of every privilege given by bar to quux because quux might not and in fact does not have the right to grant select on table foo to baz. Now, you might be thinking, ah, but what if the superuser performed the grant? They could cease to be the superuser later, and then the rule would be violated! 
But actually not, because a grant by the superuser is imputed to the table owner, who always has the right to grant all rights on the table, and if the table owner is ever changed, all the grants imputed to the old table owner are changed to have their grantor as the new table owner. Similarly, trying to revoke select, or the grant option on it, from bar would fail. So it looks pretty intentional, and pretty tightly-enforced, that every role listed as a grantor must be one which is currently able to grant that privilege. And that means that REASSIGN OWNED can't just do a blanket change to the recorded grantor. It could try to do so, I suppose, and just throw an error if it doesn't work out, but that might make REASSIGN OWNED fail a lot more often, which could suck. In any event, the implemented behavior is that REASSIGN OWNED does nothing about permissions, but DROP OWNED cascades to grantors. This is SORT OF documented, although the documentation only mentions that DROP OWNED cascades to privileges granted *to* the target role, and does not mention that it also cascades to privileges granted *by* the target role. The previous version of the patch makes both DROP OWNED and REASSIGN OWNED cascade to grantors, but I now think that, for consistency, I'd better look into changing it so that only DROP OWNED cascades. I think perhaps I should be using SHARED_DEPENDENCY_ACL instead of SHARED_DEPENDENCY_OWNER. -- Robert Haas EDB: http://www.enterprisedb.com
Re: shared-memory based stats collector - v70
Hi, On July 20, 2022 8:41:53 PM GMT+02:00, Greg Stark wrote: >On Wed, 20 Jul 2022 at 12:08, Tom Lane wrote: >> >> AFAIR, the previous stats collector implementation had no such provision >> either: it'd just keep adding hashtable entries as it received info about >> new objects. The only thing that's changed is that now those entries are >> in shared memory instead of process-local memory. We'd be well advised to >> be sure that memory can be swapped out under pressure, but otherwise I'm >> not seeing that things have gotten worse. > >Just to be clear I'm not looking for ways things have gotten worse. >Just trying to understand what I'm reading and I guess I came in with >assumptions that led me astray. > >But... adding entries as it received info about new objects isn't the >same as having info on everything. I didn't really understand how the >old system worked but if you had a very large schema but each session >only worked with a small subset did the local stats data ever absorb >info on the objects it never touched? Each backend only had stats for things it touched. But the stats collector read all files at startup into hash tables and absorbed all generated stats into those as well. >All that said -- having all objects loaded in shared memory makes my >work way easier. What are you trying to do? >It actually seems feasible to dump out all the >objects from shared memory and including objects from other databases >and if I don't need a consistent snapshot it even seems like it would >be possible to do that without having a copy of more than one stats >entry at a time in local memory. I hope that doesn't cause huge >contention on the shared hash table to be doing that regularly. The stats accessors now default to not creating a full snapshot of stats data at first access (but that's configurable). So yes, that behavior is possible. E.g. autovac now uses a single object access like you describe. Andres -- Sent from my Android device with K-9 Mail. 
Please excuse my brevity.
Re: make -C libpq check fails obscurely if tap tests are disabled
On 2022-07-20 We 13:23, Justin Pryzby wrote: > make -C ./src/interfaces/libpq check > PATH=... && @echo "TAP tests not enabled. Try configuring with > --enable-tap-tests" > /bin/sh: 1: @echo: not found > > make is telling the shell to run "@echo", rather than running "echo" > silently. > > Since: > > commit 6b04abdfc5e0653542ac5d586e639185a8c61a39 > Author: Andres Freund > Date: Sat Feb 26 16:51:47 2022 -0800 > > Run tap tests in src/interfaces/libpq. Yeah. It's a bit ugly but I think the attached would fix it.

cheers

andrew

-- Andrew Dunstan EDB: https://www.enterprisedb.com

diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index b5fd72a4ac..50aba04a4f 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -144,10 +144,18 @@ test-build:
 	$(MAKE) -C test all
 
 check: test-build all
+ifeq ($(enable_tap_tests),yes)
 	PATH="$(CURDIR)/test:$$PATH" && $(prove_check)
+else
+	$(prove_check)
+endif
 
 installcheck: test-build all
+ifeq ($(enable_tap_tests),yes)
 	PATH="$(CURDIR)/test:$$PATH" && $(prove_installcheck)
+else
+	$(prove_installcheck)
+endif
 
 installdirs: installdirs-lib
 	$(MKDIR_P) '$(DESTDIR)$(includedir)' '$(DESTDIR)$(includedir_internal)' '$(DESTDIR)$(datadir)'
Re: shared-memory based stats collector - v70
On Wed, 20 Jul 2022 at 12:08, Tom Lane wrote: > > AFAIR, the previous stats collector implementation had no such provision > either: it'd just keep adding hashtable entries as it received info about > new objects. The only thing that's changed is that now those entries are > in shared memory instead of process-local memory. We'd be well advised to > be sure that memory can be swapped out under pressure, but otherwise I'm > not seeing that things have gotten worse. Just to be clear I'm not looking for ways things have gotten worse. Just trying to understand what I'm reading and I guess I came in with assumptions that led me astray. But... adding entries as it received info about new objects isn't the same as having info on everything. I didn't really understand how the old system worked but if you had a very large schema but each session only worked with a small subset did the local stats data ever absorb info on the objects it never touched? All that said -- having all objects loaded in shared memory makes my work way easier. It actually seems feasible to dump out all the objects from shared memory and including objects from other databases and if I don't need a consistent snapshot it even seems like it would be possible to do that without having a copy of more than one stats entry at a time in local memory. I hope that doesn't cause huge contention on the shared hash table to be doing that regularly. -- greg
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
On Wed, Jul 20, 2022 at 12:50 PM Andres Freund wrote:
> On 2022-07-14 18:44:48 -0400, Melanie Plageman wrote:
> > @@ -1427,8 +1445,10 @@ pgstat_read_statsfile(void)
> > 	FILE	   *fpin;
> > 	int32		format_id;
> > 	bool		found;
> > +	PgStat_BackendIOPathOps io_stats;
> > 	const char *statfile = PGSTAT_STAT_PERMANENT_FILENAME;
> > 	PgStat_ShmemControl *shmem = pgStatLocal.shmem;
> > +	PgStatShared_BackendIOPathOps *io_stats_shmem = &shmem->io_ops;
> >
> > 	/* shouldn't be called from postmaster */
> > 	Assert(IsUnderPostmaster || !IsPostmasterEnvironment);
> > @@ -1486,6 +1506,22 @@ pgstat_read_statsfile(void)
> > 	if (!read_chunk_s(fpin, &shmem->checkpointer.stats))
> > 		goto error;
> >
> > +	/*
> > +	 * Read IO Operations stats struct
> > +	 */
> > +	if (!read_chunk_s(fpin, &io_stats))
> > +		goto error;
> > +
> > +	io_stats_shmem->stat_reset_timestamp = io_stats.stat_reset_timestamp;
> > +
> > +	for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> > +	{
> > +		PgStat_IOPathOps *stats = &io_stats.stats[i];
> > +		PgStatShared_IOPathOps *stats_shmem = &io_stats_shmem->stats[i];
> > +
> > +		memcpy(stats_shmem->data, stats->data, sizeof(stats->data));
> > +	}
>
> Why can't the data be read directly into shared memory?

It is not the same lock. Each PgStatShared_IOPathOps has a lock so that they can be accessed individually (per BackendType in PgStatShared_BackendIOPathOps). It is optimized for the more common operation of flushing at the expense of the snapshot operation (which should be less common) and reset operation.

> > +void
> > +pgstat_io_ops_snapshot_cb(void)
> > +{
> > +	PgStatShared_BackendIOPathOps *all_backend_stats_shmem = &pgStatLocal.shmem->io_ops;
> > +	PgStat_BackendIOPathOps *all_backend_stats_snap = &pgStatLocal.snapshot.io_ops;
> > +
> > +	for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> > +	{
> > +		PgStatShared_IOPathOps *stats_shmem = &all_backend_stats_shmem->stats[i];
> > +		PgStat_IOPathOps *stats_snap = &all_backend_stats_snap->stats[i];
> > +
> > +		LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
>
> Why acquire the same lock repeatedly for each type, rather than once for the whole?

This is also because of having a LWLock in each PgStatShared_IOPathOps. Because I don't want a lock in the backend local stats, I have two data structures PgStatShared_IOPathOps and PgStat_IOPathOps. I thought it was odd to write out the lock to the file, so when persisting the stats, I write out the relevant data only and when reading it back in to shared memory, I read in the data member of PgStatShared_IOPathOps.
make -C libpq check fails obscurely if tap tests are disabled
make -C ./src/interfaces/libpq check PATH=... && @echo "TAP tests not enabled. Try configuring with --enable-tap-tests" /bin/sh: 1: @echo: not found make is telling the shell to run "@echo", rather than running "echo" silently. Since: commit 6b04abdfc5e0653542ac5d586e639185a8c61a39 Author: Andres Freund Date: Sat Feb 26 16:51:47 2022 -0800 Run tap tests in src/interfaces/libpq. -- Justin
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Hi,

On 2022-07-14 18:44:48 -0400, Melanie Plageman wrote:
> Subject: [PATCH v26 1/4] Add BackendType for standalone backends

> Subject: [PATCH v26 2/4] Remove unneeded call to pgstat_report_wal()

LGTM.

> Subject: [PATCH v26 3/4] Track IO operation statistics

> @@ -978,8 +979,17 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
>
> 	bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
>
> +	if (isLocalBuf)
> +		io_path = IOPATH_LOCAL;
> +	else if (strategy != NULL)
> +		io_path = IOPATH_STRATEGY;
> +	else
> +		io_path = IOPATH_SHARED;

Seems a bit ugly to have an if (isLocalBuf) just after an isLocalBuf ?:.

> +	/*
> +	 * When a strategy is in use, reused buffers from the strategy ring will
> +	 * be counted as allocations for the purposes of IO Operation statistics
> +	 * tracking.
> +	 *
> +	 * However, even when a strategy is in use, if a new buffer must be
> +	 * allocated from shared buffers and added to the ring, this is counted
> +	 * as a IOPATH_SHARED allocation.
> +	 */

There's a bit too much duplication between the paragraphs...

> @@ -628,6 +637,9 @@ pgstat_report_stat(bool force)
> 	/* flush database / relation / function / ... stats */
> 	partial_flush |= pgstat_flush_pending_entries(nowait);
>
> +	/* flush IO Operations stats */
> +	partial_flush |= pgstat_flush_io_ops(nowait);

Could you either add a note to the commit message that the stats file version needs to be increased, or just include that in the patch.

> @@ -1427,8 +1445,10 @@ pgstat_read_statsfile(void)
> 	FILE	   *fpin;
> 	int32		format_id;
> 	bool		found;
> +	PgStat_BackendIOPathOps io_stats;
> 	const char *statfile = PGSTAT_STAT_PERMANENT_FILENAME;
> 	PgStat_ShmemControl *shmem = pgStatLocal.shmem;
> +	PgStatShared_BackendIOPathOps *io_stats_shmem = &shmem->io_ops;
>
> 	/* shouldn't be called from postmaster */
> 	Assert(IsUnderPostmaster || !IsPostmasterEnvironment);
> @@ -1486,6 +1506,22 @@ pgstat_read_statsfile(void)
> 	if (!read_chunk_s(fpin, &shmem->checkpointer.stats))
> 		goto error;
>
> +	/*
> +	 * Read IO Operations stats struct
> +	 */
> +	if (!read_chunk_s(fpin, &io_stats))
> +		goto error;
> +
> +	io_stats_shmem->stat_reset_timestamp = io_stats.stat_reset_timestamp;
> +
> +	for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> +	{
> +		PgStat_IOPathOps *stats = &io_stats.stats[i];
> +		PgStatShared_IOPathOps *stats_shmem = &io_stats_shmem->stats[i];
> +
> +		memcpy(stats_shmem->data, stats->data, sizeof(stats->data));
> +	}

Why can't the data be read directly into shared memory?

> +void
> +pgstat_io_ops_snapshot_cb(void)
> +{
> +	PgStatShared_BackendIOPathOps *all_backend_stats_shmem = &pgStatLocal.shmem->io_ops;
> +	PgStat_BackendIOPathOps *all_backend_stats_snap = &pgStatLocal.snapshot.io_ops;
> +
> +	for (int i = 0; i < BACKEND_NUM_TYPES; i++)
> +	{
> +		PgStatShared_IOPathOps *stats_shmem = &all_backend_stats_shmem->stats[i];
> +		PgStat_IOPathOps *stats_snap = &all_backend_stats_snap->stats[i];
> +
> +		LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);

Why acquire the same lock repeatedly for each type, rather than once for the whole?

> +	/*
> +	 * Use the lock in the first BackendType's PgStat_IOPathOps to protect the
> +	 * reset timestamp as well.
> +	 */
> +	if (i == 0)
> +		all_backend_stats_snap->stat_reset_timestamp = all_backend_stats_shmem->stat_reset_timestamp;

Which also would make this look a bit less awkward.

Starting to look pretty good...

- Andres
Re: shared-memory based stats collector - v70
Hi, On 2022-07-20 12:08:35 -0400, Tom Lane wrote: > AFAIR, the previous stats collector implementation had no such provision > either: it'd just keep adding hashtable entries as it received info about > new objects. Yep. > The only thing that's changed is that now those entries are in shared > memory instead of process-local memory. We'd be well advised to be > sure that memory can be swapped out under pressure, but otherwise I'm > not seeing that things have gotten worse. FWIW, I ran a few memory usage benchmarks. Without stats accesses the memory usage with shared memory stats was sometimes below, sometimes above the "old" memory usage, depending on the number of objects. As soon as there's stats access, it's well below (that includes things like autovac workers). I think there's quite a bit of memory usage reduction potential around dsa.c - we occasionally end up with [nearly] unused dsm segments. Greetings, Andres Freund
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On 2022-Jul-20, Alvaro Herrera wrote: > On 2022-Jul-20, Alvaro Herrera wrote: > > > I also change the use of allow_invalid_pages to > > allow_in_place_tablespaces. We could add a > > separate GUC for this, but it seems overengineering. > > Oh, but allow_in_place_tablespaces doesn't exist in versions 14 and > older, so this strategy doesn't really work. ... and get_dirent_type is new in 14, so that'll be one more hurdle. -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "Cuando no hay humildad las personas se degradan" (A. Christie)
Re: Logical insert/update/delete WAL records for custom table AMs
On Mon, 2022-03-21 at 17:43 -0700, Andres Freund wrote: > Currently this doesn't apply: > http://cfbot.cputube.org/patch_37_3394.log Withdrawn for now. With custom WAL resource managers this is less important to me. I still think it has value, and I'm willing to work on it if more use cases come forward. Regards, Jeff Davis
Re: shared-memory based stats collector - v70
Melanie Plageman writes: > On Wed, Jul 20, 2022 at 11:35 AM Greg Stark wrote: >> It seems like if we really think the total number of database objects >> is reasonably limited to scales that fit in RAM there would be a much >> simpler database design that would just store the catalog tables in >> simple in-memory data structures and map them all on startup without >> doing all the work Postgres does to make relational storage scale. > I think efforts to do such a thing have gotten caught up in solving > issues around visibility and managing the relationship between local and > global caches [1]. It doesn't seem like the primary technical concern > was memory usage. AFAIR, the previous stats collector implementation had no such provision either: it'd just keep adding hashtable entries as it received info about new objects. The only thing that's changed is that now those entries are in shared memory instead of process-local memory. We'd be well advised to be sure that memory can be swapped out under pressure, but otherwise I'm not seeing that things have gotten worse. regards, tom lane
Re: shared-memory based stats collector - v70
Hi, On 2022-07-20 11:35:13 -0400, Greg Stark wrote: > Is it true that the shared memory allocation contains the hash table > entry and body of every object in every database? Yes. However, note that that was already the case with the old stats collector - it also kept everything in memory. In addition every read access to stats loaded a copy of the stats (well of the global stats and the relevant per-database stats). It might be worth doing something fancier at some point - the shared memory stats was already a huge effort, cramming yet another change in there would pretty much have guaranteed that it'd fail. Greetings, Andres Freund
Re: pgsql: Default to hidden visibility for extension libraries where possi
On Wed, 20 Jul 2022 at 16:12, Tom Lane wrote: > Alvaro Herrera writes: > > On 2022-Jul-20, Tom Lane wrote: > >> I'll try to do some research later today to identify anything else > >> we need to mark in plpgsql. I recall doing some work specifically > >> creating functions for pldebugger's use, but I'll need to dig. > > > I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't > > expose functions directly, but through plpgsql_plugin_ptr. Maybe that > > one does need to be made PGDLLEXPORT, since currently it isn't. > > After some experimentation, it does not need to be marked: pldebugger > gets at that via find_rendezvous_variable(), so there is no need for > any explicit linkage at all between plpgsql.so and plugin_debugger.so. > > Along the way, I made a quick hack to get pldebugger to load into > v15/HEAD. It lacks #ifdef's which'd be needed so that it'd still > compile against older branches, but perhaps this'll save someone > some time. > Thanks Tom - I've pushed that patch with the relevant #ifdefs added. -- Dave Page PostgreSQL Core Team http://www.postgresql.org/
Re: shared-memory based stats collector - v70
On Wed, Jul 20, 2022 at 11:35 AM Greg Stark wrote: > On the one hand the rest of Postgres seems to be designed on the > assumption that the number of tables and database objects is limited > only by disk space. The catalogs are stored in relational storage > which is read through the buffer cache. On the other hand it's true > that syscaches don't do expire entries (though I think the assumption > is that no one backend touches very much). > > It seems like if we really think the total number of database objects > is reasonably limited to scales that fit in RAM there would be a much > simpler database design that would just store the catalog tables in > simple in-memory data structures and map them all on startup without > doing all the work Postgres does to make relational storage scale. > I think efforts to do such a thing have gotten caught up in solving issues around visibility and managing the relationship between local and global caches [1]. It doesn't seem like the primary technical concern was memory usage. [1] https://www.postgresql.org/message-id/flat/4E72940DA2BF16479384A86D54D0988A567B9245%40G01JPEXMBKW04
Re: Allow placeholders in ALTER ROLE w/o superuser
Kyotaro Horiguchi writes: > At Tue, 19 Jul 2022 09:53:39 -0700, Nathan Bossart > wrote in >> Hm. I would expect ALTER ROLE to store the PGC_SUSET context when executed >> by a superuser or a role with privileges via pg_parameter_acl. Storing all >> placeholder GUC settings as PGC_USERSET would make things more restrictive >> than they are today. For example, it would no longer be possible to apply >> any ALTER ROLE settings from superusers for placeholders that later become >> custom GUCS. > Returning to the topic, that operation can be allowed in PG15, having > being granted by superuser using the GRANT SET ON PARMETER command. I think that 13d838815 has completely changed the terms that this discussion needs to be conducted under. It seems clear to me now that if you want to relax this only-superusers restriction, what you have to do is store the OID of the role that issued ALTER ROLE/DB SET, and then apply the same checks that would be used in the ordinary case where a placeholder is being filled in after being set intra-session. That is, we'd no longer assume that a value coming from pg_db_role_setting was set with superuser privileges, but we'd know exactly who did set it. This might also tie into Nathan's question in another thread about exactly what permissions should be required to issue ALTER ROLE/DB SET. In particular I'm wondering if different permissions should be needed to override an existing entry than if there is no existing entry. If not, we could find ourselves downgrading a superuser-set entry to a non-superuser-set entry, which might have bad consequences later (eg, by rendering the entry nonfunctional because when we actually load the extension we find out the GUC is SUSET). 
Possibly related to this: I felt while working on 13d838815 that PGC_SUSET and PGC_SU_BACKEND should be usable as GucContext values for GUC variables, indicating that the GUC requires special privileges to be set, but we should no longer use them as passed-in GucContext values. That is, we should remove privilege tests from the call sites, like this:

 	(void) set_config_option(stmt->name, ExtractSetVariableArgs(stmt),
-							 (superuser() ? PGC_SUSET : PGC_USERSET),
+							 PGC_USERSET,
 							 PGC_S_SESSION,
 							 action, true, 0, false);

and instead put that behavior inside set_config_option_ext, which would want to look at superuser_arg(srole) instead, and indeed might not need to do anything because pg_parameter_aclcheck would subsume the test. I didn't pursue this further because it wasn't essential to fixing the bug. But it seems relevant here, because that line of thought leads to the conclusion that storing PGC_SUSET vs PGC_USERSET is entirely the wrong approach.

There is a bunch of infrastructure work that has to be done if anyone wants to make this happen:

* redesign physical representation of pg_db_role_setting
* be sure to clean up if a role mentioned in pg_db_role_setting is dropped
* pg_dump would need to be taught to dump the state of affairs correctly

regards, tom lane
Re: shared-memory based stats collector - v70
So I'm finally wrapping my head around this new code. There is something I'm surprised by that perhaps I'm misreading, or perhaps I shouldn't be surprised by, not sure.

Is it true that the shared memory allocation contains the hash table entry and body of every object in every database? I guess I was assuming I would find some kind of LRU cache which loaded data from disk on demand. But afaict it loads everything on startup and then never loads from disk later. The disk is purely for recovering state after a restart.

On the one hand, the rest of Postgres seems to be designed on the assumption that the number of tables and database objects is limited only by disk space. The catalogs are stored in relational storage which is read through the buffer cache. On the other hand, it's true that syscaches don't expire entries (though I think the assumption is that no one backend touches very much).

It seems like if we really think the total number of database objects is reasonably limited to scales that fit in RAM, there would be a much simpler database design that would just store the catalog tables in simple in-memory data structures and map them all on startup, without doing all the work Postgres does to make relational storage scale.
Re: Remove useless arguments in ReadCheckpointRecord().
On Wed, Jul 20, 2022 at 8:21 PM Fujii Masao wrote:
>
> Hi,
>
> I'd like to propose to remove "whichChkpt" and "report" arguments in
> ReadCheckpointRecord(). "report" is obviously useless because it's always
> true, i.e., there are two callers of the function and they always specify
> true as "report".

Yes, the report parameter is obviously safe to delete. Commit 1d919de5eb removed the only call that passed report as false.

> "whichChkpt" indicates where the specified checkpoint location came from,
> pg_control or backup_label. This information is used to log different
> messages as follows.
>
> switch (whichChkpt)
> {
>     case 1:
>         ereport(LOG,
>                 (errmsg("invalid primary checkpoint link in control file")));
>         break;
>     default:
>         ereport(LOG,
>                 (errmsg("invalid checkpoint link in backup_label file")));
>         break;
> }
> return NULL;
> ...
> switch (whichChkpt)
> {
>     case 1:
>         ereport(LOG,
>                 (errmsg("invalid primary checkpoint record")));
>         break;
>     default:
>         ereport(LOG,
>                 (errmsg("invalid checkpoint record")));
>         break;
> }
> return NULL;
> ...
>
> But the callers of ReadCheckpointRecord() already output different log
> messages depending on where the invalid checkpoint record came from. So even
> if ReadCheckpointRecord() doesn't use "whichChkpt", i.e., use the same log
> message in both pg_control and backup_label cases, users can still identify
> where the invalid checkpoint record came from, by reading the log message.
>
> Also when whichChkpt = 0, "primary checkpoint" is used in the log message and
> sounds confusing because, as far as I recall correctly, we removed the
> concept of primary and secondary checkpoints before.

Yes, using "primary checkpoint" is confusing, as we usually refer to "primary" in the context of replication and HA.

> Therefore I think that it's better to remove "whichChkpt" argument in
> ReadCheckpointRecord().
>
> Patch attached. Thought?

How about we transform the following messages into something like below?
(errmsg("could not locate a valid checkpoint record")));
after ReadCheckpointRecord() for the control file cases, to:
"could not locate valid checkpoint record in control file"

(errmsg("could not locate required checkpoint record"),
after ReadCheckpointRecord() for the backup_label case, to:
"could not locate valid checkpoint record in backup_label file"

The above messages give more meaningful and unique info to the users.

This may be unrelated, but IIRC, for errors like ereport(PANIC, (errmsg("could not locate a valid checkpoint record"))); we wanted to add a hint asking users to consider running pg_resetwal to fix the issue. The error for ReadCheckpointRecord() in the backup_label file cases already gives a hint: errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" ..

Adding the hint of running pg_resetwal (of course with a caution that it can cause inconsistency in the data and should be used only as a last resort, as described in the docs) helps users and support engineers a lot to mitigate server-down cases quickly.

Regards,
Bharath Rupireddy.
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On 2022-Jul-20, Alvaro Herrera wrote:

> I also change the use of allow_invalid_pages to
> allow_in_place_tablespaces. We could add a
> separate GUC for this, but it seems overengineering.

Oh, but allow_in_place_tablespaces doesn't exist in versions 14 and older, so this strategy doesn't really work. I see the following alternatives:

1. not backpatch this fix to 14 and older
2. use a different GUC; either allow_invalid_pages as previously suggested, or create a new one just for this purpose
3. not provide any overriding mechanism in versions 14 and older

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Always assume the user will do much worse than the stupidest thing you can imagine." (Julien PUYDT)
Re: pgsql: Default to hidden visibility for extension libraries where possi
Alvaro Herrera writes:
> On 2022-Jul-20, Tom Lane wrote:
>> I'll try to do some research later today to identify anything else
>> we need to mark in plpgsql. I recall doing some work specifically
>> creating functions for pldebugger's use, but I'll need to dig.

> I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't
> expose functions directly, but through plpgsql_plugin_ptr. Maybe that
> one does need to be made PGDLLEXPORT, since currently it isn't.

After some experimentation, it does not need to be marked: pldebugger gets at that via find_rendezvous_variable(), so there is no need for any explicit linkage at all between plpgsql.so and plugin_debugger.so.

Along the way, I made a quick hack to get pldebugger to load into v15/HEAD. It lacks #ifdef's which'd be needed so that it'd still compile against older branches, but perhaps this'll save someone some time.

regards, tom lane

diff --git a/plugin_debugger.c b/plugin_debugger.c
index 6620021..1bd2057 100644
--- a/plugin_debugger.c
+++ b/plugin_debugger.c
@@ -114,6 +114,8 @@ static debugger_language_t *debugger_languages[] = {
 	NULL
 };
 
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+
 /**
  * Function declarations
  **/
@@ -124,6 +126,7 @@ void _PG_init( void );	/* initialize this module when we are dynamically load
 /*
  * Local (hidden) function prototypes
  **/
+static void pldebugger_shmem_request( void );
 //static char ** fetchArgNames( PLpgSQL_function * func, int * nameCount );
 static void* writen( int peer, void * src, size_t len );
 static bool connectAsServer( void );
@@ -154,6 +157,15 @@ void _PG_init( void )
 	for (i = 0; debugger_languages[i] != NULL; i++)
 		debugger_languages[i]->initialize();
 
+	prev_shmem_request_hook = shmem_request_hook;
+	shmem_request_hook = pldebugger_shmem_request;
+}
+
+static void pldebugger_shmem_request( void )
+{
+	if (prev_shmem_request_hook)
+		prev_shmem_request_hook();
+
 	reserveBreakpoints();
 	dbgcomm_reserve();
 }
Re: pgsql: Default to hidden visibility for extension libraries where possi
Hi, On July 20, 2022 4:20:04 PM GMT+02:00, Tom Lane wrote: >Alvaro Herrera writes: >> I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't >> expose functions directly, but through plpgsql_plugin_ptr. Maybe that >> one does need to be made PGDLLEXPORT, since currently it isn't. > >Hm. Not sure if the rules are the same for global variables as >they are for functions, but if so, yeah ... They're the same on the export side. On windows the rules for linking to variables are stricter (they need declspec dllimport), but that doesn't matter for dlsym style stuff. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
[PATCH] Renumber confusing value for GUC_UNIT_BYTE
The GUC units are currently defined like:

#define GUC_UNIT_KB			0x1000	/* value is in kilobytes */
#define GUC_UNIT_BLOCKS		0x2000	/* value is in blocks */
#define GUC_UNIT_XBLOCKS	0x3000	/* value is in xlog blocks */
#define GUC_UNIT_MB			0x4000	/* value is in megabytes */
#define GUC_UNIT_BYTE		0x8000	/* value is in bytes */
#define GUC_UNIT_MEMORY		0xF000	/* mask for size-related units */

#define GUC_UNIT_MS			0x1	/* value is in milliseconds */
#define GUC_UNIT_S			0x2	/* value is in seconds */
#define GUC_UNIT_MIN		0x3	/* value is in minutes */
#define GUC_UNIT_TIME		0xF	/* mask for time-related units */

0x3000 and 0x3 seemed wrong, since they're a combination of other flags rather than being an independent power of two. But actually, these aren't flags: they're tested in a "case" statement for equality, not in a bitwise & test. So the outlier is actually 0x8000, added at:

|commit 6e7baa322773ff8c79d4d8883c99fdeff5bfa679
|Author: Andres Freund
|Date: Tue Sep 12 12:13:12 2017 -0700
|
|    Introduce BYTES unit for GUCs.

It looks like that originated here:
https://www.postgresql.org/message-id/CAOG9ApEu8bXVwBxkOO9J7ZpM76TASK_vFMEEiCEjwhMmSLiaqQ%40mail.gmail.com

commit 162e4838103e7957cccfe7868fc28397b55ca1d7
Author: Justin Pryzby
Date: Wed Jul 20 09:27:24 2022 -0500

    Renumber confusing value for GUC_UNIT_BYTE

    It had a power-of-two value, which looks right, and causes the other
    values which aren't powers of two to look wrong. But these values are
    tested for equality, not with a bitwise test.
See also: 6e7baa322773ff8c79d4d8883c99fdeff5bfa679
https://www.postgresql.org/message-id/CAOG9ApEu8bXVwBxkOO9J7ZpM76TASK_vFMEEiCEjwhMmSLiaqQ%40mail.gmail.com

diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 4d0920c42e2..be928fac881 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -219,11 +219,12 @@ typedef enum
 #define GUC_DISALLOW_IN_AUTO_FILE 0x0800	/* can't set in
											 * PG_AUTOCONF_FILENAME */
 
+/* GUC_UNIT_* are not flags - they're tested for equality */
 #define GUC_UNIT_KB			0x1000	/* value is in kilobytes */
 #define GUC_UNIT_BLOCKS		0x2000	/* value is in blocks */
 #define GUC_UNIT_XBLOCKS	0x3000	/* value is in xlog blocks */
 #define GUC_UNIT_MB			0x4000	/* value is in megabytes */
-#define GUC_UNIT_BYTE		0x8000	/* value is in bytes */
+#define GUC_UNIT_BYTE		0x5000	/* value is in bytes */
 #define GUC_UNIT_MEMORY		0xF000	/* mask for size-related units */
 
 #define GUC_UNIT_MS			0x1	/* value is in milliseconds */
Remove useless arguments in ReadCheckpointRecord().
Hi,

I'd like to propose to remove "whichChkpt" and "report" arguments in ReadCheckpointRecord(). "report" is obviously useless because it's always true, i.e., there are two callers of the function and they always specify true as "report".

"whichChkpt" indicates where the specified checkpoint location came from, pg_control or backup_label. This information is used to log different messages as follows.

switch (whichChkpt)
{
    case 1:
        ereport(LOG,
                (errmsg("invalid primary checkpoint link in control file")));
        break;
    default:
        ereport(LOG,
                (errmsg("invalid checkpoint link in backup_label file")));
        break;
}
return NULL;
...
switch (whichChkpt)
{
    case 1:
        ereport(LOG,
                (errmsg("invalid primary checkpoint record")));
        break;
    default:
        ereport(LOG,
                (errmsg("invalid checkpoint record")));
        break;
}
return NULL;
...

But the callers of ReadCheckpointRecord() already output different log messages depending on where the invalid checkpoint record came from. So even if ReadCheckpointRecord() doesn't use "whichChkpt", i.e., use the same log message in both pg_control and backup_label cases, users can still identify where the invalid checkpoint record came from, by reading the log message.

Also when whichChkpt = 0, "primary checkpoint" is used in the log message and sounds confusing because, as far as I recall correctly, we removed the concept of primary and secondary checkpoints before.

Therefore I think that it's better to remove "whichChkpt" argument in ReadCheckpointRecord().

Patch attached. Thought?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

From a33e557a18cf5c45bbcf47f1cf92e26998f1cd67 Mon Sep 17 00:00:00 2001
From: Fujii Masao
Date: Wed, 20 Jul 2022 23:06:44 +0900
Subject: [PATCH] Remove useless arguments in ReadCheckpointRecord().
---
 src/backend/access/transam/xlogrecovery.c | 84 ---
 1 file changed, 15 insertions(+), 69 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5e46..e383c2123a 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -422,8 +422,8 @@ static XLogPageReadResult WaitForWALToBecomeAvailable(XLogRecPtr RecPtr,
 					XLogRecPtr replayLSN,
 					bool nonblocking);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
-static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher, XLogRecPtr RecPtr,
-					int whichChkpt, bool report, TimeLineID replayTLI);
+static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher,
+					XLogRecPtr RecPtr, TimeLineID replayTLI);
 static bool rescanLatestTimeLine(TimeLineID replayTLI, XLogRecPtr replayLSN);
 static int	XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 					XLogSource source, bool notfoundOk);
@@ -605,7 +605,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	 * When a backup_label file is present, we want to roll forward from
 	 * the checkpoint it identifies, rather than using pg_control.
 	 */
-	record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, 0, true,
+	record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
 								  CheckPointTLI);
 	if (record != NULL)
 	{
@@ -744,7 +744,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
 	RedoStartLSN = ControlFile->checkPointCopy.redo;
 	RedoStartTLI = ControlFile->checkPointCopy.ThisTimeLineID;
-	record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, 1, true,
+	record =
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Hi,

> > - Add a new class of invalidation callbacks: Today, if we do ALTER TABLE
> > or CREATE INDEX on a table, the CacheRegisterRelcacheCallback helps us to
> > re-create the cache entries. In this case, as far as I can see, we need a
> > callback that is called when the table is "ANALYZE"d, because that is when
> > the statistics change. That is the time picking a new index makes sense.
> > However, that seems like adding another dimension to this patch, which I
> > can try but also see that committing becomes even harder.
>
> This idea sounds worth investigating. I see that this will require
> more work but OTOH, we can't allow the existing system to regress
> especially because depending on workload it might regress badly.

Just to note if that is not clear: this patch avoids (or at least aims to avoid, assuming no bugs) changing the behavior of existing systems with a PRIMARY KEY or UNIQUE index. In that case, we still use the relevant indexes.

> We can create a patch for this atop the base patch for easier review/test
> but I feel we need some way to address this point.

One other idea could be to re-calculate the index, say after *N* updates/deletes for the table. We may consider using a subscription parameter for getting N -- with a good default, or even hard-code it. I think the cost of re-calculating should really be pretty small compared to the other things happening during logical replication. So, a sane default might work?

If you think the above doesn't work, I can try to work on a separate patch which adds something like an "analyze invalidation callback".

> > - Ask users to manually pick the index they want to use: Currently, the
> > main complexity of the patch comes with the planner related code. In fact,
> > if you look into the logical replication related changes, those are
> > relatively modest changes.
If we can drop the feature that Postgres picks > the index, and provide a user interface to set the indexes per table in the > subscription, we can probably have an easier patch to review & test. For > example, we could add `ALTER SUBSCRIPTION sub ALTER TABLE t USE INDEX i` > type of an API. This also needs some coding, but probably much simpler than > the current code. And, obviously, this pops up the question of can users > pick the right index? > > > > I think picking the right index is one point and another is what if > the subscription has many tables (say 10K or more), doing it for > individual tables per subscription won't be fun. Also, users need to > identify which tables belong to a particular subscription, now, users > can find the same via pg_subscription_rel or some other way but doing > this won't be straightforward for users. So, my inclination would be > to pick the right index automatically rather than getting the input > from the user. > Yes, all makes sense. > > Now, your point related to planner code in the patch bothers me as > well but I haven't studied the patch in detail to provide any > alternatives at this stage. Do you have any other ideas to make it > simpler or solve this problem in some other way? > > One idea I tried earlier was to go over the existing indexes and on the table, then get the IndexInfo via BuildIndexInfo(). And then, try to find a good heuristic to pick an index. In the end, I felt like that is doing a sub-optimal job, requiring a similar amount of code of the current patch, and still using the similar infrastructure. My conclusion for that route was I should either use a very simple heuristic (like pick the index with the most columns) and have a suboptimal index pick, OR use a complex heuristic with a reasonable index pick. And, the latter approach converged to the planner code in the patch. Do you think the former approach is acceptable? Thanks, Onder
Re: pgsql: Default to hidden visibility for extension libraries where possi
Alvaro Herrera writes: > I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't > expose functions directly, but through plpgsql_plugin_ptr. Maybe that > one does need to be made PGDLLEXPORT, since currently it isn't. Hm. Not sure if the rules are the same for global variables as they are for functions, but if so, yeah ... regards, tom lane
Re: Re: pgsql: Default to hidden visibility for extension libraries where possi
Andres Freund writes: > On July 20, 2022 3:54:03 PM GMT+02:00, Robert Haas > wrote: >> The name PGDLLEXPORT is actually slightly misleading, now, because >> there's no longer anything about it that is specific to DLLs. > How so? Right now it's solely used for making symbols in DLLs as exported? I suspect Robert is reading "DLL" as meaning only a Windows thing. You're right, if you read it as a generic term for loadable libraries, it's more or less applicable everywhere. regards, tom lane
Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Hi Pavel,

> Rebased. PFA v13.
> Your reviews are very much welcome!

I noticed that this patch is in "Needs Review" state and it has been stuck for some time now, so I decided to take a look.

```
+SELECT bt_index_parent_check('bttest_a_idx', true, true, true);
+SELECT bt_index_parent_check('bttest_b_idx', true, false, true);
```

1. This "true, false, true" sequence is difficult to read. I suggest we use named arguments here.

2. I believe there are some minor issues with the comments. E.g. instead of:

- First key on next page is same
- Make values 768 and 769 looks equal

I would write:

- The first key on the next page is the same
- Make values 768 and 769 look equal

There are many little errors like these.

```
+# Copyright (c) 2021, PostgreSQL Global Development Group
```

3. Oh no. The copyright has expired!

```
+ true. When checkunique
+ is true bt_index_check will
```

4. This piece of documentation was copy-pasted between two functions without changing the function name.

Other than that, to me the patch looks in pretty good shape. Here is v14 where I fixed my own nitpicks, with the permission of Pavel given offlist. If no one sees any other defects I'm going to change the status of the patch to "Ready for Committer" in a short time.

--
Best regards,
Aleksander Alekseev

v14-0001-Add-option-for-amcheck-and-pg_amcheck-to-check-u.patch
Description: Binary data
Re: pgsql: Default to hidden visibility for extension libraries where possi
Hi,

On July 20, 2022 3:54:03 PM GMT+02:00, Robert Haas wrote:
>On Wed, Jul 20, 2022 at 9:39 AM Tom Lane wrote:
>> ISTM that a comment pointing out that the functions marked PGDLLEXPORT
>> are meant to be externally accessible should be sufficient.
>
>The name PGDLLEXPORT is actually slightly misleading, now, because
>there's no longer anything about it that is specific to DLLs.

How so? Right now it's solely used for marking symbols in DLLs as exported?

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size
On Wed, Jul 20, 2022 at 12:37 AM Andres Freund wrote: > On 2022-07-19 20:40:11 +0900, Amit Langote wrote: > > About that, I was wondering if the blocks in llvm_compile_expr() need > > to be hand-coded to match what's added in ExecInterpExpr() or if I've > > missed some tool that can be used instead? > > The easiest way is to just call an external function for the implementation of > the step. But yes, otherwise you need to handcraft it. Ok, thanks. So I started updating llvm_compile_expr() for handling the new ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly realized that code could have been consolidated into less code, or IOW, into fewer new ExprEvalSteps. So, I refactored things that way and am now retrying adding the code to llvm_compile_expr() based on new, better consolidated, code. Here's the updated version, without the llvm pieces, in case you'd like to look at it even in this state. I'll post a version with llvm pieces filled in tomorrow. (I have merged the different patches into one for convenience.) -- Thanks, Amit Langote EDB: http://www.enterprisedb.com v3-0001-Some-improvements-to-JsonExpr-evaluation.patch Description: Binary data
Re: pgsql: Default to hidden visibility for extension libraries where possi
Robert Haas writes: > On Wed, Jul 20, 2022 at 9:39 AM Tom Lane wrote: >> ISTM that a comment pointing out that the functions marked PGDLLEXPORT >> are meant to be externally accessible should be sufficient. > The name PGDLLEXPORT is actually slightly misleading, now, because > there's no longer anything about it that is specific to DLLs. True, but the mess of changing it seems to outweigh any likely clarity gain. As long as there's adequate commentary about what it means, I'm okay with the existing naming. regards, tom lane
Re: pgsql: Default to hidden visibility for extension libraries where possi
On 2022-Jul-20, Tom Lane wrote: > I'll try to do some research later today to identify anything else > we need to mark in plpgsql. I recall doing some work specifically > creating functions for pldebugger's use, but I'll need to dig. I suppose you're probably thinking of commit 53ef6c40f1e7; that didn't expose functions directly, but through plpgsql_plugin_ptr. Maybe that one does need to be made PGDLLEXPORT, since currently it isn't. That was also reported by Pavel. He was concerned about plpgsql_check, though, not pldebugger. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'. After collecting 500 such letters, he mused, a university somewhere in Arizona would probably grant him a degree. (Don Knuth)
Re: Memory leak fix in psql
Japin Li writes: > Thanks for updating the patch. It looks good. However, it cannot be > applied on 14 stable. The attached patches are for 10-14. While I think this is good cleanup, I'm doubtful that any of these leaks are probable enough to be worth back-patching into stable branches. The risk of breaking something should not be neglected. regards, tom lane
Re: pgsql: Default to hidden visibility for extension libraries where possi
On Wed, Jul 20, 2022 at 9:39 AM Tom Lane wrote: > ISTM that a comment pointing out that the functions marked PGDLLEXPORT > are meant to be externally accessible should be sufficient. The name PGDLLEXPORT is actually slightly misleading, now, because there's no longer anything about it that is specific to DLLs. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Remove fls(), use pg_bitutils.h facilities instead?
Thomas Munro writes: > That double eval macro wasn't nice. This time with a static inline function. Seems like passing a size_t to pg_leftmost_one_pos32 isn't great. It was just as wrong before (if the caller-supplied argument is indeed a size_t), but no time like the present to fix it. We could have pg_bitutils.h #define pg_leftmost_one_pos_size_t as the appropriate one of pg_leftmost_one_pos32/64, perhaps. regards, tom lane
Re: pgsql: Default to hidden visibility for extension libraries where possi
Alvaro Herrera writes: > No immediate plans for splitting plpgsql.h, so if anyone wants to take a > stab at that, be my guest. ISTM that a comment pointing out that the functions marked PGDLLEXPORT are meant to be externally accessible should be sufficient. I'll try to do some research later today to identify anything else we need to mark in plpgsql. I recall doing some work specifically creating functions for pldebugger's use, but I'll need to dig. regards, tom lane
Re: GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.
Hello

On 2022-Jul-20, Amul Sul wrote:

> If you look at GetFlushRecPtr() function the OUT parameter for
> TimeLineID is optional, and it is not the only one; see
> GetWalRcvFlushRecPtr(), GetXLogReplayRecPtr(), etc.
>
> I think we have missed that for GetStandbyFlushRecPtr(); to be in
> line with those, we should change this as follows:

This is something we decide mostly on a case-by-case basis. There's no fixed rule that all out params have to be optional. If anything is improved by this change, let's see what it is.

--
Álvaro Herrera, Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: Remove fls(), use pg_bitutils.h facilities instead?
Thomas Munro writes: > On Wed, Jul 20, 2022 at 4:52 PM Tom Lane wrote: >> I think we could probably just drop fls() entirely. It doesn't look >> to me like any of the existing callers expect a zero argument, so they >> could be converted to use pg_leftmost_one_pos32() pretty trivially. > That was not true for the case in contiguous_pages_to_segment_bin(), > in dsa.c. If it's just one place like that (and, hrrm, curiously > there is an open issue about binning quality on my to do list...), How is it sane to ask for a segment bin for zero pages? Seems like something should have short-circuited such a case well before here. regards, tom lane
Re: PATCH: Add Table Access Method option to pgbench
Hi! On Tue, Jul 19, 2022 at 4:47 AM Michael Paquier wrote: > On Mon, Jul 18, 2022 at 01:53:21PM +0300, Alexander Korotkov wrote: > > Looks good to me as well. I'm going to push this if no objections. > > FWIW, I find the extra mention of PGOPTIONS with the specific point of > table AMs added within the part of the environment variables a bit > confusing, because we already mention PGOPTIONS for serializable > transactions a bit down. Hence, my choice would be the addition of an > extra paragraph in the "Notes", named "Table Access Methods", just > before or after "Good Practices". My 2c. Thank you. Pushed applying the suggestion above. -- Regards, Alexander Korotkov
Re: Allowing REINDEX to have an optional name
On Tue, Jul 19, 2022 at 01:13:34PM +0900, Michael Paquier wrote: > > It looks like you named the table "toast_relfilenodes", but then also store > > to it data for non-toast tables. > > How about naming that index_relfilenodes? One difference with what I > posted previously and 5fb5b6 is the addition of an extra regclass that > stores the parent table, for reference in the output. Looks fine > - 'CREATE TABLE toast_relfilenodes (parent regclass, indname regclass, > relfilenode oid);' > + 'CREATE TABLE index_relfilenodes (parent regclass, indname regclass, > relfilenode oid);' -- Justi
Re: Making pg_rewind faster
Hi Michael, Thank you for your feedback, I've incorporated your suggestions by scanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied over to the target server. I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached. Thanks, Justin From: Michael Paquier Sent: Tuesday, July 19, 2022 1:36 AM To: Justin Kwan Cc: Tom Lane; pgsql-hackers; vignesh; jk...@cloudflare.com; vignesh ravichandran; hlinn...@iki.fi Subject: Re: Making pg_rewind faster On Mon, Jul 18, 2022 at 05:14:00PM +, Justin Kwan wrote: > Thank you for taking a look at this and that sounds good. I will > send over a patch compatible with Postgres v16. +$node_2->psql( + 'postgres', + "SELECT extract(epoch from modification) FROM pg_stat_file('pg_wal/00010003');", + stdout => \my $last_common_tli1_wal_last_modified_at); Please note that you should not rely on the FS-level stats for anything that touches the WAL segments. A rough guess about what you could here to make sure that only the set of WAL segments you are looking for is being copied over would be to either: - Scan the logs produced by pg_rewind and see if the segments are copied or not, depending on the divergence point (aka the last checkpoint before WAL forked). - Clean up pg_wal/ in the target node before running pg_rewind, checking that only the segments you want are available once the operation completes. -- Michael v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patch Description: v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patch
Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages
On Wed, Jul 20, 2022 at 12:55 PM Kyotaro Horiguchi wrote:
>
> At Wed, 20 Jul 2022 10:02:22 +0530, Bharath Rupireddy wrote in
> > Done. PSA v6 patch set.
>
> Thanks!
>
> -    Specifies the minimum size of past log file segments kept in the
> +    Specifies the minimum size of past WAL files kept in the
>
> -    log file by reducing the value of this parameter. On a system with
> +    file by reducing the value of this parameter. On a system with
>
> Looks fine. And postgresql.conf.sample has the following lines:
>
> > #archive_library = ''  # library to use to archive a logfile segment
> > #archive_command = ''  # command to use to archive a logfile segment
> > #archive_timeout = 0   # force a logfile segment switch after this
> > #restore_command = ''  # command to use to restore an archived logfile segment
>
> Don't they need the same fix?

Indeed. Thanks. Now, they are in sync with their peers in the .conf.sample file as well as the descriptions in guc.c. PSA v7 patch set.

Regards,
Bharath Rupireddy.

v7-0001-Use-WAL-segment-instead-of-log-segment.patch
Description: Binary data
v7-0002-Consistently-use-WAL-file-s-in-docs.patch
Description: Binary data
Re: Windows default locale vs initdb
On Wed, Jul 20, 2022 at 10:27 PM Juan José Santamaría Flecha wrote:
> On Tue, Jul 19, 2022 at 4:47 AM Thomas Munro wrote:
>> As for whether "accordingly" still applies, by the logic of
>> win32_langinfo()... Windows still considers WIN1252 to be the default
>> ANSI code page for "en-US", though it'd work with UTF-8 too. I'm not
>> sure what to make of that. The goal here was to give Windows users
>> good defaults, but WIN1252 is probably not what most people actually
>> want. Hmph.
>
> Still, WIN1252 is not the wrong answer for what we are asking. Even if you
> enable UTF-8 support [1], the system will use the current default Windows
> ANSI code page (ACP) for the locale and UTF-8 for the code page.

I'm still confused about what that means. Suppose we decided to insist by adding a ".UTF-8" suffix to the name, as that page says we can now that we're on Windows 10+, when building the default locale name (see experimental 0002 patch, attached). It initially seemed to have the right effect:

The database cluster will be initialized with locale "en-US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

But then the Turkish i test in contrib/citext/sql/citext_utf8.sql failed[1]:

 SELECT 'i'::citext = 'İ'::citext AS t;
  t
 ---
- t
+ f
 (1 row)

About the pg_upgrade problem, maybe it's OK ... existing old-format names should continue to work, but we can still remove the weird code that does locale name tweaking, right? pg_upgraded databases should contain fixed names (i.e., names that were fixed by old initdb, so they should continue to work), and new clusters will get BCP 47 names. I don't really know, I was just playing with rough ideas by sending patches to CI here...
[1] https://cirrus-ci.com/task/6423238052937728 From b007eb45e575956d5035f4152f72177abddc2762 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 19 Jul 2022 06:31:17 +1200 Subject: [PATCH v3 1/3] Default to BCP 47 locale in initdb on Windows. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Avoid selecting traditional Windows locale names written with English words, because they are unstable and not recommended for use in databases. Since setlocale() returns such names, on Windows use GetUserDefaultLocaleName() if the user didn't provide an explicit locale. Also update the documentation to recommend BCP 47 over the traditional names when providing explicit values to initdb. Reviewed-by: Juan José Santamaría Flecha Discussion: https://postgr.es/m/CA%2BhUKGJ%3DXThErgAQRoqfCy1bKPxXVuF0%3D2zDbB%2BSxDs59pv7Fw%40mail.gmail.com --- doc/src/sgml/charset.sgml | 10 -- src/bin/initdb/initdb.c | 31 +-- 2 files changed, 37 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml index 445fd175d8..b656ca489f 100644 --- a/doc/src/sgml/charset.sgml +++ b/doc/src/sgml/charset.sgml @@ -83,8 +83,14 @@ initdb --locale=sv_SE system under what names depends on what was provided by the operating system vendor and what was installed. On most Unix systems, the command locale -a will provide a list of available locales. -Windows uses more verbose locale names, such as German_Germany -or Swedish_Sweden.1252, but the principles are the same. + + + +Windows uses BCP 47 language tags, like ICU. +For example, sv-SE represents Swedish as spoken in Sweden. +Windows also supports more verbose locale names based on English words, +such as German_Germany or Swedish_Sweden.1252, +but these are not recommended. 
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 89b888eaa5..3af08b7b99 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -59,6 +59,10 @@ #include "sys/mman.h" #endif +#ifdef WIN32 +#include +#endif + #include "access/xlog_internal.h" #include "catalog/pg_authid_d.h" #include "catalog/pg_class_d.h" /* pgrminclude ignore */ @@ -2007,6 +2011,7 @@ locale_date_order(const char *locale) static void check_locale_name(int category, const char *locale, char **canonname) { + char *locale_copy; char *save; char *res; @@ -2022,10 +2027,30 @@ check_locale_name(int category, const char *locale, char **canonname) /* for setlocale() call */ if (!locale) - locale = ""; + { +#ifdef WIN32 + wchar_t wide_name[LOCALE_NAME_MAX_LENGTH]; + char name[LOCALE_NAME_MAX_LENGTH]; + + /* use Windows API to find the default in BCP47 format */ + if (GetUserDefaultLocaleName(wide_name, LOCALE_NAME_MAX_LENGTH) == 0) + pg_fatal("failed to get default locale name: error code %lu", + GetLastError()); + if (WideCharToMultiByte(CP_ACP, 0, wide_name, -1, name, +LOCALE_NAME_MAX_LENGTH, NULL, NULL) == 0) + pg_fatal("failed to convert locale name: error code %lu", + GetLastError()); + locale_copy = pg_strdup(name); +#else + /* use
Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?
Hi, After the commit [1], is it correct to say errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE))); in do_pg_backup_stop() when we hold the contents in backend global memory, not actually reading from backup_label file? However, it is correct to say that in read_backup_label. IMO, we can either say "invalid backup_label contents found" or we can be more descriptive and say "invalid "START WAL LOCATION" line found in backup_label content" and "invalid "BACKUP FROM" line found in backup_label content" and so on. Thoughts? [1] commit 39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a Author: Stephen Frost Date: Wed Apr 6 14:41:03 2022 -0400 Remove exclusive backup mode errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE))); Regards, Bharath Rupireddy.
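For context, the messages under discussion come from sanity checks on the backup_label contents. A minimal sketch of that kind of check — with hypothetical names, not the real read_backup_label() code, and a simplified line format — could look like:

```c
#include <stdio.h>
#include <assert.h>

/*
 * Hypothetical sketch of the kind of validation read_backup_label() does:
 * parse the "START WAL LOCATION" line and treat a failed match as invalid
 * backup_label content. Function name and message wording are illustrative.
 */
static int
parse_start_wal_location(const char *line, unsigned *hi, unsigned *lo)
{
	char		ch;

	/* the real line also carries a WAL file name after the LSN */
	if (sscanf(line, "START WAL LOCATION: %X/%X%c", hi, lo, &ch) != 3)
	{
		/* where the server would report the error being debated */
		fprintf(stderr, "invalid \"START WAL LOCATION\" line in backup_label content\n");
		return 0;
	}
	return 1;
}
```

A line such as `START WAL LOCATION: 0/2000028 (file 000000010000000000000002)` parses, while any other leading text is rejected, which is the distinction the proposed per-line messages would surface.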
Re: GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.
Hi Amul, > - *tli = replayTLI; > + if (tli) > + *tli = replayTLI; I would guess the difference here is that GetStandbyFlushRecPtr is static. It is used 3 times in walsender.c and in all cases the argument can't be NULL. So I'm not certain what we will gain from the proposed check. -- Best regards, Aleksander Alekseev
GetStandbyFlushRecPtr() : OUT param is not optional like elsewhere.
Hi, If you look at the GetFlushRecPtr() function, the OUT parameter for TimeLineID is optional, and it is not the only one; see GetWalRcvFlushRecPtr(), GetXLogReplayRecPtr(), etc. I think we have missed that for GetStandbyFlushRecPtr(); to be in line with the others, we should change it as follows: --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -3156,7 +3156,8 @@ GetStandbyFlushRecPtr(TimeLineID *tli) receivePtr = GetWalRcvFlushRecPtr(NULL, &receiveTLI); replayPtr = GetXLogReplayRecPtr(&replayTLI); - *tli = replayTLI; + if (tli) + *tli = replayTLI; Thoughts? -- Regards, Amul Sul EDB: http://www.enterprisedb.com
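The convention being requested — an OUT parameter that callers may pass as NULL — can be sketched in isolation. The types and values below are stand-ins, not the real walsender code:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int TimeLineID;
typedef unsigned long long XLogRecPtr;

/*
 * Stand-in for GetStandbyFlushRecPtr(): with the proposed check, callers
 * that do not care about the timeline can simply pass NULL, matching the
 * behavior of GetFlushRecPtr() and friends.
 */
static XLogRecPtr
get_standby_flush_rec_ptr(TimeLineID *tli)
{
	TimeLineID	replayTLI = 1;			/* would come from replay state */
	XLogRecPtr	flushPtr = 0x2000060ULL;	/* would come from WAL receiver */

	if (tli)					/* OUT parameter is optional */
		*tli = replayTLI;
	return flushPtr;
}
```

With this shape, `get_standby_flush_rec_ptr(NULL)` is safe; without the check, a NULL argument would dereference a null pointer.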
Re: Collect ObjectAddress for ATTACH DETACH PARTITION to use in event trigger
On Fri, Jul 15, 2022 at 11:39 AM houzj.f...@fujitsu.com wrote: > > On Friday, July 15, 2022 11:41 AM Michael Paquier wrote: > > Hi, > > > > On Fri, Jul 15, 2022 at 03:21:30AM +, kuroda.hay...@fujitsu.com wrote: > > > Sounds good. I grepped ATExecXXX() functions called in ATExecCmd(), > > > and I confirmed that all returned values have been collected except them. > > > > > > While checking test code test about EVENT TRIGGER, I found there were > > > no tests related with partitions in that. > > > How about adding them? > > > > Agreed. It would be good to track down what changes once those > > ObjectAddresses are collected. > > Thanks for having a look. It was a bit difficult to add a test for this. > Because we currently don't have a user function which can return these > collected ObjectAddresses for ALTER TABLE. And It seems we don't have tests > for > already collected ObjectAddresses as well :( > > The collected ObjectAddresses is in > "currentEventTriggerState->currentCommand->d.alterTable.subcmds.address" while > the public function pg_event_trigger_ddl_commands doesn't return these > information. It can only be used in user defined event trigger function (C > code). > > If we want to add some tests for both already existed and newly added > ObjectAddresses, we might need to add some test codes in test_ddl_deparse.c. > What do you think ? > Right. But, I noticed that get_altertable_subcmdtypes() doesn't handle AT_AttachPartition or AT_DetachPartition. We can handle those and at least have a test for those in test_ddl_deparse/sql/alter_table.sql. I know it is not directly related to your patch but that way we will at least have some tests for Attach/Detach partition and in the future when we extend it to test ObjectAddresses of subcommands that would be handy. Feel free to write a separate patch for the same. -- With Regards, Amit Kapila.
Re: Make name optional in CREATE STATISTICS
On Wed, 13 Jul 2022 at 08:07, Simon Riggs wrote: > > On Thu, 7 Jul 2022 at 11:58, Matthias van de Meent > wrote: > > A more correct version would be > > > > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ] > > [(stat types)] > > There you go Thanks! I think this is ready for a committer, so I've marked it as such. Kind regards, Matthias van de Meent
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
v26 here. I spent some time fighting the readdir() stuff for Windows (so that get_dirent_type returns LNK for junction points) but couldn't make it work and was unable to figure out why. So I ended up doing what do_pg_backup_start is already doing: an #ifdef to call pgwin32_is_junction instead. I remove the newly added path_is_symlink function, because I realized that it would mean an extra syscall everywhere other than Windows. So if somebody wants to fix get_dirent_type() so that it works properly on Windows, we can change all these places together. I also change the use of allow_invalid_pages to allow_in_place_tablespaces. We could add a separate GUC for this, but it seems like overengineering. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Most hackers will be perfectly comfortable conceptualizing users as entropy sources, so let's move on." (Nathaniel Smith) >From 26a0be53592a20aa09501e9015f77a4b3c3c3302 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Wed, 13 Jul 2022 18:14:18 +0200 Subject: [PATCH v26] Fix replay of create database records on standby Crash recovery on standby may encounter missing directories when replaying create database WAL records. Prior to this patch, the standby would fail to recover in such a case. However, the directories could be legitimately missing. Consider a sequence of WAL records as follows: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, the crash recovery must be able to move on. This patch allows missing tablespaces to be created during recovery before reaching consistency. The tablespaces are created as real directories that should not exist and will be removed before reaching consistency. CheckRecoveryConsistency is responsible for making sure they have disappeared. 
The problems detected by this new code are reported as PANIC, except when allow_in_place_tablespaces is set to ON, in which case they are WARNING. Apart from making tests possible, this gives users an escape hatch in case things don't go as planned. Diagnosed-by: Paul Guo Author: Paul Guo Author: Kyotaro Horiguchi Author: Asim R Praveen Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27dn5xwj2p9-roh16e4...@mail.gmail.com --- src/backend/access/transam/xlogrecovery.c | 54 +++ src/backend/commands/dbcommands.c | 77 ++ src/backend/commands/tablespace.c | 28 +--- src/test/recovery/t/033_replay_tsp_drops.pl | 155 4 files changed, 287 insertions(+), 27 deletions(-) create mode 100644 src/test/recovery/t/033_replay_tsp_drops.pl diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 5d6f1b5e46..850ab6d7e6 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -42,6 +42,7 @@ #include "access/xlogutils.h" #include "catalog/pg_control.h" #include "commands/tablespace.h" +#include "common/file_utils.h" #include "miscadmin.h" #include "pgstat.h" #include "postmaster/bgwriter.h" @@ -2008,6 +2009,51 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI) } } +/* + * Verify that, in non-test mode, ./pg_tblspc doesn't contain any real + * directories. + * + * Replay of database creation XLOG records for databases that were later + * dropped can create fake directories in pg_tblspc. By the time consistency + * is reached these directories should have been removed; here we verify + * that this did indeed happen. This is to be called at the point where + * consistent state is reached. + * + * allow_in_place_tablespaces turns the PANIC into a WARNING, which is + * useful for testing purposes, and also allows for an escape hatch in case + * things go south. 
+ */ +static void +CheckTablespaceDirectory(void) +{ + DIR *dir; + struct dirent *de; + + dir = AllocateDir("pg_tblspc"); + while ((de = ReadDir(dir, "pg_tblspc")) != NULL) + { + char path[MAXPGPATH + 10]; + + /* Skip entries of non-oid names */ + if (strspn(de->d_name, "0123456789") != strlen(de->d_name)) + continue; + + snprintf(path, sizeof(path), "pg_tblspc/%s", de->d_name); + +#ifdef WIN32 + if (!pgwin32_is_junction(path)) +#else + if (get_dirent_type(path, de, false, ERROR) != PGFILETYPE_LNK) +#endif + ereport(allow_in_place_tablespaces ? WARNING : PANIC, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg("unexpected directory entry \"%s\" found in %s", + de->d_name, "pg_tblspc/"), + errdetail("All directory entries in pg_tblspc/ should be symbolic links."), + errhint("Remove those directories, or set allow_in_place_tablespaces to ON transiently to let recovery complete."))); + } +} + /* * Checks if recovery has reached a consistent state. When consistency is * reached and we have
Re: Windows default locale vs initdb
On Tue, Jul 19, 2022 at 4:47 AM Thomas Munro wrote: > As for whether "accordingly" still applies, by the logic of > win32_langinfo()... Windows still considers WIN1252 to be the default > ANSI code page for "en-US", though it'd work with UTF-8 too. I'm not > sure what to make of that. The goal here was to give Windows users > good defaults, but WIN1252 is probably not what most people actually > want. Hmph. > Still, WIN1252 is not the wrong answer for what we are asking. Even if you enable UTF-8 support [1], the system will use the current default Windows ANSI code page (ACP) for the locale and UTF-8 for the code page. [1] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170 Regards, Juan José Santamaría Flecha
Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?
On Wed, Jul 20, 2022 at 11:03 AM David Rowley wrote: > I think we can relax this now that we have incremental sort. I think > a better way to limit this is to allow a prefix of the query_pathkeys > providing that covers *all* of the join pathkeys. That way, for the > above query, it leaves it open for the planner to do the Merge Join by > sorting by a.a DESC then just do an Incremental Sort to get the > GroupAggregate input sorted by a.b. So the idea is if the ECs used by the mergeclauses are prefix of the query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why not relax this further that if the ECs in the mergeclauses and the query_pathkeys have common prefix, we use that prefix as pathkeys? So that we can have a plan like below: # explain (costs off) select * from t1 join t2 on t1.c = t2.c and t1.a = t2.a order by t1.a DESC, t1.b; QUERY PLAN --- Incremental Sort Sort Key: t1.a DESC, t1.b Presorted Key: t1.a -> Merge Join Merge Cond: ((t1.a = t2.a) AND (t1.c = t2.c)) -> Sort Sort Key: t1.a DESC, t1.c -> Seq Scan on t1 -> Sort Sort Key: t2.a DESC, t2.c -> Seq Scan on t2 (11 rows) Thanks Richard
Re: Pluggable toaster
Hi hackers! We really need your feedback on the last patchset update!

On a previous question about TOAST API overhead: please check the script in the attachment. We tested INSERT, UPDATE and SELECT operations, and ran it on vanilla master and on patched master (vanilla with the untouched TOAST implementation, and patched with default TOAST implemented via the TOAST API; in this patch set, with patches up to 0005_bytea_appendable_toaster installed). Some of the test scripts will be included in the patch set later, as an additional patch.

Currently I'm working on an update to the default Toaster (some internal optimizations, not affecting functionality) and readme files explaining Pluggable TOAST.

An example of using a custom Toaster:

Custom Toaster extension definition (developer):

CREATE FUNCTION custom_toaster_handler(internal)
RETURNS toaster_handler
AS 'MODULE_PATHNAME' LANGUAGE C;

CREATE TOASTER custom_toaster HANDLER custom_toaster_handler;

User's POV:

CREATE EXTENSION custom_toaster;

select * from pg_toaster;
  oid  |    tsrname     |       tsrhandler
-------+----------------+-------------------------
  9864 | deftoaster     | default_toaster_handler
 32772 | custom_toaster | custom_toaster_handler

CREATE TABLE tst1 (
  c1 text STORAGE plain,
  c2 text STORAGE external TOASTER custom_toaster,
  id int4
);

ALTER TABLE tst1 ALTER COLUMN c1 SET TOASTER custom_toaster;

=# \d+ tst1
 Column |  Type   | Collation | Nullable | Default | Storage  |    Toaster     |...
--------+---------+-----------+----------+---------+----------+----------------+...
 c1     | text    |           |          |         | plain    | deftoaster     |...
 c2     | text    |           |          |         | external | custom_toaster |...
 id     | integer |           |          |         | plain    |                |...
Access method: heap

Thanks! Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/

On Wed, Jul 13, 2022 at 10:45 PM Nikita Malakhov wrote: > Hi hackers! > According to previous requests the patch branch was cleaned up from > garbage, logs, etc. All conflicts' resolutions were merged > into patch commits where they appear, branch was rebased to present one > commit for one patch. The branch was actualized, > and a fresh patch set was generated. 
> https://github.com/postgrespro/postgres/tree/toasterapi_clean > > What we propose in short: > We suggest a way to make TOAST pluggable as Storage (in a way like > Pluggable Access Methods) - detached TOAST > mechanics from Heap AM, and made it an independent pluggable and > extensible part with our freshly developed TOAST API. > With this patch set you will be able to develop and plug in your own TOAST > mechanics > for table columns. Knowing internals > and/or workflow and workload of data being TOASTed makes Custom Toasters much > more efficient in performance and storage. > We keep backwards compatibility and default TOAST mechanics works as it > worked previously, working silently with any > Toastable datatype > (and TOASTed values and tables from previous versions, no changes in this) > and set as default Toaster is not stated otherwise, > but through our TOAST API. > > We've already presented out work at HighLoad, PgCon and PgConf > conferences, you can find materials here > http://www.sai.msu.su/~megera/postgres/talks/ > Testing scripts used in talks are a bit scarce and have a lot of > manual handling, so it is another bit of work to bunch them into > patch set, please be patient, I'll try to make it ASAP. > > We have ready to plug in extension Toasters > - bytea appendable toaster for bytea datatype (impressive speedup with > bytea append operation) is included in this patch set; > - JSONB toaster for JSONB (very cool performance improvements when > dealing with TOASTed JSONB) will be provided later; > - Prototype Toasters (in development) for PostGIS (much faster then > default with geometric data), large binary objects > (like pg_largeobject, but much, much larger, and without existing large > object limitations), and currently we're checking default > Toaster implementation without using Indexes (direct access by TIDs, up > to 3 times faster than default on smaller values, > less storage due to absence of index tree). 
> > Patch set consists of 8 incremental patches: > 0001_create_table_storage_v5.patch - SQL syntax fix for CREATE TABLE > clause, processing SET STORAGE... correctly; > This patch is already discussed in a separate thread; > > 0002_toaster_interface_v8.patch - TOAST API interface and SQL syntax > allowing creation of custom Toaster (CREATE TOASTER ...) > and setting Toaster to a table column (CREATE TABLE t (data bytea STORAGE > EXTERNAL TOASTER bytea_toaster);) > > 0003_toaster_default_v7.patch - Default TOAST implemented via TOAST API; > > 0004_toaster_snapshot_v7.patch - refactoring of Default TOAST and support > for versioned Toast rows; > > 0005_bytea_appendable_toaster_v7.patch - contrib module > bytea_appendable_toaster - special Toaster
Re: Handle infinite recursion in logical replication setup
On Wed, Jul 20, 2022 at 10:38 AM Peter Smith wrote: > > On Tue, Jul 19, 2022 at 11:34 PM Amit Kapila wrote: > > > > On Mon, Jul 18, 2022 at 9:46 PM vignesh C wrote: > > > > > > I have updated the patch to handle the origin value case > > > insensitively. The attached patch has the changes for the same. > > > > > > > Thanks, the patch looks mostly good to me. I have made a few changes > > in 0001 patch which are as follows: (a) make a comparison of origin > > names in maybe_reread_subscription similar to slot names as in future > > we may support origin names other than 'any' and 'none', (b) made > > comment changes at few places and minor change in one of the error > > message, (c) ran pgindent and slightly changed the commit message. > > > > I am planning to push this day after tomorrow unless there are any > > comments/suggestions. > > FYI, the function name in the comment is not same as the function name here: > > +/* > + * IsReservedName > + * True iff name is either "none" or "any". > + */ > +static bool > +IsReservedOriginName(const char *name) Modified. Apart from this I have run pgperltidy on the perl file and renamed 032_origin.pl to 030_origin.pl as currently there is 029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file. Thanks for the comment, the attached patch has the changes for the same. Regards, Vignesh From 69f9b55aa28a0325e0d85c8f8bed407c7218e813 Mon Sep 17 00:00:00 2001 From: Vigneshwaran C Date: Sat, 9 Jul 2022 13:20:34 +0530 Subject: [PATCH v36] Allow users to skip logical replication of data having origin. This patch adds a new SUBSCRIPTION parameter "origin". It specifies whether the subscription will request the publisher to only send changes that don't have an origin or send changes regardless of origin. Setting it to "none" means that the subscription will request the publisher to only send changes that have no origin associated. Setting it to "any" means that the publisher sends changes regardless of their origin. 
The default is "any". Usage: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port=' PUBLICATION pub1 WITH (origin = none); This can be used to avoid loops (infinite replication of the same data) among replication nodes. This feature allows filtering only the replication data originating from WAL but for initial sync (initial copy of table data) we don't have such a facility as we can only distinguish the data based on origin from WAL. As a follow-up patch, we are planning to forbid the initial sync if we notice that the publication tables were also replicated from other publishers to avoid duplicate data or loops. Author: Vignesh C, Amit Kapila Reviewed-By: Peter Smith, Amit Kapila, Dilip Kumar, Ashutosh Bapat, Hayato Kuroda, Shi yu, Alvaro Herrera Discussion: https://postgr.es/m/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubh...@mail.gmail.com --- contrib/test_decoding/expected/replorigin.out | 10 ++ contrib/test_decoding/sql/replorigin.sql | 5 + doc/src/sgml/catalogs.sgml| 14 ++ doc/src/sgml/ref/alter_subscription.sgml | 5 +- doc/src/sgml/ref/create_subscription.sgml | 15 ++ src/backend/catalog/pg_subscription.c | 8 + src/backend/catalog/system_views.sql | 4 +- src/backend/commands/subscriptioncmds.c | 43 - .../libpqwalreceiver/libpqwalreceiver.c | 5 + src/backend/replication/logical/origin.c | 22 ++- src/backend/replication/logical/worker.c | 2 + src/backend/replication/pgoutput/pgoutput.c | 27 ++- src/bin/pg_dump/pg_dump.c | 15 +- src/bin/pg_dump/pg_dump.h | 1 + src/bin/pg_dump/t/002_pg_dump.pl | 22 +++ src/bin/psql/describe.c | 7 +- src/bin/psql/tab-complete.c | 7 +- src/include/catalog/pg_subscription.h | 17 ++ src/include/replication/pgoutput.h| 1 + src/include/replication/walreceiver.h | 2 + src/test/regress/expected/subscription.out| 142 +--- src/test/regress/sql/subscription.sql | 10 ++ src/test/subscription/t/030_origin.pl | 155 ++ 23 files changed, 462 insertions(+), 77 deletions(-) create mode 100644 src/test/subscription/t/030_origin.pl diff 
--git a/contrib/test_decoding/expected/replorigin.out b/contrib/test_decoding/expected/replorigin.out index 2e9ef7c823..49ffaeea2d 100644 --- a/contrib/test_decoding/expected/replorigin.out +++ b/contrib/test_decoding/expected/replorigin.out @@ -56,6 +56,16 @@ SELECT pg_replication_origin_drop('regress_test_decoding: temp'); SELECT pg_replication_origin_drop('regress_test_decoding: temp'); ERROR: replication origin "regress_test_decoding: temp" does not exist +-- specifying reserved origin names is not supported +SELECT pg_replication_origin_create('any'); +ERROR: replication origin name "any" is reserved +DETAIL: Origin names "any", "none", and names starting with "pg_"
Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.
On Thu, Jul 14, 2022 at 4:34 AM Peter Smith wrote: > > On Wed, Jul 13, 2022 at 7:55 PM vignesh C wrote: > > > > On Wed, Jul 13, 2022 at 1:13 PM Michael Paquier wrote: > > > > > > On Wed, Jul 13, 2022 at 12:22:06PM +0530, vignesh C wrote: > > > > Most of the code is common between GetSubscriptionRelations and > > > > GetSubscriptionNotReadyRelations. Added a parameter to > > > > GetSubscriptionRelations which could provide the same functionality as > > > > the existing GetSubscriptionRelations and > > > > GetSubscriptionNotReadyRelations. Attached patch has the changes for > > > > the same. Thoughts? > > > > > > Right. Using all_rels to mean that we'd filter relations that are not > > > ready is a bit confusing, though. Perhaps this could use a bitmask as > > > argument. > > > > The attached v2 patch has the modified version which includes the > > changes to make the argument as bitmask. > > > > By using a bitmask I think there is an implication that the flags can > be combined... > > Perhaps it is not a problem today, but later you may want more flags. e.g. > #define SUBSCRIPTION_REL_STATE_READY 0x02 /* READY relations */ > > then the bitmask idea falls apart because IIUC you have no intentions > to permit things like: > (SUBSCRIPTION_REL_STATE_NOT_READY | SUBSCRIPTION_REL_STATE_READY) > I think this will be an invalid combination if caller ever used it. However, one might need to use a combination like (SUBSCRIPTION_REL_STATE_READY | SUBSCRIPTION_REL_STATE_DONE). For such cases, I feel the bitmask idea will be better. -- With Regards, Amit Kapila.
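The combination Amit mentions can be illustrated with a minimal bitmask sketch. The flag names follow the thread's suggestion and are not the actual PostgreSQL definitions; the state characters mimic pg_subscription_rel.srsubstate values for illustration only:

```c
#include <assert.h>

/* Hypothetical relation-state flags for GetSubscriptionRelations() */
#define SUB_REL_STATE_NOT_READY	0x01
#define SUB_REL_STATE_READY	0x02
#define SUB_REL_STATE_DONE	0x04

/* Map a relation's state character to exactly one flag bit. */
static int
state_to_flag(char srsubstate)
{
	switch (srsubstate)
	{
		case 'r':
			return SUB_REL_STATE_READY;
		case 's':
			return SUB_REL_STATE_DONE;	/* "sync done" stand-in */
		default:
			return SUB_REL_STATE_NOT_READY;
	}
}

/*
 * A caller-supplied mask then selects any combination of states, which is
 * the advantage of a bitmask over a boolean "not ready only" argument.
 */
static int
rel_matches(char srsubstate, int mask)
{
	return (state_to_flag(srsubstate) & mask) != 0;
}
```

A caller wanting READY-or-DONE relations passes `SUB_REL_STATE_READY | SUB_REL_STATE_DONE`, while the invalid NOT_READY|READY combination is simply never constructed by any caller.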
Re: pgsql: Default to hidden visibility for extension libraries where possi
Hello On 2022-Jul-19, Andres Freund wrote: > On 2022-07-19 16:28:07 +0200, Alvaro Herrera wrote: > > Anyway, the minimal patch that makes plpgsql_check tests pass is > > attached. > > Do you want to commit that or should I? Done. No immediate plans for splitting plpgsql.h, so if anyone wants to take a stab at that, be my guest. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Aprender sin pensar es inútil; pensar sin aprender, peligroso" (Confucio)
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada wrote: > > On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada > > wrote: > > > > > Another idea would be to have functions, say > > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > > does actual work of handling transaction commits and both > > > SnapBuildCommitTxn() and SnapBuildCommit() call > > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > > above para? > > I meant that we will call like DecodeCommit() -> > SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = > true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... If > SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() > with has_invals = false and behaves the same as before. > Okay, understood. This will work. > > Yet another idea could be to have another flag > > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > > Then, we can retrieve it even for each of the subtxn's if and when > > required. > > Do you mean that when checking if the subtransaction has catalog > changes, we check if its top-level XID has this new flag? > Yes. > Why do we > need the new flag? > This is required if we don't want to introduce a new set of functions as you proposed above. I am not sure which one is better w.r.t back patching effort later but it seems to me using flag stuff would make future back patches easier if we make any changes in SnapBuildCommitTxn. -- With Regards, Amit Kapila.
Re: i.e. and e.g.
At Thu, 14 Jul 2022 15:38:37 +0900 (JST), Kyotaro Horiguchi wrote in > At Thu, 14 Jul 2022 09:40:25 +0700, John Naylor > wrote in > > On Wed, Jul 13, 2022 at 4:13 PM Kyotaro Horiguchi > > wrote: > > > > > > At Wed, 13 Jul 2022 18:09:43 +0900 (JST), Kyotaro Horiguchi < > > horikyota@gmail.com> wrote in > > > > So, "e.g." (for example) in the message sounds like "that is", which I > > > > think is "i.e.". It should be fixed if this is correct. I'm not sure > > > > whether to keep using Latin-origin acronyms like this, but in the > > > > attached I used "i.e.". > > > > I did my own quick scan and found one use of i.e. that doesn't really fit, > > in a sentence that has other grammatical issues: > > > > -Due to the differences how ECPG works compared to Informix's > > ESQL/C (i.e., which steps > > +Due to differences in how ECPG works compared to Informix's ESQL/C > > (namely, which steps > > are purely grammar transformations and which steps rely on the > > Oh! > > > I've pushed that in addition to your changes, thanks! > > Thanks! By the way, I forgot about back-branches. Don't we need to fix the same in back-branches? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Windows default locale vs initdb
On Tue, Jul 19, 2022 at 12:59 AM Thomas Munro wrote: > Now that museum-grade Windows has been defenestrated, we are free to > call GetUserDefaultLocaleName(). Here's a patch. > This LGTM. > > I think we should also convert to POSIX format when making the > collname in your pg_import_system_collations() proposal, so that > COLLATE "en_US" works (= a SQL identifier), but that's another > thread[1]. I don't think we should do it in collcollate or > datcollate, which is a string for the OS to interpret. > That thread has been split [1], but that is how the current version behaves. > > With my garbage collector hat on, I would like to rip out all of the > support for traditional locale names, eventually. Deleting kludgy > code is easy and fun -- 0002 is a first swing at that -- but there > remains an important unanswered question. How should someone > pg_upgrade a "English_Canada.1252" cluster if we now reject that name? > We'd need to do a conversion to "en-CA", or somehow tell the user to. > H. > Is there a safe way to do that in pg_upgrade or would we be forcing users to pg_dump into the new cluster? [1] https://www.postgresql.org/message-id/flat/0050ec23-34d9-2765-9015-98c04f0e18ac%40postgrespro.ru Regards, Juan José Santamaría Flecha
Re: Memory leak fix in psql
On Wed, 20 Jul 2022 at 14:21, tanghy.f...@fujitsu.com wrote: > On Wednesday, July 20, 2022 12:52 PM, Michael Paquier > wrote: >> What about the argument of upthread where we could use a goto in >> functions where there are multiple pattern validation checks? Per se >> v4 attached. > > Thanks for your kindly remind and modification. > I checked v4 patch, it looks good but I think there can be some minor > improvement. > So I deleted some redundant braces around "goto error_return; ". > Also added an error handle section in validateSQLNamePattern. > > Kindly to have a check at the attached v5 patch. > > Regards, > Tang Thanks for updating the patch. It looks good. However, it cannot be applied on 14 stable. The attached patches are for 10-14. -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd. >From 1dd828a40d5a45c0ee021f42f8eee354082a72cf Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 20 Jul 2022 12:47:52 +0900 Subject: [PATCH v5] fix the memory leak in psql describe diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index 88d92a08ae..77b0f87e39 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -112,7 +112,10 @@ describeAggregates(const char *pattern, bool verbose, bool showSystem) "n.nspname", "p.proname", NULL, "pg_catalog.pg_function_is_visible(p.oid)", NULL, 3)) + { + termPQExpBuffer(&buf); return false; + } appendPQExpBufferStr(&buf, "ORDER BY 1, 2, 4;"); @@ -182,7 +185,10 @@ describeAccessMethods(const char *pattern, bool verbose) NULL, "amname", NULL, NULL, NULL, 1)) + { + termPQExpBuffer(&buf); return false; + } appendPQExpBufferStr(&buf, "ORDER BY 1;"); @@ -244,7 +250,10 @@ describeTablespaces(const char *pattern, bool verbose) NULL, "spcname", NULL, NULL, NULL, 1)) + { + termPQExpBuffer(&buf); return false; + } appendPQExpBufferStr(&buf, "ORDER BY 1;"); @@ -534,7 +543,7 @@ describeFunctions(const char *functypes, const char *func_pattern, "n.nspname", "p.proname", NULL, "pg_catalog.pg_function_is_visible(p.oid)", NULL, 
3)) - return false; + goto error_return; for (int i = 0; i < num_arg_patterns; i++) { @@ -561,7 +570,7 @@ describeFunctions(const char *functypes, const char *func_pattern, true, false, nspname, typname, ft, tiv, NULL, 3)) -return false; +goto error_return; } else { @@ -599,6 +608,10 @@ describeFunctions(const char *functypes, const char *func_pattern, PQclear(res); return true; + +error_return: + termPQExpBuffer(); + return false; } @@ -682,7 +695,10 @@ describeTypes(const char *pattern, bool verbose, bool showSystem) "pg_catalog.format_type(t.oid, NULL)", "pg_catalog.pg_type_is_visible(t.oid)", NULL, 3)) + { + termPQExpBuffer(); return false; + } appendPQExpBufferStr(, "ORDER BY 1, 2;"); @@ -836,7 +852,7 @@ describeOperators(const char *oper_pattern, "n.nspname", "o.oprname", NULL, "pg_catalog.pg_operator_is_visible(o.oid)", NULL, 3)) - return false; + goto error_return; if (num_arg_patterns == 1) appendPQExpBufferStr(, " AND o.oprleft = 0\n"); @@ -866,7 +882,7 @@ describeOperators(const char *oper_pattern, true, false, nspname, typname, ft, tiv, NULL, 3)) -return false; +goto error_return; } else { @@ -890,6 +906,10 @@ describeOperators(const char *oper_pattern, PQclear(res); return true; + +error_return: + termPQExpBuffer(); + return false; } @@ -953,7 +973,10 @@ listAllDbs(const char *pattern, bool verbose) if (!validateSQLNamePattern(, pattern, false, false, NULL, "d.datname", NULL, NULL, NULL, 1)) + { + termPQExpBuffer(); return false; + } appendPQExpBufferStr(, "ORDER BY 1;"); res = PSQLexec(buf.data); @@ -1106,7 +1129,10 @@ permissionsList(const char *pattern) "n.nspname", "c.relname", NULL, "n.nspname !~ '^pg_' AND pg_catalog.pg_table_is_visible(c.oid)", NULL, 3)) + { + termPQExpBuffer(); return false; + } appendPQExpBufferStr(, "ORDER BY 1, 2;"); @@ -1177,16 +1203,13 @@ listDefaultACLs(const char *pattern) "pg_catalog.pg_get_userbyid(d.defaclrole)", NULL, NULL, 3)) - return false; + goto error_return; appendPQExpBufferStr(, "ORDER BY 1, 2, 3;"); res = 
PSQLexec(buf.data); if (!res) - { - termPQExpBuffer(); - return false; - } + goto error_return; myopt.nullPrint = NULL; printfPQExpBuffer(, _("Default access privileges")); @@ -1200,6 +1223,10 @@ listDefaultACLs(const char *pattern) termPQExpBuffer(); PQclear(res); return true; + +error_return: + termPQExpBuffer(); + return false; } @@ -1253,7 +1280,7 @@ objectDescription(const char *pattern, bool showSystem) false, "n.nspname", "pgc.conname", NULL, "pg_catalog.pg_table_is_visible(c.oid)",
Re: Fast COPY FROM based on batch insert
On Tue, Jul 19, 2022 at 6:35 PM Andrey Lepikhov wrote:
> On 18/7/2022 13:22, Etsuro Fujita wrote:
> > I rewrote the decision logic to something much simpler and much less
> > invasive, which reduces the patch size significantly. Attached is an
> > updated patch. What do you think about that?
> maybe you forgot this code:
> /*
>  * If a partition's root parent isn't allowed to use it, neither is the
>  * partition.
>  */
> if (rootResultRelInfo->ri_usesMultiInsert)
>     leaf_part_rri->ri_usesMultiInsert =
>         ExecMultiInsertAllowed(leaf_part_rri);

I think the patch accounts for that. Consider this bit to determine whether to use batching for the partition chosen by ExecFindPartition():

@@ -910,12 +962,14 @@ CopyFrom(CopyFromState cstate)
 
 			/*
 			 * Disable multi-inserts when the partition has BEFORE/INSTEAD
-			 * OF triggers, or if the partition is a foreign partition.
+			 * OF triggers, or if the partition is a foreign partition
+			 * that can't use batching.
 			 */
 			leafpart_use_multi_insert = insertMethod == CIM_MULTI_CONDITIONAL &&
 				!has_before_insert_row_trig &&
 				!has_instead_insert_row_trig &&
-				resultRelInfo->ri_FdwRoutine == NULL;
+				(resultRelInfo->ri_FdwRoutine == NULL ||
+				 resultRelInfo->ri_BatchSize > 1);

If the root parent isn't allowed to use batching, then we have insertMethod=CIM_SINGLE for the parent before we get here. So in that case we have leafpart_use_multi_insert=false for the chosen partition, meaning that the partition isn't allowed to use batching, either. (The patch just extends the existing decision logic to the foreign-partition case.)

> Also, maybe to describe in documentation, if the value of batch_size is
> more than 1, the ExecForeignBatchInsert routine has a chance to be called?

Yeah, but I think that is the existing behavior, and that the patch doesn't change the behavior, so I would leave that for another patch.

Thanks for reviewing!

Best regards,
Etsuro Fujita
Re: Memory leak fix in psql
More on the same tune.

--
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php

>From 5e031fadc778365491f9e58da83a4b64f21e3bcd Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Wed, 20 Jul 2022 10:04:27 +0200
Subject: [PATCH] More goto error_return in describe.c

---
 src/bin/psql/describe.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 77b0f87e39..46999f26f2 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -1129,19 +1129,13 @@ permissionsList(const char *pattern)
 							  "n.nspname", "c.relname", NULL,
 							  "n.nspname !~ '^pg_' AND pg_catalog.pg_table_is_visible(c.oid)",
 							  NULL, 3))
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1, 2;");
 
 	res = PSQLexec(buf.data);
 	if (!res)
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Access privileges"));
@@ -1155,6 +1149,10 @@ permissionsList(const char *pattern)
 	termPQExpBuffer(&buf);
 	PQclear(res);
 	return true;
+
+error_return:
+	termPQExpBuffer(&buf);
+	return false;
 }
@@ -4983,19 +4981,13 @@ listSchemas(const char *pattern, bool verbose, bool showSystem)
 							  NULL, "n.nspname", NULL,
 							  NULL,
 							  NULL, 2))
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
 	res = PSQLexec(buf.data);
 	if (!res)
-	{
-		termPQExpBuffer(&buf);
-		return false;
-	}
+		goto error_return;
 
 	myopt.nullPrint = NULL;
 	myopt.title = _("List of schemas");
@@ -5016,10 +5008,7 @@ listSchemas(const char *pattern, bool verbose, bool showSystem)
 						  pattern);
 		result = PSQLexec(buf.data);
 		if (!result)
-		{
-			termPQExpBuffer(&buf);
-			return false;
-		}
+			goto error_return;
 		else
 			pub_schema_tuples = PQntuples(result);
@@ -5066,6 +5055,10 @@ listSchemas(const char *pattern, bool verbose, bool showSystem)
 	}
 
 	return true;
+
+error_return:
+	termPQExpBuffer(&buf);
+	return false;
 }
-- 
2.30.2
Re: Memory leak fix in psql
Thanks for your explanation, this time I know how it works, thanks ;) On Wed, Jul 20, 2022 at 3:04 PM tanghy.f...@fujitsu.com wrote: > > > On Wed, Jul 20, 2022 at 12:51:24PM +0800, Junwang Zhao wrote: > > > Though the patch looks good, I myself think the patch should be edited > > > and submitted by Tang > > > It's easy to attach a fixed patch based on the comments of the thread, > > > but coins should be > > > given to Tang since he is the first one to find the mem leak. > > Hello, Zhao > > Thanks for your check at this patch. > > I appreciate your kindly comment but there may be a misunderstanding here. > As Michael explained, committers in Postgres will review carefully and > help to improve contributors' patches. When the patch is finally committed > by one committer, from what I can see, he or she will try to make sure the > credit goes with everyone who contributed to the committed patch(such as > bug reporter, patch author, tester, reviewer etc.). > > Also, developers and reviewers will try to help improving our proposed patch > by rebasing it or adding an on-top patch(like Japin Li did in v2). > These will make the patch better and to be committed ASAP. > > Good to see you at Postgres community. > > Regards, > Tang -- Regards Junwang Zhao
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada wrote: > > > > On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila > > wrote: > > > > > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada > > > wrote: > > > > > > Why do you think we can't remove > > > ReorderBufferInitialXactsSetCatalogChanges() from the back branch > > > patch? I think we don't need to change the existing function > > > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper > > > like SnapBuildXidHasCatalogChanges() similar to master branch patch. > > > > IIUC we need to change SnapBuildCommitTxn() but it's exposed. > > > > Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() -> > > ReorderBufferXidHasCatalogChanges(). If we have a wrapper function, we > > call like DecodeCommit() -> SnapBuildCommitTxn() -> > > SnapBuildXidHasCatalogChanges() -> > > ReorderBufferXidHasCatalogChanges(). In > > SnapBuildXidHasCatalogChanges(), we need to check if the transaction > > has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass > > either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0) > > down to SnapBuildXidHasCatalogChanges(). However, since > > SnapBuildCommitTxn(), between DecodeCommit() and > > SnapBuildXidHasCatalogChanges(), is exposed we cannot change it. > > > > Agreed. > > > Another idea would be to have functions, say > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > does actual work of handling transaction commits and both > > SnapBuildCommitTxn() and SnapBuildCommit() call > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > above para? I meant that we will call like DecodeCommit() -> SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... 
If SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() with has_invals = false and behaves the same as before. > Yet another idea could be to have another flag > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > Then, we can retrieve it even for each of the subtxn's if and when > required. Do you mean that when checking if the subtransaction has catalog changes, we check if its top-level XID has this new flag? Why do we need the new flag? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages
At Wed, 20 Jul 2022 10:02:22 +0530, Bharath Rupireddy wrote in
> Done. PSA v6 patch set.

Thanks!

-            Specifies the minimum size of past log file segments kept in the
+            Specifies the minimum size of past WAL files kept in the

-            log file by reducing the value of this parameter. On a system with
+            file by reducing the value of this parameter. On a system with

Looks fine. And postgresql.conf.sample has the following lines:

#archive_library = ''		# library to use to archive a logfile segment
#archive_command = ''		# command to use to archive a logfile segment
#archive_timeout = 0		# force a logfile segment switch after this
#restore_command = ''		# command to use to restore an archived logfile segment

Don't they need the same fix?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 4:16 PM Kyotaro Horiguchi wrote: > > At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada > wrote in > > On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi > > wrote: > > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > > > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > > > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > > > ondisk.builder.commited.xip populated with a valid array pointer. But > > > the patch allows committed.xip be NULL, thus in that case, > > > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > > > failure. > > > > IIUC the patch doesn't allow committed.xip to be NULL since we don't > > overwrite it if builder->committed.xcnt is 0 (i.e., > > ondisk.builder.committed.xip is NULL): > > I meant that ondisk.builder.committed.xip can be NULL.. But looking > again that cannot be. I don't understand what I was looking at at > that time. > > So, sorry for the noise. No problem. Thank you for your review and comments! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada wrote in > On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi > wrote: > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > > ondisk.builder.commited.xip populated with a valid array pointer. But > > the patch allows committed.xip be NULL, thus in that case, > > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > > failure. > > IIUC the patch doesn't allow committed.xip to be NULL since we don't > overwrite it if builder->committed.xcnt is 0 (i.e., > ondisk.builder.committed.xip is NULL): I meant that ondisk.builder.committed.xip can be NULL.. But looking again that cannot be. I don't understand what I was looking at at that time. So, sorry for the noise. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
On Mon, Jul 18, 2022 at 8:29 AM Amit Kapila wrote: > IIUC, this proposal is to optimize cases where users can't have a > unique/primary key for a relation on the subscriber and those > relations receive lots of updates or deletes? I think this patch optimizes for all non-trivial cases of update/delete replication (e.g. >1000 rows in the table, >1000 rows per hour updated) without a primary key. For instance, it's quite common to have a large append-mostly events table without a primary key (e.g. because of partitioning, or insertion speed), which will still have occasional batch updates/deletes. Imagine an update of a table or partition with 1 million rows and a typical scan speed of 1M rows/sec. An update on the whole table takes maybe 1-2 seconds. Replicating the update using a sequential scan per row can take on the order of ~12 days ≈ 1M seconds. The current implementation makes using REPLICA IDENTITY FULL a huge liability, and impractical for scenarios where you want to replicate an arbitrary set of user-defined tables, such as upgrades, migrations, and shard moves. We generally recommend users to tolerate update/delete errors in such scenarios. If the apply worker can use an index, the data migration tool can tactically create one on a high cardinality column, which would practically always be better than doing a sequential scan for non-trivial workloads. cheers, Marco
RE: Memory leak fix in psql
> On Wed, Jul 20, 2022 at 12:51:24PM +0800, Junwang Zhao wrote:
> > Though the patch looks good, I myself think the patch should be edited
> > and submitted by Tang
> > It's easy to attach a fixed patch based on the comments of the thread,
> > but coins should be given to Tang since he is the first one to find the mem leak.

Hello, Zhao

Thanks for checking this patch.

I appreciate your kind comment, but there may be a misunderstanding here. As Michael explained, committers in Postgres will review carefully and help to improve contributors' patches. When the patch is finally committed by one committer, from what I can see, he or she will try to make sure the credit goes to everyone who contributed to the committed patch (such as the bug reporter, patch author, tester, reviewer, etc.).

Also, developers and reviewers will try to help improve our proposed patch by rebasing it or adding an on-top patch (like Japin Li did in v2). These will make the patch better and get it committed ASAP.

Good to see you in the Postgres community.

Regards,
Tang
Re: replacing role-level NOINHERIT with a grant-level option
On 7/19/22 12:56 AM, Robert Haas wrote: Another good catch. Here is v5 with a fix for that problem. Thanks, the issue is fixed now. -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company
Re: Backup command and functions can cause assertion failure and segmentation fault
On Wed, Jul 20, 2022 at 02:00:00PM +0900, Fujii Masao wrote:
> I reported two trouble cases; they are the cases where BASE_BACKUP
> is canceled and terminated, respectively. But you added the test
> only for one of them. Is this intentional?

Nope. The one I have implemented was the fanciest case among the two, so I just focused on it. Adding an extra test to cover the second scenario is easier. So I have added one as of the attached, addressing your other comments while on it. I have also decided to add the tests at the bottom of 001_stream_rep.pl, as these are quicker than a node initialization.

--
Michael

>From 2aa841a3dfb643a14d28e6d595703f14e98ad919 Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Wed, 20 Jul 2022 15:48:27 +0900
Subject: [PATCH v2] Add more TAP tests with BASE_BACKUP

---
 src/test/recovery/t/001_stream_rep.pl | 55 +++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 86864098f9..b15dd6b29a 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -531,4 +531,59 @@ my $primary_data = $node_primary->data_dir;
 ok(!-f "$primary_data/pg_wal/$segment_removed",
 	"WAL segment $segment_removed recycled after physical slot advancing");
 
+note "testing pg_backup_start() followed by BASE_BACKUP";
+my $connstr = $node_primary->connstr('postgres') . " replication=database";
+
+# This test requires a replication connection with a database, as it mixes
+# a replication command and a SQL command.
+$node_primary->command_fails_like(
+	[
+		'psql', '-c', "SELECT pg_backup_start('backup', true)",
+		'-c', 'BASE_BACKUP', '-d', $connstr
+	],
+	qr/a backup is already in progress in this session/,
+	'BASE_BACKUP cannot run in session already running backup');
+
+note "testing BASE_BACKUP cancellation";
+
+my $sigchld_bb_timeout =
+	IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
+
+# This test requires a replication connection with a database, as it mixes
+# a replication command and a SQL command. The first BASE_BACKUP is throttled
+# to give enough room for the cancellation running below. The second command
+# for pg_backup_stop() should fail.
+my ($sigchld_bb_stdin, $sigchld_bb_stdout, $sigchld_bb_stderr) = ('', '', '');
+my $sigchld_bb = IPC::Run::start(
+	[
+		'psql', '-X', '-c', "BASE_BACKUP (CHECKPOINT 'fast', MAX_RATE 32);",
+		'-c', 'SELECT pg_backup_stop()',
+		'-d', $connstr
+	],
+	'<',
+	\$sigchld_bb_stdin,
+	'>',
+	\$sigchld_bb_stdout,
+	'2>',
+	\$sigchld_bb_stderr,
+	$sigchld_bb_timeout);
+
+# The cancellation is issued once the database files are streamed and
+# the checkpoint issued at backup start completes.
+is( $node_primary->poll_query_until(
+		'postgres',
+		"SELECT pg_cancel_backend(a.pid) FROM "
+		  . "pg_stat_activity a, pg_stat_progress_basebackup b WHERE "
+		  . "a.pid = b.pid AND a.query ~ 'BASE_BACKUP' AND "
+		  . "b.phase = 'streaming database files';"),
+	"1",
+	"WAL sender sending base backup killed");
+
+# The psql command should fail on pg_backup_stop().
+ok( pump_until(
+		$sigchld_bb, $sigchld_bb_timeout,
+		\$sigchld_bb_stderr, qr/backup is not in progress/),
+	'base backup cleanly cancelled');
+$sigchld_bb->finish();
+
 done_testing();
-- 
2.36.1

signature.asc
Description: PGP signature
Re: Allow placeholders in ALTER ROLE w/o superuser
At Tue, 19 Jul 2022 09:53:39 -0700, Nathan Bossart wrote in
> On Tue, Jul 19, 2022 at 12:55:14AM -0500, Steve Chavez wrote:
> > Taking your options into consideration, for me the correct behaviour should
> > be:
> >
> > - The ALTER ROLE placeholder should always be stored with a PGC_USERSET
> > GucContext. It's a placeholder anyway, so it should be the less restrictive
> > one. If the user wants to define it as PGC_SUSET or other this should be
> > done through a custom extension.
> > - When an extension claims the placeholder, we should check the
> > DefineCustomXXXVariable GucContext with PGC_USERSET. If there's a match,
> > then the value gets applied, otherwise WARN or ERR.
> > The role GUCs get applied at login time right? So at this point we can
> > WARN or ERR about the defined role GUCs.
> >
> > What do you think?
>
> Hm. I would expect ALTER ROLE to store the PGC_SUSET context when executed
> by a superuser or a role with privileges via pg_parameter_acl. Storing all
> placeholder GUC settings as PGC_USERSET would make things more restrictive
> than they are today. For example, it would no longer be possible to apply
> any ALTER ROLE settings from superusers for placeholders that later become
> custom GUCs.

Currently placeholders are always created PGC_USERSET, thus non-superusers can set them. But if the loaded module defines the custom variable as PGC_SUSET, the value set by the user is refused; then the value from ALTER ROLE ... SET, or otherwise the default value from DefineCustom*Variable, is used. If the module defines it as PGC_USERSET, the last value is accepted.

If placeholders were created PGC_SUSET, non-superusers could not set them in-session. But that behavior is not needed, since loadable modules reject PGC_USERSET values as above.

Returning to the topic, that operation can be allowed in PG15, having been granted by a superuser using the GRANT SET ON PARAMETER command.
=# GRANT SET ON PARAMETER my.username TO r1;

r1=> ALTER ROLE r1 SET my.username = 'hoge_user_x';
r1=> \c
r1=> show my.username;
 my.username
-------------
 hoge_user_x
(1 row)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
RE: Memory leak fix in psql
On Wednesday, July 20, 2022 12:52 PM, Michael Paquier wrote:
> What about the argument of upthread where we could use a goto in
> functions where there are multiple pattern validation checks? Per se
> v4 attached.

Thanks for your kind reminder and modification.

I checked the v4 patch, it looks good but I think there can be some minor improvement. So I deleted some redundant braces around "goto error_return;". Also added an error-handling section in validateSQLNamePattern.

Kindly have a check at the attached v5 patch.

Regards,
Tang

v5-0001-fix-the-memory-leak-in-psql-describe.patch
Description: v5-0001-fix-the-memory-leak-in-psql-describe.patch